hermannsblum closed 5 years ago
Thanks for the suggestion.
This actually existed in an earlier version, but our current parsing technique of markup -> HTML didn't fit very well with LaTeX (there were a lot of syntax-related false positives). I'd like to bring this back eventually, though (especially since I do most of my writing in LaTeX, too).
For reference, I think that for my main use cases, a simple hand-crafted script for preprocessing LaTeX should be enough to eliminate most of the parsing issues. The only problem is that exact locations get lost in that process.
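One way around the lost locations might be to blank out markup with spaces instead of deleting it, so every remaining character keeps its original line and column. Below is a minimal sketch of that idea in Python. The pattern list and the name `mask_latex` are my own assumptions for illustration, not anything Vale provides, and the patterns cover only comments, two verbatim-like environments, and bare command names.

```python
import re

# Hypothetical patterns for illustration; real LaTeX needs many more cases.
MASK_PATTERNS = [
    # Comments: an unescaped % up to the end of the line.
    re.compile(r"(?<!\\)%.*"),
    # Whole verbatim-like environments, including their content.
    re.compile(r"\\begin\{(lstlisting|verbatim)\}.*?\\end\{\1\}", re.DOTALL),
    # Command names only; their brace arguments are left as prose.
    re.compile(r"\\[A-Za-z]+\*?"),
]


def mask_latex(source: str) -> str:
    """Replace matched markup with spaces, keeping newlines, so that
    character offsets, lines, and columns stay identical to the input."""

    def blank(match: re.Match) -> str:
        return re.sub(r"[^\n]", " ", match.group(0))

    for pattern in MASK_PATTERNS:
        source = pattern.sub(blank, source)
    return source
```

Because the masked text has the same length and the same line breaks as the input, a position reported on the masked text can be used directly on the original file. This is not robust (for example, a `%` followed by `\end{lstlisting}` on the same line inside a listing would confuse it), but it may be enough for simple documents.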
Another reference: Vale works with LaTeX documents at the moment, and the only downside is the false positives in listings, macros, and comments. So it might be easier to just double-check an error if it resides in such environments, which can be done using regular expressions.
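To make the double-check idea concrete, here is a rough Python sketch that tests whether a reported position falls inside a comment, a listing, or a macro name. It assumes the linter's position can be converted to a character offset into the file. The regular expression and the names `ignored_spans` and `in_ignored_region` are hypothetical and only cover a few common cases.

```python
import re

# Hypothetical pattern: comments, verbatim-like environments, command names.
IGNORED = re.compile(
    r"(?<!\\)%[^\n]*"                                     # comment to end of line
    r"|\\begin\{(lstlisting|verbatim)\}.*?\\end\{\1\}"    # whole environment
    r"|\\[A-Za-z]+\*?",                                   # macro name
    re.DOTALL,
)


def ignored_spans(source: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of regions where alerts are suspect."""
    return [match.span() for match in IGNORED.finditer(source)]


def in_ignored_region(source: str, offset: int) -> bool:
    """True if the character at `offset` lies inside a suspect region."""
    return any(start <= offset < end for start, end in ignored_spans(source))
```

An alert whose offset makes `in_ignored_region` return true could then be dropped or flagged for manual review rather than reported as is.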
Unfortunately, there are also a lot of false negatives, as all rules set for a specific scope like paragraph get ignored.
I think this is something that I'm going to close for now, as I was unable to implement a means of handling LaTeX well: ignoring certain sections is doable with regular expressions (as mentioned above), but supporting Vale's markup-related scopes (e.g., heading, paragraph, blockquote, etc.) is difficult without an AST to traverse.
Some ideas that could be explored, though, are using Pandoc's AST or a library like syntect to create one.
No, my comment above (https://github.com/errata-ai/vale/issues/54#issuecomment-356004908) still addresses this issue: Vale needs an AST-like structure to traverse in order to "support" a format.
I have an idea that may be useful. I realized that when I use LanguageTool from within TeXstudio, LanguageTool does not parse the LaTeX-related commands/environments. Note that "LanguageTool does not support stripping TeX formatting markup", so it must be TeXstudio that does it. Indeed, this header file defines the LanguageTool interface and this directory is responsible for parsing LaTeX. I am not a computer scientist, so I don't know whether this is the AST-like structure you need.
Would this help? https://github.com/stepmuel/tex2ast
This tool would be awesome, especially for academic writing, but with text being the only scope, the style guides are not very useful.