languagetool-language-server / vscode-languagetool

LanguageTool grammar checking for Visual Studio Code.
Apache License 2.0

.tex file (Support for LaTeX) #7

Open ghost opened 7 years ago

ghost commented 7 years ago

Hi @adamvoss

I am really happy with your extension. I am trying to use VS Code as a LaTeX suite, but somehow your extension isn't checking .tex files.

Other files work fine

[Two screenshots: 2017-06-25 22-54-03 and 2017-06-25 22-55-03]

adamvoss commented 7 years ago

I looked into supporting .tex files when I was first developing this. The problem is the mix of syntax and natural language text. LaTeX is not simple to parse and reportedly you'd need to implement the whole language just to be able to successfully distinguish between markup and final content. I looked at Texlipse as suggested there, but it is defunct. For LanguageTool solutions for LaTeX, you may want to check out: http://wiki.languagetool.org/checking-la-tex-with-languagetool

The ability to parse LaTeX is important so that only the natural language pieces are assembled and sent to LanguageTool. While the Markdown implementation still has some rough edges where it can be improved, it uses a full-fledged Markdown parser to tell Markdown syntax apart from actual content, which avoids LanguageTool giving false positives because of syntax.

If someone implements a suitable parser, I'd be happy to add support to this extension.

The other option would be to skip the parser and let users accept the lower-quality checking that results. In the lower right-hand corner you can click the language and manually change your document to "Plain Text". You will lose your syntax highlighting, but it will let you see how LanguageTool would respond to your LaTeX document. If you find the behavior acceptable, I could add configuration allowing LanguageTool to run against LaTeX documents (which would let you keep the syntax highlighting).

jambonmcyeah commented 6 years ago

See whether you can integrate this into your extension: https://github.com/pkubowicz/opendetex

adamvoss commented 6 years ago

@jambonmcyeah That looks like an interesting project; thanks for bringing it up! Unfortunately it appears to only be able to output the plain text. In order to work for this we need both the plain text and some way to map the plain text back to the position in the TeX document so we know where to underline when there are issues.
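To illustrate the requirement above, here is a minimal Python sketch of stripping markup while remembering where each surviving character came from. The `MARKUP` regex is a deliberately naive stand-in for a real LaTeX parser, and `strip_with_offsets` and `to_source_range` are hypothetical helper names, not part of this extension or of opendetex.

```python
import re

# Naive pattern for LaTeX markup: commands such as \emph or \\, plus braces.
# This is NOT a real LaTeX parser; it only illustrates the bookkeeping.
MARKUP = re.compile(r"\\[A-Za-z]+\*?|\\.|[{}]")


def strip_with_offsets(source):
    """Return (plain_text, offsets), where offsets[i] is the index in
    `source` of the character that ended up at plain_text[i]."""
    plain = []
    offsets = []
    pos = 0
    for match in MARKUP.finditer(source):
        # Keep everything between the previous markup match and this one.
        for i in range(pos, match.start()):
            plain.append(source[i])
            offsets.append(i)
        pos = match.end()
    # Keep whatever follows the last markup match.
    for i in range(pos, len(source)):
        plain.append(source[i])
        offsets.append(i)
    return "".join(plain), offsets


def to_source_range(offsets, start, length):
    """Map a (start, length) match in the plain text to a half-open
    (start, end) range in the original source."""
    return offsets[start], offsets[start + length - 1] + 1
```

A checker would run on `plain_text`, and each reported offset and length would go through `to_source_range` before anything is underlined in the .tex document.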

ForNeVeR commented 6 years ago

There's also a latex-parser project. It looks to me like it shouldn't be hard to try to build a proof-of-concept based on it?

@adamvoss could you please take a look at that project? I'd like to help with the task if that's ever possible.

adamvoss commented 6 years ago

@ForNeVeR Looks like it would have limitations but could provide somewhat of a solution. Since the language server is a Java project and that library is JavaScript, some sort of interop would be needed. Any idea what that story would look like? I half expect to be as far ahead porting the TS/JS code to Java as trying to interop between the two languages, but I'll be happy to be shown I'm wrong.

ForNeVeR commented 6 years ago

@adamvoss thanks for your explanations. I'll take a look at the language server code and search for other possible solutions.

AKuederle commented 6 years ago

I personally think that even without explicitly parsing the TeX syntax, LanguageTool provides useful output (of course with a lot of false positives). Hence, I would argue the extension should be enabled by default for .tex files (or, in the long term, just provide a settings option to specify which file types it should be active for).

TiemenSch commented 6 years ago

Perhaps it's useful to see how other VS Code extensions handle this for TeX files? The Spellright add-on works great with TeX files for spelling, but it lacks any grammar checks.

https://github.com/bartosz-antosik/vscode-spellright

Sadly, I lack any Java(Script) / TS knowledge to dive into this.

vmassuchetto commented 6 years ago

Would be really useful for .rst files as well. It makes a lot of sense to just enable users to select which files they want to spellcheck. Better parsing can come later.

RedTailBullet commented 6 years ago

What I think is that we might be able to "ignore" the syntax in .tex files, by matching where each piece of syntax begins and skipping over it. Or is what you really need a way to get the plain text out of a .tex file?
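One way to read the "ignore the syntax" idea is to blank the markup out instead of deleting it, so every remaining character keeps its original offset. The Python sketch below assumes a crude regex (`MARKUP`) in place of a real parser, and `mask_markup` is a hypothetical name.

```python
import re

# Crude markup pattern: commands, escaped characters, braces, and % comments.
# Assumption: good enough for illustration, far from a complete LaTeX grammar.
MARKUP = re.compile(r"\\[A-Za-z]+\*?|\\.|[{}]|%[^\n]*")


def mask_markup(source):
    """Replace every markup match with spaces of the same length, so that
    offsets in the masked text are also valid offsets in `source`."""
    return MARKUP.sub(lambda match: " " * len(match.group(0)), source)
```

The trade-off is that the runs of spaces left behind may themselves trigger whitespace-related rules, and command arguments that are not prose (labels, file names) would still be checked.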

maikol-solis commented 5 years ago

@adamvoss As a workaround, you could simply throw the whole .tex file (text and LaTeX syntax) at the LanguageTool server and let the user decide which parts are worth reviewing. This approach works in Emacs and other editors, and for me it is just fine.
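A rough Python sketch of that workaround, assuming a LanguageTool HTTP server is already running locally. The port in `SERVER` is an assumption about the local setup, and `build_request`, `summarize`, and `check` are hypothetical helper names. The `/v2/check` endpoint with `text` and `language` parameters, and the `matches` list with `offset`, `length`, and `message` fields, follow the public LanguageTool HTTP API as I understand it.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a LanguageTool HTTP server is listening locally on this port.
SERVER = "http://localhost:8081"


def build_request(text, language="en-US", server=SERVER):
    """Build a POST request for the /v2/check endpoint."""
    body = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(
        server + "/v2/check",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def summarize(response_json, text):
    """Turn the JSON response into (flagged_text, message) pairs."""
    result = json.loads(response_json)
    return [
        (text[m["offset"]:m["offset"] + m["length"]], m["message"])
        for m in result.get("matches", [])
    ]


def check(text, language="en-US"):
    """Send the raw document, markup included, and summarize the matches."""
    with urllib.request.urlopen(build_request(text, language)) as response:
        return summarize(response.read().decode("utf-8"), text)
```

Because the text is sent unmodified, the reported offsets already point into the .tex source, which is why no position mapping is needed here; the price is the false positives on markup.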

nicolafio commented 5 years ago

Hear me out. I have an unwarranted wild idea for you guys.

Considering that LaTeX is:

  1. a nightmare to parse;
  2. a programming language.

Then why not solve this creatively? Ask the user to tell what to check.

How?

By writing your own LaTeX package, named languagetool or however you please, and implementing a command/environment named, for example, ltprose. Everything inside it is treated as a paragraph and given to the LanguageTool server. Then, when the project is built, the suggestions, warnings, and errors are written to a plain-text/JSON/XML file in the same folder as the source.

It follows that the file can be read easily by the Visual Studio Code extension so it can put relevant information in the "Problems" panel of the editor. This approach also makes it easier to write extensions for other editors as well.
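As a sketch of the reading side, suppose the generated file were JSON holding a list of entries with `offset`, `length`, and `message` fields. That format is purely an assumption for illustration, since no such package exists yet, and the function names are hypothetical. An editor extension mostly needs to turn character offsets into line and column positions:

```python
import bisect
import json


def line_starts(source):
    """Offsets at which each line of `source` begins."""
    starts = [0]
    for index, char in enumerate(source):
        if char == "\n":
            starts.append(index + 1)
    return starts


def to_position(starts, offset):
    """Convert a character offset to a zero-based (line, column) pair."""
    line = bisect.bisect_right(starts, offset) - 1
    return line, offset - starts[line]


def load_problems(report_json, source):
    """Read the hypothetical report and attach editor positions."""
    starts = line_starts(source)
    problems = []
    for entry in json.loads(report_json):
        start = to_position(starts, entry["offset"])
        end = to_position(starts, entry["offset"] + entry["length"])
        problems.append({"start": start, "end": end, "message": entry["message"]})
    return problems
```

Positions are zero-based here because that is what editor protocols commonly expect; a report written for humans would more likely use one-based line numbers.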

I never wrote LaTeX scripts before, but, if it's a serious programming language, I suppose you can also discern the line and the column where ltprose has been invoked.

So the idea is the following:

\documentclass{article}
\usepackage{languagetool} % the proposed package; it does not exist yet
\begin{document}
\ltprose{Hi! I'm going to be checked for grammar mistakes!}
\end{document}

This also makes it possible to specify parameters, for example:

  1. \usepackage[en-GB]{languagetool} for setting a global language.
  2. \ltprose[en-GB]{God save the Queen.} for setting a language to just a piece of text.

I could help by contributing to the implementation of this solution, if you want, and if I feel like being selfless for once.

Gee, I need some coffee.

ForNeVeR commented 5 years ago

@JD342 well that's a very interesting idea.

Could we make it fully automatic? E.g. our plugin would automatically add \usepackage{languagetool} to the start of the main .tex file, execute mklatex and grab the output, and then somehow present the results in the editor?

nicolafio commented 5 years ago

@ForNeVeR I suppose you know about an extension named LaTeX Workshop for Visual Studio Code. If you have it installed, then once you save the source, things get automatically compiled.

If you use that extension, then for vscode-languagetool there would only be the need to look for changes in that plain-text/JSON/XML file that gets generated by the proposed languagetool package. The idea is for the package to talk with the LanguageTool server, not this extension. This extension would end up being a "brainless monkey" that puts the already digested results in the editor.

In the way I'm picturing it, the plugin shouldn't touch the source. \usepackage{languagetool} should be added by the user and the user should designate what pieces of text need to be checked by using ltprose or something. This cannot be automatic because there may be pieces that shouldn't be checked: diagrams and formulas, for example, or whatever strange thing that the user comes up with. Then the user has the possibility to look at the grammar mistakes by reading the generated file himself, or, more conveniently, let this extension read it for him and present the results nicely inside the editor.

The workflow would be as follows:

  1. Install LaTeX Workshop and vscode-languagetool from Visual Studio Marketplace, languagetool from CTAN (for example).

  2. When working on .tex files, hit Ctrl+S and boom, there goes your self-esteem as Visual Studio Code logs all your silly mistakes.

ForNeVeR commented 5 years ago

I would still prefer if we could do it automatically without any user actions.

For example, I would like to sometimes check my thesis and articles with the tool, but usually I don't want my LaTeX documents to contain any references to the tools I'm using for spell checking.

nicolafio commented 5 years ago

You make a valid point.

This, unfortunately, conflicts with the rationale of the solution I proposed: the "ask the user what to parse" part.

On another note, it's not uncommon to include references to the tools being used. For example, people, including myself, put stuff in JavaScript projects to help the ESLint syntax checker discern what to do. Even if it's admittedly not a nice and clean solution, sometimes the language is so complex that certain things need to be explicitly stated. LaTeX is by no means simple; it's on par with the other programming languages.

However, I think that your proposed approach is a very good idea. Maybe it would be possible to get the source, sneakily inject the package, and, instead of implementing the proposed ltprose, redefine the standard commands and environments so that they extract the paragraphs but keep their default behaviour.

\begin{document}

Hey! I'm going to be checked for mistakes because the environment `document'
has been redefined under the hood to detect paragraphs such as me!

\begin{displaymath}
I am not going to be checked for mistakes because the displaymath environment
is not relevant and the sneaky injected package knows it!
\end{displaymath}

\end{document}

This is just an idea, though. I currently don't know how to implement it; I'm not that smart. I'd need to read up on the advanced features of LaTeX. We might be opening a can of worms without realising it.
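For comparison, the same selection can be approximated outside LaTeX. The Python sketch below is not the package-injection approach itself, just a line-based stand-in for it: it collects paragraphs and skips a list of non-prose environments. The `SKIPPED` set and the function name are assumptions made for illustration.

```python
import re

# Assumption: an illustrative list of environments whose contents are not prose.
SKIPPED = {"displaymath", "equation", "verbatim", "tikzpicture"}
BEGIN = re.compile(r"\\begin\{([^}]*)\}")
END = re.compile(r"\\end\{([^}]*)\}")


def prose_paragraphs(source):
    """Return (first_line_number, text) for each paragraph outside SKIPPED."""
    paragraphs = []
    current = []      # lines of the paragraph being collected
    first_line = 0    # line number where that paragraph started
    skipping = None   # name of the skipped environment we are inside, if any

    def flush():
        if current:
            paragraphs.append((first_line, " ".join(current)))
            current.clear()

    for number, line in enumerate(source.splitlines(), start=1):
        begin = BEGIN.search(line)
        end = END.search(line)
        if skipping is not None:
            # Ignore everything until the matching \end line.
            if end and end.group(1) == skipping:
                skipping = None
            continue
        if begin and begin.group(1) in SKIPPED:
            flush()
            # Stay in skipping mode unless the environment closes on this line.
            if not (end and end.group(1) == begin.group(1)):
                skipping = begin.group(1)
        elif begin or end or not line.strip():
            # Other environment boundaries and blank lines end a paragraph.
            flush()
        else:
            if not current:
                first_line = number
            current.append(line.strip())
    flush()
    return paragraphs
```

This ignores inline math, nested environments of the same name, and user-defined commands, which is exactly the can of worms a real implementation would have to deal with.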

ForNeVeR commented 5 years ago

Well, I think that ideally we could do it both ways: for people who like to set our spellchecker up (e.g. tell it what to parse) we should allow them to do it. And for people who want to "just" do the spell check we could add an automatic mode. They're not so different actually.

konn commented 5 years ago

There might still be other formats to support in LanguageTool. I think adding official support for each such format will get cumbersome sooner or later. And, as for LaTeX, as the discussion so far shows, there is no canonical way to check LaTeX with LanguageTool, and there can be several possible ways to do it.

So, how about providing a general API to configure and call the LanguageTool linter with arbitrary de-formatted input?
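As far as I know, the LanguageTool HTTP API already has something in this direction: `/v2/check` accepts a `data` parameter holding JSON with an `annotation` list, where each item is either `text` (to be checked) or `markup` (to be skipped). The Python sketch below builds such a payload; the regex and the function names are assumptions for illustration, and the exact behaviour should be verified against the API documentation.

```python
import json
import re
import urllib.parse

# Naive markup pattern; any de-formatter could produce the segments instead.
MARKUP = re.compile(r"\\[A-Za-z]+\*?|\\.|[{}]")


def annotate(source):
    """Split `source` into alternating text and markup segments."""
    annotation = []
    pos = 0
    for match in MARKUP.finditer(source):
        if match.start() > pos:
            annotation.append({"text": source[pos:match.start()]})
        annotation.append({"markup": match.group(0)})
        pos = match.end()
    if pos < len(source):
        annotation.append({"text": source[pos:]})
    return {"annotation": annotation}


def build_body(source, language="en-US"):
    """Form-encoded body for /v2/check, using `data` instead of `text`."""
    return urllib.parse.urlencode(
        {"data": json.dumps(annotate(source)), "language": language}
    )
```

With this shape, any front end that can split a document into text and markup could reuse the same checker, whatever the source format. My understanding is that reported offsets then refer to the full original text, markup included, but I have not confirmed that here.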

davidlday commented 5 years ago

Please see #25.