jxmorris12 / language_tool_python

a free python grammar checker 📝✅
GNU General Public License v3.0

Extracting language from source code #7

Closed rubdos closed 4 years ago

rubdos commented 4 years ago

Citing https://github.com/myint/language-check/issues/29, since you asked me to maybe port that issue to your fork:

Cfr. vim-syntastic/syntastic/issues/1918

When using syntastic with language-check on source files, program code is parsed as natural language, so it is flagged with a lot of grammatical mistakes.

I know this is a big one, but would it be interesting to add some experimental support to language-check to parse source files? This would be especially useful for LaTeX, as that contains a lot of text. I also think it is useful to parse e.g. C, C++ and Java code, and check the comments for grammatical and spelling mistakes.

What do you think?

Now, your README doesn't state anything about syntastic anymore, so question one would be:

If so, a second:

@myint had a good comment in the original issue about this; I'd love to read how they manage these things in the more modern 2020s. It's been four years, after all!

jxmorris12 commented 4 years ago

Hey @rubdos, yeah, I remember reading your issue. My fork doesn't support syntastic, but certainly could. I did some digging and found that language_check integrated with syntastic in 2015 (file here), but it may not work anymore. I bet you could adapt this checker style for 2020 using language_tool_python without too much effort. I'm happy to support you if you'd like to try to set up an integration with syntastic. (Then we can add it to our README here, too 🙂)

About spell-checking source code: this is an awesome extension of language_tool_python and I encourage it! I think you could accomplish this pretty simply by combining language_tool_python with this comment_parser PyPI package. Use comment_parser to extract the comments from source file(s) and run them through language_tool_python to check their grammar. I'm happy to help you get started with that sort of thing, too.
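A rough sketch of that combination, in case it helps as a starting point. It assumes `comment_parser.extract_comments(path, mime=...)` returns objects with `line_number()` and `text()`, and that each match from `LanguageTool.check` exposes `offset` and `message`. The names `CommentIssue`, `check_comments`, and `check_source_file` are made up for illustration and are not part of either package.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class CommentIssue(NamedTuple):
    line: int      # line of the source file where the comment starts
    offset: int    # character offset of the issue inside the comment text
    message: str   # description reported by the checker


def check_comments(
    comments: Iterable[Tuple[int, str]],
    check: Callable[[str], Iterable],
) -> List[CommentIssue]:
    """Run `check` on every (line_number, text) pair and collect the issues.

    `check` is any callable that returns objects with `offset` and `message`
    attributes, e.g. the bound method `LanguageTool.check`.
    """
    issues = []
    for line, text in comments:
        for match in check(text):
            issues.append(CommentIssue(line, match.offset, match.message))
    return issues


def check_source_file(path: str, mime: str, language: str = "en-US") -> List[CommentIssue]:
    """Hypothetical glue: extract comments from `path` and grammar-check them."""
    # Imported lazily so check_comments stays usable without these packages.
    from comment_parser import comment_parser
    import language_tool_python

    tool = language_tool_python.LanguageTool(language)
    try:
        comments = [
            (c.line_number(), c.text())
            for c in comment_parser.extract_comments(path, mime=mime)
        ]
        return check_comments(comments, tool.check)
    finally:
        tool.close()  # stop the background LanguageTool server
```

Something like `check_source_file("main.c", mime="text/x-c")` would then return one `CommentIssue` per problem found. Keeping `check_comments` independent of both packages means another extractor (detex output, for instance) could be plugged in without touching the checking loop.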

Thanks for your interest and please let me know how you'd like to proceed!

rubdos commented 4 years ago

A simple integration indeed shouldn't be too difficult, judging from that syntastic file. However, I believe many people have moved away from it to things like coc.nvim and family, because of the LSP integration.

OTOH: Grammar checking source code with comment_parser sounds really interesting. For LaTeX, one can use detex. Maybe your package can provide an option --source-code or a separate entry-point for running against source code? From there it should be easy enough to integrate with vim through the quickfix feature: a simple command à la :make that calls LanguageTool and displays the output in quickfix should give a very clean integration, albeit not as clean as the integrated :spell yet.
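To make the quickfix idea concrete, here is a minimal sketch of what such an entry-point could print. Everything in it is an assumption for illustration: the `--source-code` flag does not exist in the package, the program name `check-comments` is invented, and the `file:line:col: message` shape is chosen only because it matches the gcc-style output that quickfix error formats are commonly written for.

```python
import argparse
from typing import Iterable, List, Tuple


def format_quickfix(path: str, issues: Iterable[Tuple[int, int, str]]) -> List[str]:
    """Turn (line, column, message) triples into `file:line:col: message` lines."""
    lines = []
    for line, col, message in issues:
        # quickfix entries are one per line, so collapse any internal whitespace
        one_line = " ".join(message.split())
        lines.append(f"{path}:{line}:{col}: {one_line}")
    return lines


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command line; neither the name nor the flag exists today."""
    parser = argparse.ArgumentParser(prog="check-comments")
    parser.add_argument("files", nargs="+", help="files to check")
    parser.add_argument(
        "--source-code",
        action="store_true",
        help="treat the files as source code and check only their comments",
    )
    return parser
```

With a wrapper script that prints these lines, setting `makeprg` to that script and running `:make` should populate the quickfix list, provided the active `errorformat` accepts the `%f:%l:%c: %m` shape.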

Sadly, I'm only a vim user and I'm not very experienced in vimscript and writing plugins, so I have no clue about deeper integration (though that would be something like syntastic indeed).

I'd love to say I'd sink some time into this, but I'm pretty busy as-is. If you'd be interested in getting comment_parser integrated, I'd suggest I take a shot at integrating detex after that. What do you think?

jxmorris12 commented 4 years ago

Cool. @rubdos I can certainly add a source-code option, if you'd like to mess around with a plugin.