davidlday / vscode-languagetool-linter

A from scratch redesign of LanguageTool integration for VS Code.
https://marketplace.visualstudio.com/items?itemName=davidlday.languagetool-linter
Apache License 2.0
149 stars 11 forks source link

Huge Memory and CPU consumption when linting very big files #324

Open RSWilli opened 3 years ago

RSWilli commented 3 years ago

Describe the bug I opened a big CSV file in VSCode (442122 Lines) and Languagetool linter decided to send the whole file to my local languagetool

Languagetool then consumed all my system resources and will not stop until killed via sudo

To Reproduce Steps to reproduce the behavior:

  1. Open a big file

Expected behavior Languagetool linter should either only send the visible portion of the window, or have a config setting where I can configure the maximum size sent to languagetool which, when exceeded, will not send anything

Screenshots image

Environment & Configuration (please complete the following information):

RSWilli commented 3 years ago

A possible fix would be to extend https://github.com/davidlday/vscode-languagetool-linter/blob/4fd1563a89a528ddd0a35dd599742d3372d963ad/src/ConfigurationManager.ts#L128

to support maximum file length

davidlday commented 3 years ago

Thanks for opening this issue. Totally agree this needs addressed. I have a few thoughts to share.

Most immediately, LanguageTool server already provides a way to limit requests based on size via a Java properties file:

$ languagetool-server --help
Usage: HTTPServer [--config propertyFile] [--port|-p port] [--public]
  --config FILE  a Java property file (one key=value entry per line) with values for:
                 'maxTextLength' - maximum text length, longer texts will cause an error (optional)
                 'maxTextHardLength' - maximum text length, applies even to users with a special secret 'token' parameter (optional)
...

There are many other options in there, but those two apply to this situation. Since the LanguageTool service could be called by other processes, I would (and do) set the option on the service.

That said, I definitely think there's still a problem here, with a few different approaches off the top of my head:

My gut says to:

  1. help make users who are configuring LT externally aware of LanguageTool configuration options
  2. add an option for the managed service for specifying a config file

These are my first thoughts, so I'm open to other suggestions.