castwide / solargraph

A Ruby language server.
https://solargraph.org
MIT License
1.9k stars 159 forks source link

Solargraph reading 500GB/h #9

Closed MatthiasWinkelmann closed 6 years ago

MatthiasWinkelmann commented 7 years ago

I ran into this slight problem using solargraph and felt I should mention it... I noticed it was constantly maxing out the hard disk (SSD's) read bandwidth and had read almost 700GB within about two hours.

I'm afraid it had been doing this for weeks and I didn't notice. These SSDs are just too quiet. At that rate, it would kill any sort of battery performance, and may even have consequences for the SSD's lifespan.

I'm pretty sure it was busy exploring a nested archive of about 2 million small JPGs in about as many folders.

See screenshot... I'm also adding the output of ps ax } grep <pid> and lost in case it helps. screen shot 2017-09-29 at 18 18 05 solargraph-lsof-psax.txt

castwide commented 7 years ago

Thanks for the report. Solargraph shouldn't bother to open binary files at all, let alone try to parse them. I'll fast-track a solution to this one.

castwide commented 7 years ago

@MatthiasWinkelmann, can you tell me what folder you opened in VS Code? For example, was it a folder for a particular project, or was it something bigger, like your home directory? Solargraph assumes that the open folder represents a common workspace, so it'll browse the entire folder for Ruby scripts to add to its maps.

This presents a conflict in use cases. On one hand, you might open a Ruby project's folder in VS Code. In that case, it makes sense to map all of its Ruby scripts, e.g., so autocompletion suggests classes defined in the /lib folder when you're editing a file in the /app folder.

On the other hand, you might open a folder that doesn't represent a single project, simply because you want the convenience of using VS Code's explorer to browse it. In that case, you don't want Solargraph to spend resources mapping the whole directory. Worse, if you open a low-level folder like your home directory, it can find your installed gems and try to map those as well.

I have a couple ideas for solving this problem, but it would help if you could confirm my assumptions or let me know more about what you had open in VS Code.

MatthiasWinkelmann commented 7 years ago

I was in a ruby project. That project includes a 'data' folder with a few million small jpegs. I have /data in the .gitignore, data in VSCode's config files.exclude and "data/**": true in files.watcherExclude.

I'm not firm on CSCode's internals, but it appears to me that, short of maintaining an additional list, files.watcherExclude may come closest to what's needed: .gitignore usually lists library files, which shouldn't be excluded. Same may be true for files.exclude. But the file watcher is probably supposed to pick up on changed libraries. Then again, node_modules is in the default set for that list.

castwide commented 7 years ago

Thanks, @MatthiasWinkelmann.

I think one of the major issues here is the scope and frequency of workspace updates. Modifying a file triggers a complete update of the workspace's ApiMap and yardoc. This can be unnecessarily intensive, especially with large projects.

The coming updates to the Solargraph gem and VS Code extension include the following changes:

Not only should these changes drastically reduce the amount of file I/O required to update maps, it should make suggestion requests faster by spending less time processing ASTs.

castwide commented 7 years ago

It looks like the new ApiMap will eliminate the need to maintain a local yardoc. It retains a complete map of the current workspace and supports single-file updates.

castwide commented 7 years ago

Gem version 0.13.0 and extension version 0.9.0 are released. The gem no longer generates yardocs in open workspaces and is capable of updating individual files without reloading the entire project.