erlang-ls / erlang_ls

The Erlang Language Server
https://erlang-ls.github.io/
Apache License 2.0
628 stars 136 forks source link

Erlang-LS runs thousands of Erlang processes at start-up when plt_path is defined #931

Open seriyps opened 3 years ago

seriyps commented 3 years ago

Describe the bug Maybe not really a bug, but I noticed that Erlang-LS starts a lot of Erlang processes at startup (under els_background_jobs), maybe thousands, and they all are competing for CPU heavily and each one has 100+ MB heap: image

I think this is something Dialyzer related, I checked them via observer and they are mainly busy doing some md5 calculations somewhere in dialyzer code.

image

And those processes finish eventually.

While it's generally ok to have thousands of processes (they will hung in scheduler's queues doing nothing), but I think it leads to high RAM usage at startup (about 5-8GB) and lots of scheduler context switches (when process runs out of reductions).

To Reproduce Just start new Erlang LS session for a medium-sized project. I have plt_path: in my config, but haven't checked what would happen without it. Can try if you think it's important it indeed only happens when plt_path is in the config. No such problem if I remove/coment plt_path.

UPD: it might be because I automatically reopen multiple files from the same project when I restart my emacs - "desktop mde". See comment.

Expected behavior I honestly think having a fixed pool of workers and a task queue would be a bit more efficient (both CPU and RAM). UPD: if it's indeed multiple dialyzer:run jobs running at the same time, I think erlang_ls should allow only one dialyzer job running at a time, because dialyzer:run spawns many processes by itself.

Actual behavior 1000+ Erlang processes competing for CPU and using lots of RAM

Context

seriyps commented 3 years ago

Looking at the code, it mightbe that those are spawned by Dialyzer,

https://github.com/erlang-ls/erlang_ls/blob/23c1acadd190f2d6e47165bee41c83a494b713c2/apps/els_lsp/src/els_dialyzer_diagnostics.erl#L47-L52

but I see dialyzer sets a limit on the number of parallel jobs to 20 schedulers_online, which in my case should be no more than 20 4 = 80.

https://github.com/erlang/otp/blob/b8cc6444878959dff71f55e10637d27e7106dfb7/lib/dialyzer/src/dialyzer_coordinator.erl#L122

Maybe erlang_ls launches multiple dialyzer runs at the same time?

I haven't mentioned it before, but I use Emacs "desktop mode" (it works the same way as browser's "restore open tabs when you restart your browser"), so, when I restart my emacs it could open multiple .erl files from the same project in background.

robertoaloi commented 3 years ago

Thanks for the report @seriyps . Indeed something we need to look at.

robertoaloi commented 3 years ago

@seriyps A dialyzer:run/1 is invoked on open and save. With your desktop mode I guess this means a parallel run for each buffer. You can look at els_diagnostics:run_diagnostics/1 and specifically at els_dialyzer_diagnostics/1. Pooling in this case sounds like a good idea. Also, I'm working on #899 and that seems to make things worse because it runs typer (which runs dialyzer) on open.

seriyps commented 3 years ago

And the dialyzer run itself seems to be not that big of a deal itself. But it looks like each dialyzer:run starts to validate plt, which takes the most time. Maybe we can only run "validate plt" step once? Let me try to add no_check_plt option...

seriyps commented 3 years ago

Ok, with this diff erlang_ls went OOM pretty quickly

diff --git a/apps/els_lsp/src/els_dialyzer_diagnostics.erl b/apps/els_lsp/src/els_dialyzer_diagnostics.erl
index 6552f31..9176bb9 100644
--- a/apps/els_lsp/src/els_dialyzer_diagnostics.erl
+++ b/apps/els_lsp/src/els_dialyzer_diagnostics.erl
@@ -49,6 +49,7 @@ run(Uri) ->
                             , {include_dirs, els_config:get(include_paths)}
                             , {plts, [DialyzerPltPath]}
                             , {defines, defines()}
+                            , {check_plt, false}
                             ])
            catch Type:Error ->
                ?LOG_ERROR( "Error while running dialyzer [type=~p] [error=~p]"

I guess it skipped plt-check step, but then started smth like 20 "type check" steps in parallel, and those particular steps require quite a lot of memory.

seriyps commented 3 years ago

Ok, I made "desktop mode" to only reopen like 10 buffers and tried to restart with and without {check_plt, false}. With check_plt=false the initial burst was smth like 2-3x shorter.

I think it might make sense to do dialyser:run with {check_plt, true} only once at start time (or afger significant changes in the repository) and after the initial validation only call (from on_open/on_save) dialyzer:run with {check_plt, false} - it will be 2x-3x faster (depends on the size of PLT ofcourse).

Also, not run more than 1-2 {check_plt, true} jobs at a time (and probably even the number of runs with {check_plt, false} should be limited, because dialyzer uses lots of RAM and starts worker processes under the hood).

Not sure how important it is. Not many people are using "desktop mode" though. But I find it quite handy.