Closed by myitcv 1 year ago
Thanks for the report. Possibly related: https://github.com/golang/vscode-go/issues/2889.
The CL you bisected to wouldn't have slowed down type checking. However, it would have resulted in many more completion candidates being considered. Before that CL, it was possible for gopls to miss some struct fields in the type currently being selected (!). After that CL we should have no such omissions, but we may also pull in additional candidates. The 20x slowdown you observe is likely due to the parsing cost of building this candidate list; that cost is mitigated the second time around by the parse cache.
This was on my radar due to https://github.com/golang/vscode-go/issues/2889, and is among our highest priorities to address.
Based on my current understanding, you should only experience this in certain packages, and should only experience the 20x slowdown at startup, after which subsequent completions should be faster (though perhaps not as fast as before). Can you confirm if that is the case?
CC @adonovan
Thanks.
Do the numbers in the table above not answer your question re subsequent completions?
Note also the slow speeds even with a daemon running, which suggests another bug?
> Do the numbers in the table above not answer your question re subsequent completions?
Sorry, I mean subsequent completions for other things, not just that one specific source location; during editing do you generally experience a 20x slowdown in completion, or 2.5x slowdown? The table suggests 2.5x, but I wanted to be sure.
> Note also the slow speeds even with a daemon running, which suggests another bug?
Not a bug, but an oversight: the parse cache is per-session, not shared across instances -- the parse cache is meant to be a minor optimization, not the massive optimization experienced here. This is actually strong evidence that the parse cache is involved.
Interestingly, the bug does not immediately reproduce for me in neovim or vs code (I see completions around 100ms), though it does reproduce for Alan. We're investigating why.
> Sorry, I mean subsequent completions for other things, not just that one specific source location; during editing do you generally experience a 20x slowdown in completion, or 2.5x slowdown? The table suggests 2.5x, but I wanted to be sure.
Truth be told I had to revert to the fast version because I was getting too frustrated :)
So unfortunately I don't have any evidence or examples beyond the analysis presented here.
So, we have new evidence, but we lost our reproducer and could use your help.
On Alan's machine, we started gopls with `-debug=localhost:8099` and visited http://localhost:8099/trace. There, we observed (by clicking on `textDocument/completion`) that the longest completion was ~2.6s. Scrolling down to look at the trace, we saw that ~2.5s of that was in `runProcessEnvFunc`. This is the unimportedCompletion logic scraping the module cache. Unfortunately, we deleted our backup of the huge mod cache, and can no longer reproduce :facepalm:.
Could you do the following, using gopls@tip (or at the culprit)?

- start gopls with `-debug=localhost:8099`
- reproduce, and look at http://localhost:8099/trace. Please share the "Longest" trace for completion, which you can see by clicking on `textDocument/completion` and scrolling down
- try setting `"completionBudget": "1ms"` to see if it makes a difference (EDIT: not 0s, as 0 => infinity)
- try setting `"completeUnimported": false` to see if it makes a difference

Sorry to enlist you, but absent a reproducer this is hard to track down. The steps above will help confirm our theory.
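For reference, those two experiments correspond to the following gopls settings (sketched here in a VS Code-style settings fragment; the exact nesting depends on your editor, and you would normally try one at a time):

```jsonc
{
  "gopls": {
    // Cap how long completion may spend after initial results ("0s" means unlimited).
    "completionBudget": "1ms",
    // Disable the module-cache scan for unimported packages entirely.
    "completeUnimported": false
  }
}
```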
> So, we have new evidence, but we lost our reproducer and could use your help.
Great, happy to help.
All of my response is based on analysis using https://github.com/golang/tools/commit/866a6b0ff32a.
> - start gopls with `-debug=localhost:8099`
> - reproduce, and look at http://localhost:8099/trace. Please share the "Longest" trace for completion, which you can see by clicking on `textDocument/completion` and scrolling down
This is what I get, although for some reason I don't have the `runProcessEnvFunc` span details:

```
05:40:11.400 end textDocument/completion (+2.073471033s) method="textDocument/completion" direction="in" id="#2"
05:40:11.400 event (+2.073470075s) label= status.code="OK"
05:40:09.327 start queued
05:40:09.327 end queued (+51.833µs)
05:40:09.327 start textDocument/completion method="textDocument/completion" direction="out" id="#3"
05:40:11.400 end textDocument/completion (+2.072909241s) method="textDocument/completion" direction="out" id="#3"
05:40:11.400 event (+2.073332116s) label= status.code="OK"
```
> - try setting `"completionBudget": "1ms"` to see if it makes a difference (EDIT: not 0s, as 0 => infinity)
Setting it to `"1ms"` did not influence the completion time:

```
[Trace - 05:43:19.927 AM] Received response 'textDocument/completion - (2)' in 2026ms.
```

This feels like a separate bug?
> - try setting `"completeUnimported": false` to see if it makes a difference
Bingo (note I reverted the `completionBudget` setting, i.e. no value set in my config):

```
[Trace - 05:44:32.173 AM] Received response 'textDocument/completion - (2)' in 24ms.
```
Which triggers an interesting thought for me. I would absolutely not have expected the edit and completion that I'm performing in this situation to trigger unimported completions. Because they are so expensive, I would only have expected them to trigger in the following scenario:

```go
blah._
```

with the cursor at the position `_`, where `blah` is calculated to be unresolved in the current scope, and hence `blah.` is likely an attempt to start, and now complete, a qualified identifier.
Otherwise, a trivial completion attempt in a "hello world" example:

```go
package main

func main() {
	str_
}
```

takes ~2000ms from cold and ~200ms warm (same session, ignoring the daemon point for now).
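The gating described above could be sketched roughly as follows (a hypothetical illustration only, not gopls's actual trigger logic; it checks only named imports and file-level declarations, ignoring local scopes):

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"path"
	"strings"
)

// wantUnimported reports whether completion after "name." should
// consider unimported packages: only when name resolves to nothing in
// the file, so "name" is plausibly a package not yet imported.
func wantUnimported(src, name string) bool {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		return false
	}
	// Already imported, either under an explicit name or its path base?
	for _, imp := range f.Imports {
		if imp.Name != nil {
			if imp.Name.Name == name {
				return false
			}
		} else if path.Base(strings.Trim(imp.Path.Value, `"`)) == name {
			return false
		}
	}
	// Declared at file scope?
	for _, d := range f.Decls {
		switch d := d.(type) {
		case *ast.FuncDecl:
			if d.Name.Name == name {
				return false
			}
		case *ast.GenDecl:
			for _, spec := range d.Specs {
				switch s := spec.(type) {
				case *ast.ValueSpec:
					for _, id := range s.Names {
						if id.Name == name {
							return false
						}
					}
				case *ast.TypeSpec:
					if s.Name.Name == name {
						return false
					}
				}
			}
		}
	}
	return true // unresolved: maybe the start of a qualified identifier
}

func main() {
	src := "package main\n\nimport \"fmt\"\n\nvar blah int\n"
	fmt.Println(wantUnimported(src, "blah"))  // false: declared in the file
	fmt.Println(wantUnimported(src, "fmt"))   // false: already imported
	fmt.Println(wantUnimported(src, "blahx")) // true: unresolved
}
```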
Re questions about the VM setup: last time I checked, my disk performance through VMWare Fusion was faster than native macOS. Incidentally, if unimported completions are dominating as we think they are, you might find that macOS (and Windows?) users have been disproportionately reporting speed problems, especially where those users have large module caches. It might also be worth asking people to report the size of their mod cache for reference, unless you could lazily compute and report it in log files.
Here is mine:

```
$ du -sh $(go env GOMODCACHE)
36G	/home/myitcv/gostuff/pkg/mod
$ du -sh $(go env GOMODCACHE)/cache
7.9G	/home/myitcv/gostuff/pkg/mod/cache
```
(because, as I understand it, the `cache` subdirectory does not participate in the unimported scan: it contains zip files and bare git trees).
In summary, from my perspective:

- Caching is not shared between `gopls` instances: https://github.com/golang/go/issues/47615. This lack of caching between sessions in the presence of a daemon is, I guess, also the likely cause of the timings I've seen above?
- `completionBudget` is not honoured, per above.

Thanks for your response. The bug is understood now. Because of a change to how the completion budget is implemented, a check for context cancellation that would have prevented unimported completion from running was eliminated.
The unfortunate/problematic thing is that this existed for multiple months without being caught. We do have several benchmarks for completion, but they were insensitive to this change because they don't capture the initial cost, and because they operate in a fresh module cache. We've seen a few reports of slow completion in our survey results and issues, and I had it on my list to revisit before the next release, but it would have been hard to home in on this without your clear reproducer. Thank you again.
I'll note that this is a perfect example of where telemetry would have highlighted the regression -- we would have seen it in the release, and might even have root-caused it quickly by observing that it affected darwin more than linux.
A few comments:
You can expect that this will be fixed, and soon. In addition to simply fixing the cancellation logic, there appears to be a lot of low-hanging fruit that can make completion significantly faster.
The complicated part is not the fix itself, but rather the process that led to it.
Thank you for such a wonderfully detailed, comprehensive response. Delighted to help in whatever small way I can, even if that is just trying different scenarios etc.
Change https://go.dev/cl/530015 mentions this issue: gopls/internal/regtest/bench: add additional completion benchmarks
Change https://go.dev/cl/530016 mentions this issue: gopls/internal/lsp/source/completion: start timing before type checking
Change https://go.dev/cl/530215 mentions this issue: gopls: instrument telemetry for latency of important operations
Change https://go.dev/cl/530017 mentions this issue: gopls/internal/lsp/source/completion: fixes for completion budget
@myitcv I believe gopls@master should now have similar latency to what you observed in April. Please let me know if it doesn't.
However, I'm not closing this issue because there appeared to be a lot of low-hanging fruit to further optimize significantly. For example, I analyzed one of our completion benchmarks on Kubernetes, and observed that type checking was only 65ms, while completion took 170ms. I think we can get completion latency much closer to type checking latency.
Coincidentally, the potential optimizations I see are almost identical to what we did for workspace/symbol requests:
> I believe gopls@master should now have similar latency to what you observed in April. Please let me know if it doesn't.
Thanks, just took a look.
Still seeing similar numbers to before (https://github.com/golang/tools/commit/e2393aba883a) actually: ~20x slowdown for the initial completion, and ~2.5x slowdown for "warm" completions:

| golang.org/x/tools Commit | Cold start, no daemon | Same completion, repeated after cold start | Fresh session with daemon running from previous session |
|---|---|---|---|
| https://github.com/golang/tools/commit/ac2946029 | 102ms | 102ms | 102ms |
| https://github.com/golang/tools/commit/fb7463ac1de5 | 2032ms | 243ms | 2069ms |
Do I need specific settings to make this "work"? (I tried setting `"completionBudget": "100ms"` but that did not affect things.)
One of the big pain points here is the ~2s at startup, with or without a daemon already running. Vim is just one of those tools that you open and close the whole time, so the daemon-based setup is critical.
That said, the ~230ms completion is very noticeable vs ~100ms too.
> Coincidentally, the potential optimizations I see are almost identical to what we did for workspace/symbol requests:
Did my suggestions in https://github.com/golang/go/issues/62665#issuecomment-1722142899 also make sense with regards to optimisations? Because whilst a different kind of optimisation, it feels like we want to avoid unimported completions as much as possible, and when "forced" to include them ensure that we have warm caches in the daemon (as you previously observed).
Change https://go.dev/cl/530599 mentions this issue: gopls: set a context deadline after minimal completion results
@myitcv can you try again at tip? Sorry to make you keep trying, but I can't reproduce and want to make sure we've successfully rolled back the regression before including additional optimizations.
> Did my suggestions in https://github.com/golang/go/issues/62665#issuecomment-1722142899 also make sense with regards to optimisations? Because whilst a different kind of optimisation, it feels like we want to avoid unimported completions as much as possible, and when "forced" to include them ensure that we have warm caches in the daemon (as you previously observed).
You mean the suggestion not to run unimported completions in various contexts where it doesn't make sense? That's definitely a good idea, but I've concluded that it's not at the heart of the problem you observe (see below).
Upon staring at the code and experimenting, I've drawn the following conclusions:

- completion (when `"deepCompletion": false` is set) is generally pretty fast following type checking.

Therefore, I think we should close this bug once we've verified a fix for the regression, and open new bugs for various completion logic improvements, which would constitute a larger project.
Good news! Tested against https://github.com/golang/tools/commit/4b34fbf5f10d and we appear to be back in business!
| golang.org/x/tools Commit | Cold start, no daemon | Same completion, repeated after cold start | Fresh session with daemon running from previous session |
|---|---|---|---|
| https://github.com/golang/tools/commit/ac2946029 | 102ms | 102ms | 102ms |
| https://github.com/golang/tools/commit/fb7463ac1de5 | 2032ms | 243ms | 2069ms |
| https://github.com/golang/tools/commit/4b34fbf5f10d | 102ms | 102ms | 102ms |
Thanks very much for digging into this, @findleyr.
I'll leave you to close the issue or keep open as you see fit.
Fantastic, thanks for confirming. I've opened #63263 to follow up on deep completion performance.
I was sat next to @myitcv earlier when he upgraded gopls and gave it a try - you should have seen his grin from ear to ear!
### What version of Go are you using (`go version`)?

### Does this issue reproduce with the latest release?

Yes

### What operating system and processor architecture are you using (`go env`)?

`go env` Output

### What did you do?

The line starts as:

Change that to:

Then:

(with the cursor in the `_` position), then attempt completion.

### What did you expect to see?

Fast results.

### What did you see instead?

Slow results.
The following table shows some pretty accurate wall time numbers for the completion call returning. For reference, my machine is an M1 Max 64GB memory, running Linux via VMWare Fusion.
| golang.org/x/tools Commit | Cold start, no daemon | Same completion, repeated after cold start | Fresh session with daemon running from previous session |
|---|---|---|---|
| https://github.com/golang/tools/commit/ac2946029 | 102ms | 102ms | 102ms |
| https://github.com/golang/tools/commit/fb7463ac1de5 | 2032ms | 243ms | 2069ms |

Log files for each result from the first column of timings:
I have bisected this slowdown to https://github.com/golang/tools/commit/41e4e5654988.
This appears to highlight a couple of things to me:
cc @findleyr