golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.33k stars 17.7k forks source link

x/tools/gopls: improvements to loading behavior #68002

Open findleyr opened 5 months ago

findleyr commented 5 months ago

This issue aggregates some work I'd like to do to refine the go/packages.Load calls made by gopls. Generally speaking, these fall into the category of "using gopls' knowledge of the workspace state to refine loading patterns". This should benefit users across the board, but may in particular help for users of go/packages drivers that have different performance characteristics than the go command.

A word on terminology: we refer to the packages data returned by go/packages.Load as package metadata.

Considering only the simple case of a module workspace, metadata loading works approximately as follows (some edge cases are ignored). All loads are inclusive of -deps.

  1. On startup, load <modulePath>/...
  2. When a file change affects metadata, expand that change to all "possibly affected" packages, which includes
    • all packages containing the file
    • if the package clause changed, all packages with files in the same directory as the file
    • all packages that have import cycles (it's too hard to determine whether the change could have resolved cycles)
  3. When metadata is invalidated for a package, also invalidate its importers, recursively (the reverse transitive closure)
  4. When processing a file-oriented request such as completion, if we have no package metadata containing that file, load a file= query (but keep track of files that can't be loaded so we don't keep trying).
  5. When processing a workspace-wide request such as references, load all the packages invalidated by steps (2) and (3) with a query containing all the invalidated package paths.

This loading pattern evolved to ensure we don't miss metadata changes, and can operate on a file as soon as possible. Notably, it was only chosen based on the go command behavior, and generally works because go list is heavily optimized. Even loading a relatively large project such as Kubernetes takes only a few seconds.

However, there are some improvements to be desired:

We should experiment with these improvements. I believe they should be of particular help for users of a bazel go/packages drivers, as bazel queries have significantly higher overhead than the go command.

CC @adonovan @JamyDev

gabyhelp commented 5 months ago

Similar Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

li-jin-gou commented 4 months ago

I'm responsible for a system that generates a lot of rpc automated code that exists as a remote package that is slow to load and slow to compile. I'm currently working on a GoPackageDriver to automatically generate the code locally at the user's site, which would help a lot if the speed could be increased. ❤️

gopherbot commented 2 months ago

Change https://go.dev/cl/611843 mentions this issue: gopls: join concurrent package batch operations

findleyr commented 2 weeks ago

I got this working, and it was quite a bit harder than expected. Furthermore, the change in loading behavior has approximately no impact on performance when used with go list. A few notes:

Therefore, I'm not sure whether this is worth doing. It's a lot of complexity for little gain. I think the more impactful change is to be able to run completion on stale metadata: I propose we skip straight to that.

JamyDev commented 2 weeks ago

gopls currently overwrites package metadata with all newly loaded data (assuming the new data is better than the old data) Is there some way of versioning the data that we don't have to replace everything? Is this currently done with some hashing mechanism? Reason I ask is that in packagesdriver mode, I'm not sure how it would differentiate a stale package in the tree from an unchanged one.

go list has approximately the same performance when invoked with/without -deps=true. This is true in packagesdriver mode as well. There's a good chance that, unless already resolved, a package would require a costly resolve where the Imports is available anyways.

able to run completion on stale metadata This would be amazing!