sourcegraph / scip-go

SCIP indexer for Golang
Apache License 2.0
31 stars 10 forks source link

Support for other build systems through GOPACKAGESDRIVER #3

Open Tatskaari opened 1 year ago

Tatskaari commented 1 year ago

Hey! We use Please, which means that we don't have a go.mod in the root of our repo. Unfortunately this tool assumes that it can use go list and other go build specific commands during initialisation. Other users might be using Bazel or similar, and will encounter similar problems, where these commands won't necessarily work as expected or at all.

Fortunately, golang.org/x/tools/go/packages supports plugging in a different "go list driver" (example), so in theory, the indexer should be able to glean all the information it needs. This tool likely would need to be re-worked so that it doesn't require as much information up front about the "module" and it's dependencies, as it may not be indexing a module.

In our case, we don't even have a base import path. We work in a giant monorepo where the "go path" root is the repo root. We can still service import path wildcards producing all the information that go list --json would (i.e. resolving import paths to other packages), just the module field will be empty.

tjdevries commented 1 year ago

Thanks for making a new issue here :)

Do you know of any open source projects using please to build a go project? I'd love to take a peak at those and explore. If not, I'll do some looking around this week and see what I can dig up.

Tatskaari commented 1 year ago

Hey! There are a few around: https://github.com/banzaicloud/pipeline https://github.com/tcncloud/wollemi https://github.com/thought-machine/falco-probes

We haven't gotten the go list driver out the door though, so you might have better luck with Bazel: https://github.com/bazelbuild/bazel-gazelle

tjdevries commented 1 year ago

Ah, so there exists a go list driver for Please? (just not currently available to be run outside of your environments?)

I'll check out the stuff for bazel as well. Thanks

Tatskaari commented 1 year ago

Ah, so there exists a go list driver for Please?

Yeah, there's a prototype but it's not fit for purpose. I was trying to understand how this stuff works, and got the implementation fundamentally wrong. I think I know what to do now but haven't gotten around to it.

tjdevries commented 1 year ago

cool, i'll look into using an alternate driver and what we would need to do.

In this iteration, we are using the Module information on the packages, do you think it is still possible to populate the Module information with the Path and Version info? This is what I'm using currently to generate the symbols that are used to do cross-repo navigation.

https://sourcegraph.com/github.com/sourcegraph/scip-go/-/blob/internal/symbols/symbols.go?L91-100&subtree=true

tjdevries commented 1 year ago

Ah, I think I see the problem more clearly now -- sorry, I must have misunderstood at the beginning. The primary thing I need to remove are any calls to go list in the indexer -- either by using package.Load directly or allow allowing explicitly passing the information to the tool.

Looking into that now :)

JamyDev commented 1 year ago

Hey @tjdevries, I played around with scip-go a little at Uber, and I think if we can just make those go list pieces of data configurable from the cli args, that would likely be good enough to support other build systems.

JamyDev commented 1 year ago

I think one of the easier ways to test with a Go packages driver is to run against rules_go. It has a packagesdriver in tools/gopackagesdriver.sh, so you can just set the following env variables:

GOPACKAGESDRIVER_BAZEL_QUERY_SCOPE=//... GOPACKAGESDRIVER=tools/gopackagesdriver.sh

(Not defining the query_scope one will make it just spit out the stdlib, which could be useful too)

Tatskaari commented 1 year ago

I think if we can just make those go list pieces of data configurable from the cli args, that would likely be good enough to support other build systems.

This might be a me problem, but I don't have a module name to provide. If you absolutely need to know the root import path for the codebase, we can probably just do a big refactor.

JamyDev commented 1 year ago

This might be a me problem, but I don't have a module name to provide. If you absolutely need to know the root import path for the codebase, we can probably just do a big refactor.

No this is a fair usecase, if you don't have a module name then we need to get it from somewhere!

tjdevries commented 1 year ago

@Tatskaari Are you saying you don't have a module name for the project as a whole, but it's composed of smaller modules (where some depend on each other)?

In that case, I think we should be able to just override the modulename with ., which is kind of a placeholder for saying "private module" in scip. If we did this, then we could just skip running go list to try to guess what the current module is. Otherwise, we only use the builtin packages.Loader(...) which should respect environment variables for loading different styles of projects.

(sorry, I was traveling last week so I haven't had a chance to investigate this aspect of the project much more yet)

Tatskaari commented 1 year ago

Maybe an example will help. So we have a repo that vaguely has the following structure:

vault
|- core
  |- contracts
    |- grpc
      |- grpc.go
    |- main.go

common
  |-go
    |- db
      |- db.go
...

In vault/core/contracts/main.go, we might have:

package main

import (
    "common/go/db"
    "vault/core/contracts/grpc"
)

func main() {
    server := grpc.NewServer(db.NewDB())
}

These imports are resolved relative to the root of the repo. Perhaps we can pass in "." as the module name? I'm not sure how that will work as the import paths don't start with "." or anything.

We do actually want to refactor this to include a root import path so it might become thoughtmachine.net/core3/common/go/db, but it's a big refactor.

tjdevries commented 1 year ago

Hey, sorry for the long delay -- I think I'm at least pretty close to making it possible to skip all of the direct shelling to go with passing a set of command line arguments to scip-go.

If you do something like:

scip-go --module-name=NAME --module-version=VERSION --go-version=go1.X.Y

And possibly use the GOPACKAGESDRIVER to provide ways to load the dependencies, does that work for you?

(again, sorry for the delay. Got side tracked on a few other projects. I'm hoping to grab some time to fixup any problems here)

Tatskaari commented 1 year ago

No worries, things are pretty chaotic here too but will try and give this a go soon. Thanks for the hard work :slightly_smiling_face:

tjdevries commented 1 year ago

If you can try on any open source repos that are representative of some of the stuff that you all are doing, that'd be super awesome -- that way I can try it more on my end and iterate towards a solution without having to bother you a ton.

We're experimenting with bazel for building sourcegraph, so I might be able to try and run it locally against sourcegraph's repos, which would be a good test case for this.

ciarand commented 1 year ago

I have a test case repo if that's still helpful: https://github.com/aalyria/api

This builds with bazel 6.2 and uses the GOPACKAGESDRIVER trick to support the gopls LSP. Here's an example query without GOPACKAGESDRIVER being set. Note that all the references are in the same file:

$ gopls references ./cdpi_agent/agent.go:53:6
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:105:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:117:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:128:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:170:10-15
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:189:10-15
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:35:36-41
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:38:25-30
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:40:33-38
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:66:38-43
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:67:8-13
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:80:10-15
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:93:30-35

Here's with the GOPACKAGESDRIVER var set:

$ GOPACKAGESDRIVER=$PWD/tools/gopackagesdriver.sh gopls references ./cdpi_agent/agent.go:53:6
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:105:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:117:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:128:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:170:10-15
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:189:10-15
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:35:36-41
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:38:25-30
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:40:33-38
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:66:38-43
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:67:8-13
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:80:10-15
/home/ciaran/github.com/aalyria/api/cdpi_agent/agent.go:93:30-35
/home/ciaran/github.com/aalyria/api/cdpi_agent/node_controller.go:47:10-15

As expected, scip-go doesn't work (or at least doesn't generate anything useful):

$ GOPACKAGESDRIVER=$PWD/tools/gopackagesdriver.sh scip-go --module-name=aalyria.com/spacetime
scip-go
Resolving module name
  Go standard library version:  go1.20
  Resolved module name       :  aalyria.com/spacetime

Configuration:
  Skip Implementations: false
  Skip Test           : false
Loading Packages
Visiting Packages
Indexing Implementations
Visiting Project Files:
$ scip stats index.scip
{
  "documents": 0,
  "linesOfCode": 0,
  "occurrences": 0,
  "definitions": 0
}
crackcomm commented 10 months ago

Updating go.mod: golang.org/x/tools v0.16.1 - almost works but packages don't have populated modules:

error: during package loading: Should not be possible to have nil module for userland package: @//colab/examples:examples_lib github.com/crackcomm/colab/colab/examples

I tried fixing it. Including . in packages.Load populates Module field but doesn't resolve the generated code. I also tried including everything in load mode with no success. If I populate Module-s using pkg.PkgPath, the index includes both the generated and source code but with no version.

It might be that bazel package driver doesn't provide this information, I'm not sure.

podtserkovskiy commented 22 hours ago

Hey folks, I work at Meta on Buck2 integration for Go. We use Glean to store facts about source code. Glean supports SCIP format and I would like to integrate scip-go with Buck2. We have GOPACKAGESDRIVER for Buck2 (not open-sourced yet), so in theory it should work with scip-go. When I've tried to use it with scip-go I discovered few issues:

Would you mind if I make a change to enable supplying target patterns via cli args as other tools support, e.q. golang.org/x/tools/cmd/callgraph? As a result by default scip-go will use "./..." as a target pattern, but a user will be able to optionally specify exactly what they want to index, for example

scip-go ./...
scip-go std
scip-go foo//bar:baz
scip-go pattern=foo//... file=bar/baz.go

These patterns are described in golang.org/x/tools/go/packages doc.

The good news is when I hack these things locally scip-go produces a valid index that matches with the index produced by a similar go.mod project. 😀