bazel-contrib / rules_go

Go rules for Bazel
Apache License 2.0

Using rules_go with go modules and generated packages #2262

Open robbertvanginkel opened 5 years ago

robbertvanginkel commented 5 years ago

First, I'm not sure if I should file this on rules_go, bazel-gazelle or golang. Please let me know if there's a better forum.

The experience of building go code with rules_go/gazelle works pretty well in general. Unfortunately, when using both go modules for dependency management and rules for autogenerating go packages (such as go_proto_library or gomock), the experience breaks down a bit.

Consider the following project:

--- BUILD.bazel ---
load("@bazel_gazelle//:def.bzl", "gazelle")

# gazelle:prefix github.com/example/project
gazelle(name = "gazelle")

--- cmd/main.go ---
package main // import "github.com/example/project/cmd"

import "fmt"

func main() {
       fmt.Println("Hello!")
}

With a standard workspace file (I tested go1.13.3, rules_go v0.20.1, gazelle 0.19.0, bazel 1.1.0) it is straightforward to get the program running:

$ bazel run //:gazelle
$ bazel run //cmd
Hello!

Adding a dependency with go mod is also straightforward:

$ git diff cmd/main.go
diff --git a/cmd/main.go b/cmd/main.go
index d938538..55e7774 100644
--- a/cmd/main.go
+++ b/cmd/main.go
@@ -1,7 +1,11 @@
 package main // import "github.com/example/project/cmd"

-import "fmt"
+import (
+       "fmt"
+
+       "github.com/gofrs/uuid"
+)

 func main() {
-       fmt.Println("Hello!")
+       fmt.Printf("Hello %v!\n", uuid.Must(uuid.NewV4()))
 }
$ go get github.com/gofrs/uuid@latest
$ go mod tidy
$ bazel run //:gazelle -- update-repos -from_file=go.mod
$ bazel run //:gazelle
$ bazel run //cmd
Hello 03ec1161-9f7d-43f2-80d3-95522e517d7a!

When starting to consume some generated code, at first all seems fine:

diff --git a/cmd/main.go b/cmd/main.go
index 55e7774..4d3fae0 100644
--- a/cmd/main.go
+++ b/cmd/main.go
@@ -3,9 +3,11 @@ package main // import "github.com/example/project/cmd"
 import (
        "fmt"

+       "github.com/example/project/proto"
        "github.com/gofrs/uuid"
 )

 func main() {
        fmt.Printf("Hello %v!\n", uuid.Must(uuid.NewV4()))
+       fmt.Printf("Hello %v!\n", proto.Polyglot{})
 }

Where github.com/example/project/proto is a generated golang package:

--- proto/BUILD.bazel ---
# stub rule for generated go code, in practice imagine proto/gomock rules here
genrule(
    name = "genproto",
    cmd = "echo 'package proto\n\ntype Polyglot struct{}' > $@",
    outs = ["proto.go"],
)

After generating the rules everything seems to work fine:

$ bazel run //:gazelle
$ bazel run //cmd
Hello 03ec1161-9f7d-43f2-80d3-95522e517d7a!
Hello {}!

But when you later try to add a new dependency to your project, go get will still work, while go mod tidy will start throwing errors like:

$ go get golang.org/x/text
$ go mod tidy
github.com/example/project/cmd imports
    github.com/example/project/proto: git ls-remote -q https://github.com/example/project in /Users/robbert/gocode/pkg/mod/cache/vcs/48e4a55da23b18d4dd53d568f6a9a78ee3195ecd4c570168b7f17b2d37a13a26: exit status 128:
    fatal: could not read Username for 'https://github.com': terminal prompts disabled
Confirm the import path was entered correctly.
If this is a private repository, see https://golang.org/doc/faq#git_https for additional information.

So far, we'd come up with the following to get around this:

As far as I've read into the design and code of go modules, it seems pretty tightly coupled with the default go build system. That's unfortunate: we like Bazel for its codegen/caching features, but not being able to use the standard go modules dependency resolution together with generated go packages is a bit sour.

Are there any known workarounds to this? Maybe there is a way rules_go/Bazel could inform go modules about which packages are expected to exist? Or would we need to open an issue with golang to see if the modules functionality can somehow be exposed for use with 3rd-party build systems?

priyendra commented 5 years ago

I have been using the dummy.go trick as well.
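
The "dummy.go trick" here refers to checking in a tiny hand-written placeholder file next to the generated code so that the go tool sees a non-empty package (the same idea as the empty.go approach described further down in this thread). A minimal sketch, assuming the proto package from the example above and a gazelle exclude so Bazel ignores the file:

// proto/dummy.go: hypothetical placeholder, excluded from the Bazel build
// with a "# gazelle:exclude proto/dummy.go" directive. It exists only so
// that go mod tidy and friends see a non-empty package at
// github.com/example/project/proto.
package proto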

jayconrod commented 5 years ago

So if I can summarize a bit, the issue is that you depend on a package that only contains generated code (github.com/example/project/proto), but go mod tidy and other Go commands report errors when you import that package because it doesn't contain any static .go files.

You're correct that Go modules are very much integrated into the go command. Code generation is very much not integrated into the go command. There is go generate, but that's not part of the regular build, and it only works in the main module. That does make generated code somewhat difficult to handle.

There are a number of workarounds, some of which you've already found.

Beyond that, I'm not sure I have a generally good solution to recommend for you. Kubernetes ran into some of the same issues. Bazel appealed to them because it seemed like they could remove a lot of their generated code from their repo, but non-Bazel users still needed to import their packages, so they were never really able to do that.

I'm open to solutions on the Gazelle side that don't diverge too far from what the go command does. Currently, gazelle update-repos -from_file=go.mod runs go list -m -json all to gather information about modules in the build list, then translates that into go_repository rules. After that, Gazelle has fairly minimal interaction with modules.

It's unlikely that fully general code generation will be integrated into go build. That would mean Go would need to build and execute tools written in other languages. It might need to interact with other dependency management systems. I think we'd end up with a worse version of Bazel if we followed that path.

robbertvanginkel commented 5 years ago

That pretty much sums it up.

To add some clarifications on our situation: the repository we use the go.mod file in is an internal monolithic repository with a collection of service and library code. The goal of using go modules is to manage a single version of the dependencies for all projects in it. There is no intention of making this module importable, so the generated code for non-Bazel users isn't a major concern for us.

Manually editing the module file could be a possibility, but manually having to trim the module and sum files would be error prone with a large group of developers. I guess what I'm looking for is some way to do maintenance like go mod tidy, but with information about the existing packages and imports coming from Bazel/rules_go rather than go build.

jayconrod commented 5 years ago

You may want a custom tool for this. At one point, I wanted gazelle update-repos to be able to do this kind of thing, but it seemed like in the general case, there would be scaling and correctness problems with a large number of repos.

There are primitives available you may find useful.

blico commented 4 years ago

@jayconrod

I would like to hear your feedback on another approach we have thought of for our monorepo, to help go modules work with generated packages.

So, like you said, the source of our problem is that go mod tidy reports errors when some code within a module imports a package that doesn't contain any static .go files. Meaning, if we never feed those packages to go mod tidy, then it will not complain.

The idea is that we can filter all of the imports in our repo to only what go mod tidy cares about, and then place the filtered imports into a single .go file, let's call it imports.go. Our filtering function for imports.go should remove all internal imports (including the imports of generated code), and leave us with only external imports. It essentially will look like this:

 (all imports in every go file) - (all importpath attrs known to bazel under //...)

We then can move our go.mod/go.sum outside of the main module root and into a directory containing only imports.go and tools.go. Within this "shadow module" is where commands such as go get -d and go mod tidy will be run. Before go mod tidy is run, we will have to make sure imports.go is up to date.

With this "shadow module" approach, we will be able to use go modules for dependency management, and bazel for building and code generation.
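
For illustration only (the package name and the specific import paths below are made up), the generated imports.go in the shadow module could simply blank-import every external package that survives the filter:

// imports.go: output of the hypothetical filtering step described above.
// Blank-importing each external package keeps the file compilable while
// giving go get and go mod tidy the full set of external dependencies.
package imports

import (
	_ "github.com/gofrs/uuid"
	_ "golang.org/x/text/language"
)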

Known limitations:

  1. go build in module mode will not work - this is ok for us, as we use either bazel build or vendored go build in gopath mode
  2. our source module will not be importable from other repos - this is fine because we have a monorepo
  3. go modules will not know about generated code's external dependencies - this is already the case for us today, but if it ever becomes an issue, we could modify our filtering logic to include these imports

Are there any other limitations or gotchas you think we may be missing?

jayconrod commented 4 years ago

@blico I think that will work.

How are you planning to list all imports? I can think of a couple of different ways. Something built with bazel query 'deps(//...)' might work, printing the importpath attribute for every library. An aspect would work, too, though it's more complicated.

Once you have that list, using an imports.go / tools.go file in a shadow module should work fine.
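
For completeness, tools.go here refers to the common convention of blank-importing build tools from a file guarded by a build tag, so that their module requirements are tracked in go.mod without affecting normal builds. A typical sketch (the imported tool is only an example):

//go:build tools

// tools.go: never built as part of a normal build; it only exists so that
// go mod tidy keeps the imported tool's module in go.mod.
package tools

import (
	_ "golang.org/x/tools/cmd/stringer"
)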

blico commented 4 years ago

Our idea right now is to:

  1. bazel query --output=proto '//...:*' to find all of the non-external srcs and importpath attributes known to Bazel.
  2. Read all of the imports from the .go srcs gathered in the first query (using go/parser; see the sketch after this list)
  3. From the imports collected in the second step, filter out any imports that also were found in the first step
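
A minimal sketch of step 2, assuming the list of .go files comes from the bazel query in step 1 (the function name and the hard-coded path are illustrative):

package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// fileImports returns the import paths declared in a single .go source file.
// parser.ImportsOnly stops parsing after the import declarations, which is
// all this step needs.
func fileImports(path string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, path, nil, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	// In the real tool, the file list would come from the bazel query output.
	paths, err := fileImports("cmd/main.go")
	if err != nil {
		panic(err)
	}
	fmt.Println(paths)
}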

Because we are using Bazel as our source of truth with this approach, an added requirement for users is they will need to run gazelle before running go mod tidy, whereas previously it was possible for them to only run go mod tidy.

c4milo commented 3 years ago

@blico, @robbertvanginkel, are you still using the same approach? I'm bumping into this as well.

linzhp commented 3 years ago

Yes, we are still using @blico's approach (I am in the same team as @blico and @robbertvanginkel).

clstb commented 3 years ago

A solution that I came up with to this problem is the following:

Suppose we have a directory of proto files. Add a file gen.go to it with following content:

package pb

//go:generate protoc --go_out=module=<dir>:. --go-grpc_out=module=<dir>:. some.proto

Add *.go and BUILD.bazel files for that directory to .gitignore (the BUILD file because it will depend on the generated code). Add an exclusion rule for gen.go. Depending on the structure of your monorepo, you can get away with just a couple of lines.

Disable proto rule generation in gazelle using # gazelle:proto disable in the root BUILD.bazel.

Now to get everything running for a freshly cloned repo do the following:

go generate ./...
go mod tidy
bazel run //:gazelle -- update-repos -from_file=go.mod -to_macro=repositories.bzl%go_repositories
bazel run //:gazelle
bazel build //...

This solution should work for any code you can generate using a //go:generate comment. Another benefit is that all LSP-related things just work; the GOPACKAGESDRIVER is still experimental and I had quite some problems with it.

Note that this pushes dependency management of code generators to the underlying system (your CI or Container this runs in).

Ideally there should be a native solution for this problem but for the moment this is simple and robust.

ql-owo-lp commented 2 years ago

After investigating all of the above solutions, I think using the stackb rules is best for me: https://github.com/stackb/rules_proto

Just use the proto_compiled_sources rule, then use the following script to update protobuf generation (assuming all protobuf files are stored in the proto directory).

    echo "Cleaning up existing generated protobuf files..."
    #find "proto/" -name BUILD.bazel -delete # uncomment if you rely on gazelle to generate rules.
    find "proto/" -name "*.pb.go" -delete
    echo "Generating bazel BUILD rules..."
    bazel run //:gazelle
    echo "Compiling protobuf files..."
    bazel query "kind('proto_compile rule', //proto/...)" | tr '\n' '\0' | xargs -0 -n1 bazel build
    bazel query "kind('proto_compile_gencopy_run rule', //proto/...)" | tr '\n' '\0' | xargs -0 -n1 bazel run
    bazel query "kind('proto_compile_gencopy_test rule', //proto/...)" | tr '\n' '\0' | xargs -0 -n1 bazel test

In my case, neither my upstream repositories nor my downstream repositories use bazel (yeah, I am the only one promoting it due to my past Googler experience), so proto_compiled_sources provides the most compatibility. With that, bazel can live with go mod tidy without problems.

uhthomas commented 2 years ago

Currently we're just adding empty Go files to generated packages and excluding said files from Gazelle. The empty Go files do not have build constraints (such as //go:build ignore), as the Go toolchain would otherwise still consider the package empty. Not great, but not terrible either. It would be really nice for Gazelle to handle things better.

empty.go:

package mock

BUILD.bazel:

# gazelle:exclude **/mock/empty.go

One thought I had: I wonder if it's possible to use the GOPACKAGESDRIVER (https://github.com/bazelbuild/rules_go/issues/512) with gopls to tidy modules?

andruwm commented 2 years ago

@blico Is the tool you use open source?