bazelbuild / rules_go

Go rules for Bazel
Apache License 2.0
1.37k stars 649 forks source link

bug(rules/compilepkg): cgo2 is invoked _twice_ when using coverage instrumentation #4110

Open srosenberg opened 1 day ago

srosenberg commented 1 day ago

What version of rules_go are you using?

upstream release-0.46 (https://github.com/cockroachdb/rules_go/commits/release-0.46/) plus a few patches

What version of gazelle are you using?

v0.33.0-0-g061cc37

What version of Bazel are you using?

7.2.1

Does this issue reproduce with the latest releases of all the above?

Yes.

What operating system and processor architecture are you using?

Linux x86_64

Any other potentially useful information about your toolchain?

What did you do?

Attempted to build a Go package (from CockroachDB) with coverage instrumentation,

 bazel build //pkg/build:build --config=ci -c opt --collect_code_coverage

The build failed due to nogo,

compilepkg: nogo: errors found by nogo during build-time code analysis:
/tmp/rules_go_work-2407553143/cgo/github.com/cockroachdb/cockroach/pkg/build/info.pb_1.go:276:17: unnecessary conversion (unconvert)

Removing --collect_code_coverage from the above yields a successful build,

bazel build  //pkg/build:build --config=ci -c opt
...
INFO: Build completed successfully, 64 total actions

What did you expect to see?

I expected the build to succeed when using --collect_code_coverage.

What did you see instead?

Instead, the build failed for a rather obscure reason (see below).

srosenberg commented 1 day ago

As the title suggests, cgo2 is invoked twice when using coverage instrumentation [1]. The side-effect is gatherSrcs [2] may end up copying or symlinking the same source file twice. In our case, the source file info.pb.go was symlinked twice. The first symlink corresponds to the first invocation of cgo2; this goSrc is uninstrumented because it's generated (from protobuf). The second symlink corresponds to the second invocation of cgo2, where goSrcsNogo has the original source file. Thus, we end up with the following content in the workDir,

lrwxrwxrwx 1 srosenberg srosenberg   250 Sep 19 03:05 info.pb.go -> /home/srosenberg/.cache/bazel/_bazel_srosenberg/3f1e38fca2391d213d04e1064a5fee62/sandbox/linux-sandbox/4958/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-opt/bin/pkg/build/build_go_proto_/github.com/cockroachdb/cockroach/pkg/build/info.pb.go
lrwxrwxrwx 1 srosenberg srosenberg   250 Sep 19 03:05 info.pb_1.go -> /home/srosenberg/.cache/bazel/_bazel_srosenberg/3f1e38fca2391d213d04e1064a5fee62/sandbox/linux-sandbox/4958/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-opt/bin/pkg/build/build_go_proto_/github.com/cockroachdb/cockroach/pkg/build/info.pb.go

Luckily, the above still compiles since the second call to cgo2 ignores the resulting goSrcs; i.e., it's still referring to info.pb.go. However, nogo fails because the file exclusion filter doesn't match the suffix _[0-9]+.go. In our case, the workaround is trivial (update exclusion filter). However, this issue led to a surprisingly long debugging session; the fact that the nogo error message was seemingly referring to a non-existent file (info.pb_1.go), since workDir is nuked by default, was a big part of it.

[1] https://github.com/bazelbuild/rules_go/blob/8b8294b6359a9c0a89af968fb7092f6e7194f420/go/tools/builders/compilepkg.go#L312 [2] https://github.com/bazelbuild/rules_go/blob/8b8294b6359a9c0a89af968fb7092f6e7194f420/go/tools/builders/cgo2.go#L367-L373

fmeum commented 1 day ago

There have been quite a lot of changes to nogo since 0.46.0. Could you check whether 0.50.1 fixes this? If not, a PR would be very welcome.