Open dzbarsky opened 3 months ago
While you are down this path, I would like to note that we were passing a lot of duplicated information through our providers. As a result, a query using --output=starlark --starlark:expr='providers(target)...'
might just straight up crash as Bazel tries to render the massive provider to string.
So a potential optimization would be to reduce the duplicated information between different providers we are returning for each rule.
Also, it would be nice to know how you benchmark these improvements and what improvement metrics you are seeing on your end.
I would love to rerun that using BuildBuddy repo with/without RBE.
The providers also have lots of depsets of structs where structs of depsets would be much more memory-efficient. While technically public API, I would say change whatever shows clear benefits.
Agree.
Here is an example of running a simple cquery on our repo that causes Bazel to crash
# buildbuddy.git
> bazel cquery --config=remote --output=starlark //server/util/db:db_test --starlark:expr='providers(target)'
FATAL: bazel ran out of memory and crashed. Printing stack trace:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.base/java.util.Arrays.copyOf(Unknown Source)
at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
at java.base/java.lang.AbstractStringBuilder.append(Unknown Source)
at java.base/java.lang.StringBuilder.append(Unknown Source)
at net.starlark.java.eval.Printer.append(Printer.java:53)
at net.starlark.java.eval.RegularTuple.repr(RegularTuple.java:115)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at net.starlark.java.eval.Printer.printList(Printer.java:101)
at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at net.starlark.java.eval.Printer.printList(Printer.java:101)
at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at net.starlark.java.eval.Printer.printList(Printer.java:101)
at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at net.starlark.java.eval.Printer.printList(Printer.java:101)
at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
at net.starlark.java.eval.Printer.repr(Printer.java:195)
at net.starlark.java.eval.Printer.printList(Printer.java:101)
If we run .keys()
on the provider, it's only
["LicenseInfo", "@@io_bazel_rules_go//go/private:providers.bzl%GoArchive", "InstrumentedFilesInfo", "RunEnvironmentInfo", "FileProvider", "FilesToRunProvider", "OutputGroupInfo"]
Some stuff like <GoArchive>.direct
could be massive and recursive, which I hope using a depset and flattening them out would help a bit.
@fmeum @sluongng I am playing with bazel shutdown; bazel build --nobuild //... --starlark_cpu_profile=/tmp/profile2.pb.gz
on buildbuddy OS repo; it seems to be a decent testbed. I also tried TIDB but couldn't get it to build, and I validate on cockroach but they have a patched rules_go so it's a bit harder to apply my overrides.
I did notice the provider are a mess, I will poke at it a bit and see if we can clean things up there. I can repro the cquery
crash
https://github.com/bazelbuild/rules_go/pull/4023, https://github.com/bazelbuild/rules_go/pull/4022/files
@fmeum @sluongng So I did some investigation, and I think it's currently hard to make big improvements to the ArchiveData setup. The problem is _recompile_external_deps
; it potentially walks over the entire transitive closure for each test and recompiles a subset of the dependencies. This requires each GoArchive to hold on to a lot of extra data, since it may be needed to create the shadow GoArchive.
The doc comment summarizes it well:
the library under test must not be linked into the test
binary, since the internal test archive embeds the same sources.
Libraries imported by the external test that transitively import the
library under test must be recompiled too, or the linker will complain that
export data they were compiled with doesn't match the export data they
are linked with.
This function identifies which archives may need to be recompiled, then
declares new output files and actions to recompile them. This is an
unfortunately an expensive process requiring O(V+E) time and space in the
size of the test's dependency graph for each test.
I'm guessing breaking this functionality is not a great option (though I am pretty sure this is what we did at Dropbox; we just didn't support external tests in the same package).
Failing that, it seems like we may be able to do this in a more Bazel-friendly way with an aspect; we may want to apply it conditionally only when there are external tests. May need some Gazelle support. What do you guys think?
hmm, I wonder if there is a way for us to detect this early on, perhaps during the first GoCompilePkg of the internal test, to just produce 2 sets of outputs: original archive and re-compiled archive. If the re-compilation is not needed, then just create empty files.
Then let the external test depend on both of those output groups and determine which one to use at runtime of GoCompilePkgExternal
.
So we push more work to the action itself and let analysis be lighter weight.
For context, you could see how this logic is used via
> bazel aquery tests/core/go_test:indirect_import_test
And comparing the 2 actions, they are fairly similar:
@@ -1,12 +1,11 @@
-action 'GoCompilePkg tests/core/go_test/indirect_import_test.internal.a'
+action 'GoCompilePkg tests/core/go_test/indirect_import_test.internal.recompileinternal.a'
Mnemonic: GoCompilePkg
Target: //tests/core/go_test:indirect_import_test
Configuration: darwin_arm64-fastbuild
Execution platform: @internal_platforms_do_not_use//host:host
- ActionKey: 403aabc176ca6f0cb305a3eb00a68da2c7ed1999879ead44ca60b3d98740a562
+ ActionKey: bd10d9739be121241ad208d405c8645e1665eff3da8213e723c4771123046265
Inputs: [
bazel-out/darwin_arm64-fastbuild/bin/stdlib_/pkg,
- bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_dep.x,
bazel-out/darwin_arm64-opt-exec-ST-a828a81199fe/bin/external/go_sdk/builder_reset/builder,
bazel-out/darwin_arm64-opt-exec-ST-a828a81199fe/bin/external/go_sdk/packages.txt,
external/go_sdk/bin/gofmt,
@@ -36,7 +35,7 @@
tests/core/go_test/indirect_import_lib.go,
tests/core/go_test/indirect_import_x_test.go,
]
- Outputs: [bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.a, bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.x]
+ Outputs: [bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.recompileinternal.a, bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.recompileinternal.x]
Environment: [CGO_ENABLED=1, GOARCH=arm64, GOEXPERIMENT=nocoverageredesign, GOOS=darwin, GOPATH=, GOROOT_FINAL=GOROOT, GOTOOLCHAIN=local]
Command Line: (exec bazel-out/darwin_arm64-opt-exec-ST-a828a81199fe/bin/external/go_sdk/builder_reset/builder \
compilepkg \
@@ -58,8 +57,8 @@
bazel-out/darwin_arm64-fastbuild/bin \
-embedlookupdir \
tests/core/go_test \
- -arc \
- 'github.com/bazelbuild/rules_go/tests/core/go_test/indirect_import_dep=github.com/bazelbuild/rules_go/tests/core/go_test/indirect_import_dep=bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_dep.x' \
+ -recompile_internal_deps \
+ github.com/bazelbuild/rules_go/tests/core/go_test/indirect_import_dep \
-importpath \
github.com/bazelbuild/rules_go/tests/core/go_test/indirect_import \
-p \
@@ -67,9 +66,9 @@
-package_list \
bazel-out/darwin_arm64-opt-exec-ST-a828a81199fe/bin/external/go_sdk/packages.txt \
-lo \
- bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.a \
+ bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.recompileinternal.a \
-o \
- bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.x \
+ bazel-out/darwin_arm64-fastbuild/bin/tests/core/go_test/indirect_import_test.internal.recompileinternal.x \
-testfilter \
exclude \
-gcflags \
(edit: new diff with inputs split into multi-lines and sorted)
So if we ever go down this road:
Move the recompile logic from starlark into compilepkg.go
Make GoCompilePkgInternal
to always output .a, .x
in one output group and .recompileinternal.a, .recompileinternal.x
in another output group. If recompile is not needed, just create empty files with touch ...recompileinternal.{a,x}
.
Most downstream actions depends on the first output group. GoCompilePkgExternal
to depends on both output groups.
GoCompilePkgExternal
check if the .recompileinternal.{a,x}
files are not empty before using them.
This could be quite a lot of work though 🤔 Not sure if the benefit is worth it just yet. I have not heard of Bazel + rules_go repo that is heavily impacted by a bloated analysis cache.
@sluongng how would your proposal cover the following scenario:
I believe what happens today is when compiling foo_test.internal, we create a baz.recompile
, a bar.recompile
, and foo_test
that depends on them. Are you suggesting that the action that emits foo_test
should do the recompiles as well? perhaps I'm not understanding how this setup works 😄
Are you suggesting that the action that emits foo_test should do the recompiles as well?
I did not notice that the transitive deps also need to be recompiled. That made things much more complicated.
Inspecting the query result a bit closer, we have something like this
which, could be summarized as
+--------------------------+
| |
| V
lib.go --> dep.go-->+--> i_test.go
|
|
+--> x_test.go
# Compile lib
GoCompilePkg(lib.go) -> lib.{a,x}
# Compile dep
GoCompilePkg(lib.x + dep.go) -> dep.{a,x}
# Compile internal test
GoCompilePkg(lib.go + i_test.go + x_test.go) -> test.internal.{a,x}
# Recompile internal test
GoCompilePkg(lib.go + i_test.go + x_test.go) -> recompileinternal.{a,x}
# Recompile dep
GoCompilePkg(recompileinternal.x + dep.go) -> dep.recompile2.{a,x}
# Compile external test
GoCompilePkgExternal(dep.recompile2.x + recompileinternal.x + i_test.go + lib.go + x_test.go) -> test.external.{a,x}
... GoTestGenTest, GoCompilePkg, GoLink, TestRunner, etc...
So yeah, if we want to trip this, it would have to be some aspect-like setup to trigger the recompiliation of transitive deps.
What's the use case for transitive (rather than direct deps only) recompiles? Does Gazelle ever generate target structures in which this is needed? If the answer is no, we could think of deprecating this feature.
I wonder how go build
works for cases like this.
Perhaps the logic has been updated since rules_go added this logic.
So I recreated :indirect_import
in a separate test directory
test-go> fd
dep/
dep/dep.go
go.mod
go.sum
i_test.go
lib.go
x_test.go
Then do a build like this
> go clean -cache
> go clean -modcache
> GOMAXPROCS=1 go test -a -x ./... 2>&1 | tee -a temp.txt
and by analyzing the build log, we get
~/work/misc/test-go> rg 'compile.*github.com/foo' temp.txt
1148:/opt/homebrew/Cellar/go/1.22.6/libexec/pkg/tool/darwin_arm64/compile -o $WORK/b059/_pkg_.a -trimpath "$WORK/b059=>" -p github.com/foo/lib -lang=go1.22 -complete -buildid FzP5PESwWtB3fkksUjBJ/FzP5PESwWtB3fkksUjBJ -goversion go1.22.6 -shared -nolocalimports -importcfg $WORK/b059/importcfg -pack ./lib.go
1157:/opt/homebrew/Cellar/go/1.22.6/libexec/pkg/tool/darwin_arm64/compile -o $WORK/b001/_pkg_.a -trimpath "$WORK/b001=>" -p github.com/foo/lib -lang=go1.22 -complete -buildid 17icfmWAlM18h8zvQ6NY/17icfmWAlM18h8zvQ6NY -goversion go1.22.6 -shared -nolocalimports -importcfg $WORK/b001/importcfg -pack ./lib.go ./i_test.go
1165:/opt/homebrew/Cellar/go/1.22.6/libexec/pkg/tool/darwin_arm64/compile -o $WORK/b093/_pkg_.a -trimpath "$WORK/b093=>" -p github.com/foo/lib/dep -lang=go1.22 -complete -buildid qtinFOy_iT9gdmoMQdlc/qtinFOy_iT9gdmoMQdlc -goversion go1.22.6 -shared -nolocalimports -importcfg $WORK/b093/importcfg -pack ./dep/dep.go
1174:/opt/homebrew/Cellar/go/1.22.6/libexec/pkg/tool/darwin_arm64/compile -o $WORK/b092/_pkg_.a -trimpath "$WORK/b092=>" -p github.com/foo/lib_test -lang=go1.22 -complete -buildid rRb_jlTN22n2Y0ur-Rn3/rRb_jlTN22n2Y0ur-Rn3 -goversion go1.22.6 -shared -nolocalimports -importcfg $WORK/b092/importcfg -pack ./x_test.go
5439:/opt/homebrew/Cellar/go/1.22.6/libexec/pkg/tool/darwin_arm64/compile -o $WORK/b094/_pkg_.a -trimpath "$WORK/b094=>" -p github.com/foo/lib/dep -lang=go1.22 -complete -buildid hT7NsVn5ELQGQJaIyxkg/hT7NsVn5ELQGQJaIyxkg -goversion go1.22.6 -shared -nolocalimports -importcfg $WORK/b094/importcfg -pack ./dep/dep.go
So interestingly:
package lib
was compiled once with lib.go
for the main library in b059
, then the archive is used to compile the package again for internal test with lib.go + i_test.go
.
package lib/dep
was compiled twice in b093
and b094
. However only b093/_pkg_.a
was used to link the binary. b094/_pkg_.a
was not used at all.
Here is the full log https://gist.github.com/sluongng/e2d1cc724ad441c8664bc784e4f739ac
@sluongng Thanks for posting the log, sorry about the delay, was finishing some other things and then was sick.
I'm still wading through the log file, there's a ton in there. Do you have a sense of what we need to fix yet? (Or how the normal go tooling is able to avoid the transitive recopmilation issue?)
discussed with @dzbarsky a bit on Slack about this:
go test ./...
. The issue https://github.com/bazelbuild/rules_go/issues/1877 provides really good historical context for thisJay Conrod:
The go command recompiles packages that import the library under test to import the internal test package instead. It also recompiles everything that imports recompiled packages, and so on. That's the O(n*t) behavior.
It's harder to do this in rules_go, since individual rules don't have access to the full configured target graph. We'd have to either carry additional information in providers (whether we use it or not), or we'd have to use an aspect (same, but more automatic). Either way, it's pretty expensive.
We can somewhat try to "dodge" the problem at the Gazelle level: detect external / internal test and split it into 2 separate go_test targets. However, as :indirect_import
pointed out, under the current behavior, the internal test could influence the outcome of the external test when they are co-located (and vice-versa). So splitting them into 2 separate binaries might lose out on that effect and could potentially be a disruptive migration for downstream users.
We could potentially use aspects instead of stacked providers, in the hope that the aspects will be lazily collected only when they are needed. It's unclear to me whether this solution is feasible. Even if it's feasible, it's unclear if the usage of aspect will come with other kinds of overhead and negate the benefit from the current provider approach.
What version of rules_go are you using?
Tip
What version of gazelle are you using?
n/a
What version of Bazel are you using?
7.3 rc1
Does this issue reproduce with the latest releases of all the above?
yes
What operating system and processor architecture are you using?
darwin arm64
What did you do?
bazel build --nobuild --starlark_cpu_profile
What did you expect to see?
A relatively clean profile
What did you see instead?
Opportunities to improve things