cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

build: Bazel rebuilding unchanged targets often #76851

Open dt opened 2 years ago

dt commented 2 years ago

Over the past few weeks of using Bazel exclusively, I've noticed it regularly rebuilding things that it doesn't seem like it should rebuild. For example, I've noticed it rebuilding protobuf C and Java files, or external Go dependencies like bcrypt that haven't changed. Most recently, this morning I saw it rebuild jemalloc, which hasn't been updated since 2020.

At first I thought this was dev messing with PATH or something else that busted cache fingerprints, but for the past two weeks I've switched to using bazel directly and controlling exactly which flags I pass to it, to eliminate that variable. Then I thought it might be the Bazel 4 vs. 5 upgrade as I switch branches, but I've now seen it recur even when all my active branches are rebased past the Bazel 5 upgrade. Even then, I'd have expected the "remote" cache to eventually hold both the Bazel 4 and Bazel 5 fingerprints, yet I'm still regularly seeing rebuilds of e.g. protoc's C files.

I haven't set up an isolated reproduction yet; this is just based on happening to notice a target name as it flashes by while working on other things. This might be a red herring, but it feels like I notice it most when switching from build //pkg/cmd/cockroach-short to run //pkg/gen and back.

Epic CRDB-17171 Jira issue: CRDB-13302

dt commented 2 years ago

Here are some examples. The rebase was run with --exec 'bazel run :gazelle', IIRC.

[Screenshots: Screen Shot 2022-02-18 at 9.13.29 AM; Screen Shot 2022-02-20 at 6.11.41 PM; Screen Shot 2022-02-21 at 8.10.52 AM]

petermattis commented 2 years ago

You can reproduce a surprising rebuild of libjemalloc here by doing a bazel clean before running bazel build //pkg/cmd/cockroach-short and then bazel run //pkg/gen. (The remote caching makes this fairly fast to experiment with as bazel clean only cleans the local bazel state).

I think the rebuilds here are due to the distinct host vs request configurations. See https://docs.bazel.build/versions/main/guide.html#--distinct_host_configurationfalse which talks about this. When running //pkg/gen we're building tools using the host configuration, but when building //pkg/cmd/cockroach-short we're using the request configuration. By default, the host compilation mode is opt while the request compilation mode is fastbuild. I would have thought that --distinct_host_configuration=false would help here, but it doesn't, or I'm doing something wrong.
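To make the host vs. request point concrete, here is a toy model of a cache key. It is only a sketch: `action_key` and its inputs are hypothetical and are not Bazel's actual fingerprinting scheme. The one assumption it illustrates is that the compilation mode is part of what gets hashed, so the same target built under opt and under fastbuild lands in two different cache entries.

```python
import hashlib
import json


def action_key(target, compilation_mode, env):
    """Toy cache key: a hash over everything assumed to affect the output.

    Hypothetical helper for illustration; not Bazel's real algorithm.
    """
    payload = json.dumps(
        {"target": target, "compilation_mode": compilation_mode, "env": env},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


env = {"PATH": "/usr/bin:/bin"}

# Built as a dependency of the requested binary (fastbuild by default).
request_key = action_key("//c-deps:libjemalloc", "fastbuild", env)

# Built as a dependency of a tool in the host configuration (opt by default).
host_key = action_key("//c-deps:libjemalloc", "opt", env)

# The keys differ, so neither build can reuse the other's cached output.
print(request_key == host_key)
```

Under this model, alternating between `bazel build //pkg/cmd/cockroach-short` and `bazel run //pkg/gen` would need both variants of the library, which is consistent with the rebuilds described above.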

PS It's surprising that //pkg/gen depends on //c-deps:libjemalloc. There seem to be some unfortunate dependencies for gomock targets. For example, //pkg/cmd/roachtest/tests:mocks_drt depends on //pkg/cmd/roachtest/tests:tests which depends on //pkg/cli. Does mockgen need all of the dependencies to be built? I would have thought it only needed any generated code in dependencies to be generated.
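One way to see why such a dependency exists is to look for a path between the two labels, which is what `bazel query 'somepath(//pkg/gen, //c-deps:libjemalloc)'` reports on the real graph. The sketch below does the same search on a hand-written graph. Only the two middle edges come from the comment above; the first and last edges are assumptions added to complete the chain, and `some_path` is a hypothetical helper.

```python
from collections import deque


def some_path(graph, start, goal):
    """Breadth-first search for one dependency path from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for dep in graph.get(path[-1], ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


graph = {
    # Assumed edge: not stated in the thread.
    "//pkg/gen": ["//pkg/cmd/roachtest/tests:mocks_drt"],
    # Edges mentioned in the comment above.
    "//pkg/cmd/roachtest/tests:mocks_drt": ["//pkg/cmd/roachtest/tests:tests"],
    "//pkg/cmd/roachtest/tests:tests": ["//pkg/cli"],
    # Assumed edge: not stated in the thread.
    "//pkg/cli": ["//c-deps:libjemalloc"],
}

path = some_path(graph, "//pkg/gen", "//c-deps:libjemalloc")
print(" -> ".join(path))
```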

ajwerner commented 2 years ago

> PS It's surprising that //pkg/gen depends on //c-deps:libjemalloc. There seem to be some unfortunate dependencies for gomock targets. For example, //pkg/cmd/roachtest/tests:mocks_drt depends on //pkg/cmd/roachtest/tests:tests which depends on //pkg/cli. Does mockgen need all of the dependencies to be built? I would have thought it only needed any generated code in dependencies to be generated.

I did something about this in https://github.com/cockroachdb/cockroach/pull/77159

stevekuznetsov commented 2 years ago

Seems like this has been a problem with Bazel forever: https://github.com/bazelbuild/bazel/issues/7095

rickystewart commented 2 years ago

With regard to protoc in particular we might be able to switch over to using a prebuilt protoc for Linux, Mac, and Windows.

smolkaj commented 7 months ago

Did you figure out the problem? I'm running into a similar issue.

rickystewart commented 7 months ago

@smolkaj I seem to remember that one problem @dt was having is he was using GOPACKAGESDRIVER, spawning bazel invocations from an IDE, and that was running the build with a completely different set of environment variables and everything.

Are you regularly switching "environments" (shells, whatever)? If so, it's a good idea to make sure those environments are as similar as possible. For example, differences in environment variables like PATH can invalidate cache fingerprints.
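A quick way to check how similar two environments are is to capture the output of `env` in each one and compare. The sketch below assumes simple one-line `KEY=VALUE` entries (multi-line values are not handled), and the sample values are made up. `parse_env` and `diff_env` are hypothetical helpers.

```python
def parse_env(text):
    """Parse `env` output (one KEY=VALUE per line) into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '='
            result[key] = value
    return result


def diff_env(a, b):
    """Return {name: (value_in_a, value_in_b)} for variables that differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Made-up snapshots, e.g. one from a terminal and one from an IDE.
terminal = parse_env("PATH=/usr/local/bin:/usr/bin\nHOME=/home/dev\n")
ide = parse_env("PATH=/usr/bin\nHOME=/home/dev\nGOFLAGS=-mod=mod\n")

differences = diff_env(terminal, ide)
for name, (left, right) in differences.items():
    print(f"{name}: {left!r} != {right!r}")
```

Any variable reported here is a candidate for what is changing the fingerprints. Bazel's `--incompatible_strict_action_env` flag may also be worth a look, since it is meant to give actions a fixed PATH rather than the client's.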

smolkaj commented 7 months ago

Thanks, @rickystewart!

> Are you regularly switching "environments" (shells, whatever)?

Shouldn't be. I'm having this problem in a continuous integration script that uses GitHub Actions and caches ~/.cache/bazel.