Open davido opened 11 months ago
Could you test with --noreuse_sandbox_directories? That's my best guess without a bisect.
Unfortunately, with this option the error is still present. I also downgraded to 7.0.0rc2 (from 7.0.0rc3), and it's still failing.
I also added the "-s" option, which produced this verbose output:
https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-chrome-latest/40258/console
[...]
# Configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042
# Execution platform: @local_config_platform//:host
SUBCOMMAND: # //java/com/google/gerrit/git/testing:testing [action 'Building java/com/google/gerrit/git/testing/libtesting.jar (3 source files)', configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042, execution platform: @local_config_platform//:host, mnemonic: Javac]
(cd /home/jenkins/.cache/bazel/_bazel_jenkins/67bba20af71044f1eb598ecb44098f26/execroot/gerrit && \
exec env - \
LC_CTYPE=en_US.UTF-8 \
PATH=/home/jenkins/.cache/bazelisk/downloads/bazelbuild/bazel-7.0.0rc2-linux-x86_64/bin:/usr/lib/jvm/java-11-openjdk-amd64/bin:/usr/lib/jvm/java-11-openjdk-amd64/jre/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
external/remotejdk21_linux/bin/java '--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.model=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.processing=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.resources=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED' '--add-opens=java.base/java.nio=ALL-UNNAMED' '--add-opens=java.base/java.lang=ALL-UNNAMED' '-Dsun.io.useCanonCaches=false' -XX:-CompactStrings -Xlog:disable '-Xlog:all=warning:stderr:uptime,level,tags' -jar external/remote_java_tools/java_tools/JavaBuilder_deploy.jar @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/git/testing/libtesting.jar-0.params @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/git/testing/libtesting.jar-1.params)
# Configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042
# Execution platform: @local_config_platform//:host
SUBCOMMAND: # //java/com/google/gerrit/jgit:jgit [action 'Building java/com/google/gerrit/jgit/libjgit.jar (1 source file)', configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042, execution platform: @local_config_platform//:host, mnemonic: Javac]
(cd /home/jenkins/.cache/bazel/_bazel_jenkins/67bba20af71044f1eb598ecb44098f26/execroot/gerrit && \
exec env - \
LC_CTYPE=en_US.UTF-8 \
PATH=/home/jenkins/.cache/bazelisk/downloads/bazelbuild/bazel-7.0.0rc2-linux-x86_64/bin:/usr/lib/jvm/java-11-openjdk-amd64/bin:/usr/lib/jvm/java-11-openjdk-amd64/jre/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
external/remotejdk21_linux/bin/java '--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.model=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.processing=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.resources=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED' '--add-opens=java.base/java.nio=ALL-UNNAMED' '--add-opens=java.base/java.lang=ALL-UNNAMED' '-Dsun.io.useCanonCaches=false' -XX:-CompactStrings -Xlog:disable '-Xlog:all=warning:stderr:uptime,level,tags' -jar external/remote_java_tools/java_tools/JavaBuilder_deploy.jar @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/jgit/libjgit.jar-0.params @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/jgit/libjgit.jar-1.params)
# Configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042
# Execution platform: @local_config_platform//:host
SUBCOMMAND: # //java/com/google/gerrit/acceptance/config:config [action 'Building java/com/google/gerrit/acceptance/config/libconfig.jar (7 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor, AutoOneOfProcessor)', configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042, execution platform: @local_config_platform//:host, mnemonic: Javac]
(cd /home/jenkins/.cache/bazel/_bazel_jenkins/67bba20af71044f1eb598ecb44098f26/execroot/gerrit && \
exec env - \
LC_CTYPE=en_US.UTF-8 \
PATH=/home/jenkins/.cache/bazelisk/downloads/bazelbuild/bazel-7.0.0rc2-linux-x86_64/bin:/usr/lib/jvm/java-11-openjdk-amd64/bin:/usr/lib/jvm/java-11-openjdk-amd64/jre/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
external/remotejdk21_linux/bin/java '--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.model=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.processing=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.resources=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED' '--add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED' '--add-opens=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED' '--add-opens=java.base/java.nio=ALL-UNNAMED' '--add-opens=java.base/java.lang=ALL-UNNAMED' '-Dsun.io.useCanonCaches=false' -XX:-CompactStrings -Xlog:disable '-Xlog:all=warning:stderr:uptime,level,tags' -jar external/remote_java_tools/java_tools/JavaBuilder_deploy.jar @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/acceptance/config/libconfig.jar-0.params @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/acceptance/config/libconfig.jar-1.params)
# Configuration: f5d72005e5d4b70683fdbd12ff2cbfb779fc730d4f37f289f17efea5d0e4d042
# Execution platform: @local_config_platform//:host
ERROR: /home/jenkins/workspace/Gerrit-verifier-chrome-latest/gerrit/java/com/google/gerrit/jgit/BUILD:3:13: Compiling Java headers java/com/google/gerrit/jgit/libjgit-hjar.jar (1 source file) failed: Failed to fetch blobs because they do not exist remotely.: Missing digest: 5cb087fa259562b09dfdb79380f82501849de07f77ea3eb52941303af7532e7e/138756716 for bazel-out/k8-fastbuild/bin/external/rules_java_builtin/toolchains/platformclasspath.jar
ERROR: /home/jenkins/.cache/bazel/_bazel_jenkins/67bba20af71044f1eb598ecb44098f26/external/jgit/org.eclipse.jgit.http.server/BUILD:5:13: Building external/jgit/org.eclipse.jgit.http.server/libjgit-servlet-class.jar (35 source files) failed: Failed to fetch blobs because they do not exist remotely.: Missing digest: 5cb087fa259562b09dfdb79380f82501849de07f77ea3eb52941303af7532e7e/138756716 for bazel-out/k8-fastbuild/bin/external/rules_java_builtin/toolchains/platformclasspath.jar
Target //tools/maven:gen_api_install failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 8.404s, Critical Path: 7.11s
INFO: 24 processes: 23 internal, 1 linux-sandbox.
ERROR: Build did NOT complete successfully
bazelisk failed to build gen_api_install. Use VERBOSE=1 for more info
Build step 'Execute shell' marked build as failure
Finished: FAILURE
@tjgq Do you have an idea?
@bazel-io flag
@davido Do I understand it correctly that you're building with a disk cache, but not with a remote cache? Is this build clean or incremental? Do you have any sort of process that removes entries from the disk cache between builds?
@bazel-io fork 7.0.0
From the CI log, it seems like you are using a remote cache and these errors were caused by remote cache eviction. Can you check whether adding the flag --experimental_remote_cache_eviction_retries=5 resolves the issue?
@coeuvre how can this happen just with the local disk cache? Race between multiple workers?
I think they are using remote cache. The flag was passed with env:
[EnvInject] - Injecting as environment variables the properties content
BAZEL_OPTS=--remote_cache=https://gerrit-ci.gerritforge.com/cache
Also, xxx remote cache hit indicates remote cache. For disk cache it would be xxx disk cache hit.
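As a rough illustration of the distinction above, the following sketch classifies a Bazel process-summary line by the kind of cache hit it mentions. The function name cache_kinds and the sample summary lines are made up for illustration; only the phrases "remote cache hit" and "disk cache hit" come from this thread.

```shell
# Hypothetical helper: report which cache kinds a summary line mentions.
cache_kinds() {
  kinds=""
  case "$1" in *"remote cache hit"*) kinds="remote" ;; esac
  case "$1" in *"disk cache hit"*) kinds="${kinds:+$kinds,}disk" ;; esac
  echo "${kinds:-none}"
}

# Sample lines (assumed shape, not copied from a real log).
cache_kinds "INFO: 24 processes: 10 remote cache hit, 14 internal."  # prints: remote
cache_kinds "INFO: 24 processes: 10 disk cache hit, 14 internal."    # prints: disk
cache_kinds "INFO: 24 processes: 23 internal, 1 linux-sandbox."      # prints: none
```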
First of all, we are using a combination of RBE and local builds.
Some stuff we can only test locally. The failing part is built locally on GCP machines.
We have both options, disk cache and remote cache; see, e.g.:
BAZEL_OPTS=--remote_cache=https://gerrit-ci.gerritforge.com/cache
However, we have this hidden logic on the CI side to take the remote cache out of the picture if the .bazelversion file was changed:
if git show --diff-filter=AM --name-only --pretty="" HEAD | grep -q .bazelversion
then
  export BAZEL_OPTS=""
fi
This is the part of the CI that was failing:
bazelisk build $BAZEL_OPTS plugins:core release api
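The CI conditional above can be sketched as a small function that is easy to check in isolation. The helper name bazel_opts_for_change is hypothetical, and this variant deliberately matches the file name exactly instead of using the substring regex from the CI script; it prints the resulting options rather than running a build.

```shell
# Hypothetical helper.
#   $1: newline-separated list of changed file paths
#   $2: remote cache URL
# Prints the options to pass to Bazel.
bazel_opts_for_change() {
  if printf '%s\n' "$1" | grep -qxF '.bazelversion'; then
    # The Bazel version changed: leave the remote cache out of the picture.
    echo ""
  else
    echo "--remote_cache=$2"
  fi
}

# In CI the list of changed files would come from:
#   git show --diff-filter=AM --name-only --pretty="" HEAD
changed_files="$(printf '%s\n' '.bazelversion' 'WORKSPACE')"
BAZEL_OPTS="$(bazel_opts_for_change "$changed_files" "https://gerrit-ci.gerritforge.com/cache")"
echo "BAZEL_OPTS='${BAZEL_OPTS}'"
```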
@lucamilanesio Are you aware of any cache evictions on the remote cache side recently?
So, to verify that the remote cache contributes to the problem, I upgraded (again) the Bazel version from 7.0.0rc2 to 7.0.0rc3 and uploaded a new patch set (22). As explained in my previous comment, this skips remote cache usage, and the verification was successful: [1].
I'm going to remove the changes in .bazelversion and add the option --experimental_remote_cache_eviction_retries=5, as suggested by @coeuvre.
[1] https://gerrit-review.googlesource.com/c/gerrit/+/387837/22
@coeuvre, adding the --experimental_remote_cache_eviction_retries option fixed the build.
@davido Can you confirm whether entries can spuriously disappear from your disk and/or remote cache in between builds? If they can, then you must use --experimental_remote_cache_eviction_retries, possibly in conjunction with --experimental_remote_cache_lease_extension. Otherwise, there might be a bug in Bazel.
@lucamilanesio Can you help answer @tjgq's question?
Since it's still unclear if this is a Bazel bug, I'll remove this bug as a release blocker for 7.0. Closing https://github.com/bazelbuild/bazel/issues/20175.
@meteorcloudy Agreed. Let's close this then as not an issue.
@davido Can you confirm whether entries can spuriously disappear from your disk and/or remote cache in between builds?
They cannot disappear from the local disk; however, once a day, during the remote cache cleanups, they can be removed remotely. The step that is failing, though, did not use any remote cache: how is it possible that Bazel would assume that the cache is remote if there isn't a remote cache configured?
It looks like the local cache "remembers" that it was fed by a remote cache, because the previous step actually used a remote cache for the initial build.
If they can, then you must use --experimental_remote_cache_eviction_retries, possibly in conjunction with --experimental_remote_cache_lease_extension. Otherwise, there might be a bug in Bazel.
Well, but that isn't the case, as mentioned above.
If I add the remote cache URL to the .bazelrc to make sure that it is always used in all invocations, the problem disappears. Has something changed in the remote cache management between Bazel 7.0.0-rc2 and 7.0.0-rc3?
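One way to make every invocation pick up the remote cache, as described above, is a "build --remote_cache=..." line in a bazelrc file. The sketch below appends such a line only if it is not already present. The helper name add_remote_cache_to_rc is hypothetical, and the demonstration writes to a temporary file rather than to a real .bazelrc.

```shell
# Hypothetical helper.
#   $1: path of the rc file to update
#   $2: remote cache URL
add_remote_cache_to_rc() {
  rc_file="$1"
  line="build --remote_cache=$2"
  touch "$rc_file"
  # Append the line only when an identical line is not there yet.
  grep -qxF -- "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Demonstration against a temporary file (assumption: a real setup
# would target the workspace or user .bazelrc instead).
rc="$(mktemp)"
add_remote_cache_to_rc "$rc" "https://gerrit-ci.gerritforge.com/cache"
add_remote_cache_to_rc "$rc" "https://gerrit-ci.gerritforge.com/cache"  # second call adds nothing
cat "$rc"
rm -f "$rc"
```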
Reopening the issue, as we are seeing this on Gerrit CI again and this downstream issue with priority 0 was filed: 1.
Excerpt from downstream issue:
The build steps that are executed for the validation are:
#0
export BAZEL_OPTS=--remote_cache=https://gerrit-ci.gerritforge.com/cache
#1
bazelisk build $BAZEL_OPTS plugins:core release api
#2
tools/maven/api.sh install
#3
tools/eclipse/project.py --bazel bazelisk
Only the first build command above uses the remote cache. The subsequent commands don't use it, and they started to fail consistently on Gerrit CI after the bump of the Bazel version from 7.0.0-rc2 to 7.0.0-rc3.
The second command, tools/maven/api.sh, is here: 2, and is actually running this build command (without remote cache usage):
bazelisk build //tools/maven:gen_api_install
Which is failing with this error now:
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 892c651b04360ae932e9843f7d2233e4476e5f60dd835a865fb49bf7a48f6e66/230925 for bazel-out/k8-fastbuild/bin/external/sshd-sftp/jar/_ijar/jar/sshd-sftp/jar/sshd-sftp-2.10.0-ijar.jar
Target //tools/maven:gen_api_install failed to build
@coeuvre @tjgq @meteorcloudy @fmeum Any clue what is going on here and how can we further track it down?
In fact, passing --experimental_remote_cache_eviction_retries=5 helps, but that is the wrong thing to do as a workaround for a build command that shouldn't use the remote cache in the first place, isn't it?
Also note that if we pass the remote cache option to all three build commands above, they all succeed.
In both cases (with and without remote cache), we are using the repository cache and the disk cache, as configured in the .bazelrc:
--repository_cache=~/.gerritcodereview/bazel-cache/repository --disk_cache=~/.gerritcodereview/bazel-cache/cas
^^^ Can it be somehow related?
I can reproduce the issue locally now. As assumed, the problem is related to the disk cache.
Here are the steps:
$ docker pull buchgr/bazel-remote-cache
$ docker run -u 1000:1000 -v /path/to/cache/dir:/data \
-p 9090:8080 -p 9092:9092 buchgr/bazel-remote-cache \
--max_size 5
$ bazelisk build --remote_cache=http://server:9090 plugins:core release api
The disk cache configured in the gerrit/.bazelrc file is located in ~/.gerritcodereview/bazel-cache/cas
$ rm -rf ~/.gerritcodereview/bazel-cache/cas/
davido@localhost:~/projects/gerrit (master %>)$ tools/eclipse/project.py --bazel bazelisk
INFO: Invocation ID: 6084a97c-1b8d-4850-bcb1-f37c2f84fa37
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'info' from /home/davido/projects/gerrit/.bazelrc:
Inherited 'common' options: --noenable_bzlmod
INFO: Reading rc options for 'info' from /home/davido/projects/gerrit/.bazelrc:
Inherited 'build' options: --workspace_status_command=python3 ./tools/workspace_status.py --repository_cache=~/.gerritcodereview/bazel-cache/repository --action_env=PATH --disk_cache=~/.gerritcodereview/bazel-cache/cas --java_language_version=17 --java_runtime_version=remotejdk_17 --tool_java_language_version=17 --tool_java_runtime_version=remotejdk_17 --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --incompatible_strict_action_env --announce_rc
INFO: Invocation ID: 3e774f39-267e-4841-8a37-b1e2890edb39
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=147
INFO: Reading rc options for 'build' from /home/davido/projects/gerrit/.bazelrc:
Inherited 'common' options: --noenable_bzlmod
INFO: Reading rc options for 'build' from /home/davido/projects/gerrit/.bazelrc:
'build' options: --workspace_status_command=python3 ./tools/workspace_status.py --repository_cache=~/.gerritcodereview/bazel-cache/repository --action_env=PATH --disk_cache=~/.gerritcodereview/bazel-cache/cas --java_language_version=17 --java_runtime_version=remotejdk_17 --tool_java_language_version=17 --tool_java_runtime_version=remotejdk_17 --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --incompatible_strict_action_env --announce_rc
INFO: Analyzed target //tools/eclipse:main_classpath_collect (10 packages loaded, 182 targets configured).
INFO: Found 1 target...
Target //tools/eclipse:main_classpath_collect up-to-date:
bazel-bin/tools/eclipse/main_classpath_collect.runtime_classpath
INFO: Elapsed time: 1.093s, Critical Path: 0.81s
INFO: 2 processes: 2 internal.
INFO: Build completed successfully, 2 total actions
INFO: Invocation ID: 578b1a90-ad9f-478b-98f4-20818be06888
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=147
INFO: Reading rc options for 'build' from /home/davido/projects/gerrit/.bazelrc:
Inherited 'common' options: --noenable_bzlmod
INFO: Reading rc options for 'build' from /home/davido/projects/gerrit/.bazelrc:
'build' options: --workspace_status_command=python3 ./tools/workspace_status.py --repository_cache=~/.gerritcodereview/bazel-cache/repository --action_env=PATH --disk_cache=~/.gerritcodereview/bazel-cache/cas --java_language_version=17 --java_runtime_version=remotejdk_17 --tool_java_language_version=17 --tool_java_runtime_version=remotejdk_17 --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --incompatible_strict_action_env --announce_rc
INFO: Analyzed target //tools/eclipse:autovalue_classpath_collect (0 packages loaded, 7 targets configured).
INFO: Found 1 target...
Target //tools/eclipse:autovalue_classpath_collect up-to-date:
bazel-bin/tools/eclipse/autovalue_classpath_collect.runtime_classpath
INFO: Elapsed time: 1.111s, Critical Path: 0.69s
INFO: 2 processes: 2 internal.
INFO: Build completed successfully, 2 total actions
INFO: Invocation ID: 0c23b3fe-303d-4076-97c6-488fbf009f94
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=147
INFO: Reading rc options for 'build' from /home/davido/projects/gerrit/.bazelrc:
Inherited 'common' options: --noenable_bzlmod
INFO: Reading rc options for 'build' from /home/davido/projects/gerrit/.bazelrc:
'build' options: --workspace_status_command=python3 ./tools/workspace_status.py --repository_cache=~/.gerritcodereview/bazel-cache/repository --action_env=PATH --disk_cache=~/.gerritcodereview/bazel-cache/cas --java_language_version=17 --java_runtime_version=remotejdk_17 --tool_java_language_version=17 --tool_java_runtime_version=remotejdk_17 --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --incompatible_strict_action_env --announce_rc
INFO: Analyzed target //tools/eclipse:classpath (0 packages loaded, 1 target configured).
ERROR: /home/davido/projects/gerrit/proto/testing/BUILD:4:14: Generating proto_library //proto/testing:test_proto failed: Failed to fetch blobs because they do not exist remotely.: 3 errors during bulk transfer:
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 74c97c32ccbc58b7d77ca61e6ec0d576d9f47173b3360c4f31e73a265162cd1f/4388096 for bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/com_google_protobuf/protoc
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 74c97c32ccbc58b7d77ca61e6ec0d576d9f47173b3360c4f31e73a265162cd1f/4388096 for bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/com_google_protobuf/protoc
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 74c97c32ccbc58b7d77ca61e6ec0d576d9f47173b3360c4f31e73a265162cd1f/4388096 for bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/com_google_protobuf/protoc
Target //tools/eclipse:classpath failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.845s, Critical Path: 0.28s
INFO: 9 processes: 4 internal, 2 linux-sandbox, 3 worker.
ERROR: Build did NOT complete successfully
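The same digest is reported several times in the log above. A small filter such as the following sketch can reduce such a log to the distinct missing outputs. The name missing_outputs is hypothetical, and the pattern assumes the "Missing digest: <hash>/<size> for <path>" shape seen in this thread.

```shell
# Hypothetical helper: read a build log on stdin and print one line per
# distinct missing blob, formatted as "<path> <hash> <size>".
missing_outputs() {
  sed -n 's|.*Missing digest: \([0-9a-f]*\)/\([0-9]*\) for \(.*\)$|\3 \1 \2|p' | sort -u
}

# Example using an error line quoted in this thread, repeated twice.
printf '%s\n' \
  'com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 74c97c32ccbc58b7d77ca61e6ec0d576d9f47173b3360c4f31e73a265162cd1f/4388096 for bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/com_google_protobuf/protoc' \
  'com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 74c97c32ccbc58b7d77ca61e6ec0d576d9f47173b3360c4f31e73a265162cd1f/4388096 for bazel-out/k8-opt-exec-ST-13d3ddad9198/bin/external/com_google_protobuf/protoc' \
  | missing_outputs
```

For real logs, the output of a failing build can be piped into the function, for example from a saved console log.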
Good catch @davido, I truly believe that Bazel keeps some local reference on the disk cache that it was populated from a remote source. When you do not specify the remote source anymore in the subsequent commands, Bazel blows up with the error you've shown, which is misleading because it isn't really a network transfer problem at all.
I wrongly assumed that we had issues with our remote cache storage, but that wasn't the case.
Thanks for the repro! I am looking into the issue now.
I understand the issue now. Since 7.0.0, Bazel uses --remote_download_toplevel by default, which means intermediate outputs will not be downloaded during the build.
Looking at the failing builds in the CI, the scenario might be:
The first build (with the remote cache) does not download bazel-out/k8-fastbuild/bin/external/sshd-sftp/jar/_ijar/jar/sshd-sftp/jar/sshd-sftp-2.10.0-ijar.jar due to --remote_download_toplevel. Neither the disk cache nor Bazel's output tree is populated with this file. However, the action result is downloaded and stored in the disk cache.
A later build (without the remote cache) needs the file but cannot fetch it, so it fails with CacheNotFoundException.
For the repro, wiping out the disk cache could also trigger the error for the same reason: Bazel didn't download the outputs during the last build, so when it needs an output now but fails to "download" it from the disk cache, it reports CacheNotFoundException.
Internally, Bazel indeed keeps some references to the disk or remote cache because when building with --remote_download_[toplevel|minimal], Bazel won't download some of the outputs. It only remembers the metadata so that the outputs can be re-downloaded later.
From the CI setup, it seems that you want to populate the disk cache using the remote cache during the first build. If so, I would suggest setting --remote_download_all for the first build. Otherwise, --experimental_remote_cache_eviction_retries is the right flag for this issue.
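The two options suggested above could be wired into a CI script roughly as follows. This is only a sketch: the helper names warm_build_opts and retry_build_opts are hypothetical, the targets are the ones quoted earlier in this thread, and the script prints the command lines instead of running Bazel.

```shell
# URL taken from the CI configuration quoted in this thread.
REMOTE_CACHE_URL="https://gerrit-ci.gerritforge.com/cache"

# Hypothetical helper: options for the first, cache-warming build.
# --remote_download_all asks Bazel to download all outputs, so that
# later builds without the remote cache can find them locally.
warm_build_opts() {
  echo "--remote_cache=$1 --remote_download_all"
}

# Hypothetical helper: the alternative for builds that keep the default
# download mode. The flag sets how many times Bazel retries the build
# after a remote cache eviction error.
retry_build_opts() {
  echo "--experimental_remote_cache_eviction_retries=${1:-5}"
}

# Print the resulting command lines (nothing is executed).
echo "bazelisk build $(warm_build_opts "$REMOTE_CACHE_URL") plugins:core release api"
echo "bazelisk build $(retry_build_opts 5) //tools/maven:gen_api_install"
```

Only one of the two approaches should be needed for a given pipeline; which one fits depends on whether the first build is meant to populate the local caches completely.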
This is more like a documentation issue, not a real bug in Bazel. Downgrading the priority.
Should this be considered a breaking change in Bazel 7 compared to 6? I guess the default behaviour has changed in a non-backward-compatible way. Thanks for the suggestions. I am adding --remote_download_all to the initial build so that all the remote resources needed are downloaded locally.
That doesn't impact our build time because we always start the build with a pre-warmed Docker image that has an initial build completed. I have actually noticed that the image built was very small compared to the previous releases, which means that a lot of data was not stored anymore locally.
I agree to downgrading to a P2.
Should this be considered a breaking change in Bazel 7 compared to 6?
Yes, it's a breaking change. It is highlighted in the release notes: https://blog.bazel.build/2023/12/11/bazel-7-release.html#build-without-the-bytes-bwob, though we probably should have made it clearer that it's a breaking change.
I'm trying to upgrade to Bazel 7.0 in our iOS project. Everything works fine with Bazel 6.3.2.
But when I upgraded to Bazel 7.0, I ran into the same issue. As mentioned above, it seems that this problem occurs when both disk and remote cache are used. But I'm pretty sure I'm not using a disk cache or RBE.
Here is the output:
'build' options: --verbose_failures --announce_rc --apple_platform_type=ios --show_progress_rate_limit=5 --output_filter=^$ --ios_minimum_os=11.0 --macos_minimum_os=12.0 --host_macos_minimum_os=12.0 --use_top_level_targets_for_symlinks --incompatible_strict_action_env --define=apple.compress_ipa=true --experimental_cc_implementation_deps --experimental_guard_against_concurrent_changes --profile=bazel-profile --experimental_objc_include_scanning --experimental_remote_cache_compression --features=oso_prefix_is_pwd --features=layering_check --features=swift.skip_function_bodies_for_derived_files --features=swift.minimal_deps --features=swift.layering_check --features=swift.module_map_no_private_headers --remote_timeout=100s --reuse_sandbox_directories --spawn_strategy=local --genrule_strategy=local
INFO: Reading rc options for 'build' from /Volumes/workspace/grunner/builds/Hwyyfv8c/0/ios/loktar/ci.bazelrc:
'build' options: --objc_enable_binary_stripping --objc_generate_linkmap --strip=always --apple_generate_dsym --remote_local_fallback --local_cpu_resources=HOST_CPUS*.9 --features=swift.use_explicit_swift_module_map --remote_cache=http://my-remote-cache.co/ios
INFO: Found applicable config definition build:strict in file /Volumes/workspace/grunner/builds/Hwyyfv8c/0/ios/loktar/rules.bazelrc: --copt=-Werror
Computing main repo mapping:
Loading:
Loading: 0 packages loaded
Analyzing: target //srcs:app (0 packages loaded, 0 targets configured)
Analyzing: target //srcs:app (0 packages loaded, 0 targets configured)
[0 / 1] [Prepa] BazelWorkspaceStatusAction stable-status.txt
INFO: Analyzed target //srcs:app (0 packages loaded, 0 targets configured).
[9,975 / 27,591] AssetCatalogCompile srcs/app-intermediates/xcassets; 4s local ... (55 actions, 1 running)
[17,013 / 30,781] AssetCatalogCompile srcs/app-intermediates/xcassets; 9s local ... (55 actions, 1 running)
[23,414 / 33,172] AssetCatalogCompile srcs/app-intermediates/xcassets; 14s local ... (49 actions, 1 running)
[25,897 / 33,172] AssetCatalogCompile srcs/app-intermediates/xcassets; 19s local ... (48 actions, 1 running)
[28,437 / 33,172] AssetCatalogCompile srcs/app-intermediates/xcassets; 24s local ... (45 actions, 1 running)
[30,608 / 33,172] AssetCatalogCompile srcs/app-intermediates/xcassets; 29s local ... (44 actions, 1 running)
[32,933 / 33,172] AssetCatalogCompile srcs/app-intermediates/xcassets; 34s local ... (49 actions, 1 running)
ERROR: /Volumes/workspace/grunner/builds/Hwyyfv8c/0/ios/loktar/srcs/BUILD:601:16: SwiftStdlibCopy srcs/app-intermediates/swiftlibs failed: Failed to fetch blobs because they do not exist remotely.: Missing digest: f5f2f1aa89a7d08abd93a7b1a2a21a6621b01a93314b40360c5bd1c44e6e2cb3/271080288 for bazel-out/ios_arm64-opt-ios-arm64-min11.0-applebin_ios-ST-ae93c8b2d27f/bin/srcs/app_bin
ERROR: /Volumes/workspace/grunner/builds/Hwyyfv8c/0/ios/loktar/srcs/BUILD:601:16: SwiftStdlibCopy srcs/app-intermediates/swiftlibs_for_swiftsupport failed: Failed to fetch blobs because they do not exist remotely.: Missing digest: f5f2f1aa89a7d08abd93a7b1a2a21a6621b01a93314b40360c5bd1c44e6e2cb3/271080288 for bazel-out/ios_arm64-opt-ios-arm64-min11.0-applebin_ios-ST-ae93c8b2d27f/bin/srcs/app_bin
Target //srcs:app failed to build
--remote_download_all worked for me, but --experimental_remote_cache_eviction_retries=5 didn't work.
I believe it has something to do with BwoB. But I have no idea why this happened without using disk cache.
Additional notes: I'm using a no-remote tag in my top-level target:
ios_application(
    name = "app",
    ...
    tags = ["no-remote"],
)
Passing --experimental_remote_downloader_local_fallback also helps.
@coeuvre Just ran into this with bazel run -c opt //src/java_tools/buildjar/java/com/google/devtools/build/java/turbine:turbine_benchmark --disk_cache=some/path, which worked in the past and only uses --disk_cache internally. It changes the value to a special directory it creates and then reproducibly runs into the "Missing digest" error. This seems like more than a documentation issue.
Just +1 that I'm seeing a similar issue:
11:01:10 ERROR: Foo/BUILD.bazel:11:15: Compiling Foo.c failed: unable to finalize action: Missing digest: <number>/<number> for bazel-out/ios_arm64-opt-ios-arm64-min12.0-applebin_ios-ST-<sha>/bin/path/to/Foo.d
Our setup is a bit different, though, as we're testing with 7.1.1 and:
--remote_download_outputs="all"
How can we have issues downloading here since BwtB is disabled?
Description of the bug:
Gerrit Code Review is in the process of upgrading to Bazel 7.0.0.
All was fine after the upgrade to 7.0.0rc2, see: [1].
However, after upgrading to 7.0.0rc3 we started to see this breakage on our CI:
https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-chrome-latest/40214/console
If I downgrade to 7.0.0rc2, then the build is successful again: [1]
[1] https://gerrit-review.googlesource.com/c/gerrit/+/391534
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I cannot currently reproduce the problem locally ;-(
This command is invoked on the CI:
It creates a shell script and invokes it to publish the Plugin API artifacts to the local Maven repository.
Which operating system are you running Bazel on?
Linux
What is the output of bazel info release?
7.0.0rc3
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
All is fine on Bazel 7.0.0rc2. I am unable to reproduce the problem locally and thus cannot bisect.
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response