Open yeounoh opened 2 years ago
@yeounoh Sorry, we don't have the capacity to help you debug this since we cannot reproduce it (bazel build //tensorflow/tools/pip_package:build_pip_package works for me locally). Would bazel clean --expunge help you?
Hi @meteorcloudy, thanks -- I've tried bazel clean --expunge but it's the same. One thing I noticed is that if I change the spawn_strategy to sandboxed, it works on my local machine (but it still fails in our CI/VM build). Could this be a useful hint?
You should probably check where files like 'bazel-out/k8-opt-exec-50AE0418/bin/external/llvm-project/llvm/config.cppmap' come from, and use bazel cquery to check why they are not in the dependencies.
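A minimal sketch of that inspection, assuming the file config.cppmap is produced by a target labelled @llvm-project//llvm:config and that the failing target is @llvm-project//mlir:Support. Both labels are guesses based on the paths quoted in this thread, so substitute the ones from your own error message. The two helpers only assemble query expressions; the Bazel calls run only when bazel is on the PATH.

```shell
#!/bin/sh
# Hypothetical labels inferred from the paths in this thread; replace with your own.
TARGET="${TARGET:-@llvm-project//mlir:Support}"
DEP="${DEP:-@llvm-project//llvm:config}"

# Build a query expression asking for one dependency path from $1 to $2.
dep_path_query() {
  printf 'somepath(%s, %s)' "$1" "$2"
}

# Build a query expression selecting the C++ compile actions of target $1.
compile_actions_query() {
  printf 'mnemonic("CppCompile", %s)' "$1"
}

if command -v bazel >/dev/null 2>&1; then
  # Is DEP reachable from TARGET in the configured build graph?
  bazel cquery "$(dep_path_query "$TARGET" "$DEP")"
  # Do the compile actions of TARGET list any .cppmap files among their inputs?
  bazel aquery "$(compile_actions_query "$TARGET")" | grep -F '.cppmap'
fi
```

An empty cquery result would suggest the module map's target really is not a dependency of the failing target, while a non-empty aquery match would suggest the file is an input and the problem lies elsewhere.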
I'm running into this too.
I have the following environment variables set: CC=/usr/bin/clang, CXX=/usr/bin/clang++.
Unsetting these two environment variables removed the issue.
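One way to try the workaround above without touching the rest of your shell session is to strip the two variables for a single command. This is only a sketch: the helper name without_cc_cxx is made up for illustration, and whether Bazel also needs its cached toolchain configuration refreshed afterwards is not established in this thread.

```shell
#!/bin/sh
# Run the given command with CC and CXX removed from its environment only;
# the calling shell keeps its own values.
without_cc_cxx() {
  env -u CC -u CXX "$@"
}

# Usage (build target taken from this issue):
#   without_cc_cxx bazel build //tensorflow/tools/pip_package:build_pip_package
```

Because the variables are removed only for the child process, the same shell can be used to compare a build with and without them.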
FYI @oquenchil Seems like Bazel has some C++ module issue when building with clang.
Using the nightly (pre-release) resolved the issue for me.
I am seeing the failure again, will reopen the issue.
@bjacob it worked for me as well. Not sure why setting CC & CXX would cause the issue, though 🤷
At one point (https://github.com/bazelbuild/bazel/issues/13135), --spawn_strategy=sandboxed was required because of zombie state hanging around (to paraphrase from there). Potentially, setting environment flags just ended up avoiding some of that.
The issue seems to be that while the .cppmap files are actually included in the dependencies of the compile action (as seen via aquery), something else in there is upset about it. I think what's going on here is that the strict check is not recognizing those files as being headers.
This bug is also breaking the bazel build of jaxlib on x86_64-darwin with clang (https://github.com/NixOS/nixpkgs/pull/183051#issuecomment-1226635146).
Anyone have a small repro I can run through the debugger? Tensorflow is a bit big and I'm not very familiar with c++, so not sure I would be able to extract a small repro from it.
Adding --spawn_strategy=sandboxed solved this problem for me.
In my experience, clang versions <= 12 avoid this issue, while clang versions >= 15 reproduce it. Maybe this offers a hint.
Update: removing the Bazel feature layering_check also works.
Same issue trying to depend on boost using https://github.com/nelhage/rules_boost, and --spawn_strategy=sandboxed does not help.
These look like undeclared inclusions thrown intentionally by the layering check. If you are affected, you should either fix those errors by adding the required dependencies to the cc_library target or disable the layering check with --features=-layering_check passed on the command line. This is not a Bazel bug as far as I can tell.
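For anyone who wants to try the two flag-based workarounds mentioned in this thread without editing the project's own .bazelrc, here is a small sketch that collects them in a separate rc file. The file name workarounds.bazelrc and the helper add_build_flag are made up for illustration; --bazelrc is a Bazel startup option that loads an extra rc file.

```shell
#!/bin/sh
# Hypothetical file name; any path works as long as it is passed via --bazelrc.
RC_FILE="${RC_FILE:-workarounds.bazelrc}"

# Append "build <flag>" to the rc file unless that exact line is already there.
add_build_flag() {
  line="build $1"
  grep -qxF "$line" "$RC_FILE" 2>/dev/null || printf '%s\n' "$line" >> "$RC_FILE"
}

add_build_flag --features=-layering_check   # turn off the layering check
add_build_flag --spawn_strategy=sandboxed   # reported to help some users here

# Usage (build target taken from this issue):
#   bazel --bazelrc=workarounds.bazelrc build //tensorflow/tools/pip_package:build_pip_package
```

Keeping the flags in their own file makes it easy to drop them again once the missing dependencies are declared. Disabling the check hides the errors rather than fixing them.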
Please feel free to reopen, providing the exact compilation error, the Bazel build target as listed in the BUILD file, and the contents of the *.cc file whose compilation is throwing the error. I'd expect that there is an #include of a header in the source file for which there isn't a direct dependency in the build target providing that header.
I just faced the same issue with gtest and wrote a minimal repro: https://github.com/hypdeb/missing-deps.
Looking at the BUILD file in gtest, we can see that the headers are in fact included: https://github.com/google/googletest/blob/455fcb7773dedc70ab489109fb12d8abc7fd59b6/BUILD.bazel#L86
and that they exist: https://github.com/google/googletest/tree/main/googletest/include/gtest/internal
@oquenchil Removing layering check does not solve the issue.
I ran a further experiment, and building gtest itself with my toolchain fails. This means the issue I'm facing is a different one, as it's not related to transitive dependencies. Please disregard my comments above.
If anyone ends up here with my issue anyways, it was solved by adding the following linker flags:
"-no-canonical-prefixes",
"-L/usr/local/llvm/lib",
to my toolchain.
Original issue here likely fixed by https://github.com/bazelbuild/bazel/pull/21832, please verify with 7.3.0rc1
Description of the bug:
I am building the tensorflow project (commit: 2c6d3ed00f16838831aa460c5668a8466b9f3649) and running into errors about missing dependency declarations. For instance, here is one of the errors:
And here is the corresponding build def (from the build cache):
It doesn't include the missing dependencies, just @llvm-project/llvm:Support; however, the build def of @llvm-project/llvm:Support does contain the missing dependency declarations (so it built successfully, too):
If I manually add the missing deps directly to the @llvm-project/mlirSupport build def, then I can make it work (but it then runs into other similar issues; repeat). I think there is something wrong with my settings that prevents transitive dependencies from being picked up.
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Check out https://github.com/tensorflow/tensorflow.git at commit 2c6d3ed00f16838831aa460c5668a8466b9f3649.
Try: bazel build //tensorflow/tools/pip_package:build_pip_package
I asked my colleagues to try it; some of them have the issue and some don't.
Which operating system are you running Bazel on?
Debian GNU/Linux rodete, Linux 5.15.15-1rodete2-amd64, x86-64
What is the output of bazel info release?
INFO: Options provided by the client: Inherited 'common' options: --isatty=1 --terminal_columns=90
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: Inherited 'build' options: --define framework_shared_object=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.tf_configure.bazelrc: Inherited 'build' options: --action_env PYTHON_BIN_PATH=/usr/local/google/home/yeounoh/anaconda3/envs/torch-xla-1.11/bin/python3 --action_env PYTHON_LIB_PATH=/usr/local/google/home/yeounoh/anaconda3/envs/torch-xla-1.11/lib/python3.8/site-packages --python_path=/usr/local/google/home/yeounoh/anaconda3/envs/torch-xla-1.11/bin/python3
INFO: Reading rc options for 'info' from /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: Inherited 'build' options: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/tfrt/common,tensorflow/core/tfrt/eager,tensorflow/core/tfrt/eager/backends/cpu,tensorflow/core/tfrt/eager/backends/gpu,tensorflow/core/tfrt/eager/core_runtime,tensorflow/core/tfrt/eager/cpp_tests/core_runtime,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/core/tfrt/tpu,tensorflow/core/tfrt/utils
INFO: Found applicable config definition build:short_logs in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:linux in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --copt=-w --host_copt=-w --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++14 --host_cxxopt=-std=c++14 --config=dynamic_kernels --distinct_host_configuration=false --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /usr/local/google/home/yeounoh/git/pytorch/xla/third_party/tensorflow/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
release 5.1.1
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response