Open serikkehva opened 1 year ago
Can you upload the build log (spack-build-out.txt)? The error message should end with the path to this file.
I have added spack-build-out.txt. I had to run the installation one more time, and the error occurred at a different stage of the installation; nevertheless, it seems very similar to the previous one.
Do newer versions of bazel work for you? Is there any reason you're trying to use these older versions?
To be honest, I haven't tried them. Bazel 4.2.2 is preferred for the py-horovod package. After running into problems with 4.2, I switched to versions that are more popular (at least in the issues) to see if the errors match. I will give newer versions a try.
py-horovod doesn't use bazel...
These versions are more popular in the issues because they have more issues. I would suggest using the newest version you can for all packages. They usually include important bug fixes and are better tested.
I don't understand the first sentence. Bazel is in the dependency list of py-horovod and needs to be installed if you specify the tf and keras frameworks.
I have tried installing bazel 5.2.0 with both Java 1.8.0.352 and 11.0.17: spack-build-out_bazel5_2_0.txt, spack-build-out_bazel5_2_0_java11.txt
What I'm saying is that bazel is not a direct dependency of horovod, it's TF and Keras that need it. And TF builds fine with bazel 5.1.1, so I don't understand why you would want to build bazel 3 or 4.
From the error logs:
ERROR: An error occurred during the fetch of repository 'remotejdk11_linux':
Traceback (most recent call last):
File "/tmp/bazel_kw0Daf3Q/out/external/bazel_tools/tools/build_defs/repo/http.bzl", line 100, column 45, in _http_archive_impl
download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: No space left on device
This is probably your issue. Can you try changing your build_stage to a filesystem with more storage? Also note that Bazel will crash on NFS, so be careful which filesystem you choose. Not sure if this will help with TensorFlow where we build in tempfile.mkdtemp(). But it's a start at least.
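For anyone hitting the same "No space left on device" error, here is a minimal sketch for checking how much room is left where temporary build directories would be created. It only uses the Python standard library; the helper names `free_gib` and `report` and the choice of candidate directories are illustrative assumptions, not part of Spack or Bazel. It does not detect NFS, so that still needs to be checked separately.

```python
import os
import shutil
import tempfile


def free_gib(path):
    """Return the free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def report(candidates):
    """Map each existing candidate directory to its free space in GiB."""
    return {p: round(free_gib(p), 1) for p in candidates if os.path.isdir(p)}


if __name__ == "__main__":
    # tempfile.gettempdir() honors TMPDIR, so this is the directory
    # under which tempfile.mkdtemp() would create its directories.
    candidates = [tempfile.gettempdir(), os.path.expanduser("~")]
    for path, gib in report(candidates).items():
        print(f"{path}: {gib} GiB free")
```

If the temporary directory reports only a few GiB free, pointing the build at a larger local filesystem is probably worth trying first.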
Changing build_stage unfortunately didn't help, but setting TMPDIR did. I managed to install bazel 5.2.0 by setting both build_stage and TMPDIR to custom directories.
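As a rough illustration of why TMPDIR matters here: Python's `tempfile` module consults the TMPDIR environment variable when it picks its default directory, so a child process started with a different TMPDIR creates its temporary directories there. The sketch below demonstrates this with a throwaway directory. The helpers `env_with_tmpdir` and `child_tempdir` are hypothetical names for illustration, and the commented-out `spack install` call is an assumption about how one might launch the build, not a tested recipe.

```python
import os
import subprocess
import sys
import tempfile


def env_with_tmpdir(tmpdir):
    """Return a copy of the current environment with TMPDIR set to `tmpdir`."""
    env = os.environ.copy()
    env["TMPDIR"] = tmpdir
    return env


def child_tempdir(tmpdir):
    """Ask a fresh Python process where tempfile would place temporary files."""
    result = subprocess.run(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env_with_tmpdir(tmpdir),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Throwaway directory standing in for a custom scratch location.
    scratch = tempfile.mkdtemp(prefix="scratch-")
    print(child_tempdir(scratch))
    # The same environment could be handed to a build (illustrative only):
    # subprocess.run(["spack", "install", "bazel@5.2.0"], env=env_with_tmpdir(scratch))
```

In a shell, exporting TMPDIR before running the install has the same effect on the processes it starts.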
Glad you got it working! Not sure if there's a better way to choose the default TMPDIR location. We could set it to the build stage for Spack, but the NFS issue makes it tricky to choose a default.
Steps to reproduce the issue
Error message
Information on your system
Additional information
spack-build-out.txt Detailed error info (-verbose) in error message. I have tried building this with both:
and
I have also tried several Bazel versions, all resulting in a similar error. @adamjstewart @aweits
General information
I have run spack debug report and reported the version of Spack/Python/Platform
I have run spack maintainers <name-of-the-package> and @mentioned any maintainers