Open shkarupa-alex opened 2 years ago
But if I manually replace the crosstool_top value in .bazelrc with "@local_config_cuda//crosstool:toolchain", the build continues until another error occurs:
ERROR: /home/alex/jupyter/build/addons/tensorflow_addons/custom_ops/seq2seq/BUILD:7:18: Compiling tensorflow_addons/custom_ops/seq2seq/cc/kernels/beam_search_ops_gpu.cu.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF ... (remaining 61 arguments skipped)
Traceback (most recent call last):
  File "external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc", line 269, in <module>
    sys.exit(main())
  File "external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc", line 256, in main
    return InvokeNvcc(leftover, log=args.cuda_log)
  File "external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc", line 207, in InvokeNvcc
    nvccopts += r'-gencode=arch=compute_%s,\"code=sm_%s\" ' % (
TypeError: not all arguments converted during string formatting
It can be fixed by removing the last ", capability" here: https://github.com/tensorflow/addons/blob/master/build_deps/toolchains/gpu/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl#L208
After that, the build is successful.
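For anyone wondering why the extra argument breaks things: the format string has two %s placeholders but the tuple passes three values, and Python refuses to drop the leftover one. Below is a minimal standalone sketch of that behaviour. The helper names and the "75" capability are made up for illustration and are not part of the wrapper script.

```python
# Standalone reproduction of the formatting error from the traceback.
# gencode_flag_buggy / gencode_flag_fixed are hypothetical helpers;
# only the format string is taken from the wrapper script.

def gencode_flag_buggy(capability):
    # Two %s placeholders but three arguments: raises TypeError.
    return r'-gencode=arch=compute_%s,\"code=sm_%s\" ' % (
        capability, capability, capability)


def gencode_flag_fixed(capability):
    # Two %s placeholders and two arguments: formats cleanly.
    return r'-gencode=arch=compute_%s,\"code=sm_%s\" ' % (
        capability, capability)


try:
    gencode_flag_buggy("75")  # "75" is an example compute capability
    error_message = None
except TypeError as exc:
    error_message = str(exc)

fixed_flag = gencode_flag_fixed("75")
print(error_message)
print(fixed_flag)
```

The first print should show the same message as the traceback above, and the second shows the flag string that would be appended to nvccopts for that capability.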
Can you try to submit a PR?
@seanpmorgan Do we still need these build_deps with the new 2.9 toolchain?
This is not a proper PR, so I'll just put in a patch. I don't know if this has any side effects. I guess it will fail in the official manylinux build process.
From 2f32601be926472f142bffbe820a28d05682219a Mon Sep 17 00:00:00 2001
From: Bernhard Bermeitinger <bernhard.bermeitinger@unisg.ch>
Date: Fri, 27 May 2022 10:52:13 +0200
Subject: [PATCH] fix compilation on cuda
Signed-off-by: Bernhard Bermeitinger <bernhard.bermeitinger@unisg.ch>
---
.../crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl | 2 +-
configure.py | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/build_deps/toolchains/gpu/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl b/build_deps/toolchains/gpu/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl
index affc0be..3b5fd82 100644
--- a/build_deps/toolchains/gpu/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl
+++ b/build_deps/toolchains/gpu/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl
@@ -205,7 +205,7 @@ def InvokeNvcc(argv, log=False):
x.replace(".", "") for x in supported_cuda_compute_capabilities])
for capability in supported_cuda_compute_capabilities[:-1]:
nvccopts += r'-gencode=arch=compute_%s,\"code=sm_%s\" ' % (
- capability, capability, capability)
+ capability, capability)
if supported_cuda_compute_capabilities:
capability = supported_cuda_compute_capabilities[-1]
nvccopts += r'-gencode=arch=compute_%s,code=\"sm_%s,compute_%s\" ' % (
diff --git a/configure.py b/configure.py
index 0d65e88..24fd2d5 100644
--- a/configure.py
+++ b/configure.py
@@ -185,7 +185,7 @@ def configure_cuda():
write("build --config=cuda")
write("build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true")
write(
- "build:cuda --crosstool_top=@ubuntu20.04-gcc9_manylinux2014-cuda11.2-cudnn8.1-tensorrt7.2_config_cuda//crosstool:toolchain"
+ "build:cuda --crosstool_top=@local_config_cuda//crosstool:toolchain"
)
--
2.36.1
Save it as fix_cuda.patch and apply it with patch -p1 -i fix_cuda.patch.
Ugly fix on ubuntu 20.04
sudo mkdir -p /dt9/usr
sudo ln -s /usr/bin /dt9/usr/bin
Is @ubuntu20.04-gcc9_manylinux2014-cuda11.2-cudnn8.1-tensorrt7.2_config_cuda only intended for building from a Docker image?
It is mainly for producing manylinux2014-compatible wheels. But as we don't want to maintain too many build configs, we rely on this.
@bhack, this issue is still not resolved. It is still required to manually replace the crosstool_top value in .bazelrc with "@local_config_cuda//crosstool:toolchain". I think it should either be set automatically when building outside Docker, or be specified via an argument to the "configure" command and documented in the README.
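To make the second option concrete, here is a rough sketch of how a configure script could pick the crosstool_top line from a command-line switch. This is only an illustration under assumptions: the --local-cuda-toolchain flag and the crosstool_line and parse_args helpers are hypothetical and do not exist in configure.py. Only the two toolchain labels come from this thread.

```python
import argparse

# The two labels are the ones discussed in this thread.
MANYLINUX_CROSSTOOL = (
    "@ubuntu20.04-gcc9_manylinux2014-cuda11.2-cudnn8.1-tensorrt7.2"
    "_config_cuda//crosstool:toolchain"
)
LOCAL_CROSSTOOL = "@local_config_cuda//crosstool:toolchain"


def crosstool_line(use_local_toolchain):
    """Return the .bazelrc line for the chosen CUDA crosstool."""
    target = LOCAL_CROSSTOOL if use_local_toolchain else MANYLINUX_CROSSTOOL
    return "build:cuda --crosstool_top=" + target


def parse_args(argv):
    """Parse a hypothetical switch for builds outside the Docker image."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--local-cuda-toolchain",  # hypothetical flag, not in configure.py
        action="store_true",
        help="use @local_config_cuda instead of the manylinux toolchain",
    )
    return parser.parse_args(argv)


args = parse_args(["--local-cuda-toolchain"])
print(crosstool_line(args.local_cuda_toolchain))
```

Keeping the manylinux toolchain as the default would leave the official wheel build unchanged, while a local build opts in explicitly.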
@shkarupa-alex It was closed automatically because GitHub's "magic" keywords linked it to your PR.
System information
Describe the bug
I've downloaded and built TF 2.9.1 from source with GPU support, with no errors. But an error occurred while building tensorflow_addons (0.17.0) from source.
Code to reproduce the issue