Hi, I ran into this as well while trying to get TF3D to work with CUDA 11 and an RTX 3090.
I got it to compile by manually editing the .bazelrc file that the configure.sh script generates.
For reference, I changed mine to:
build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true
build:manylinux2010cuda11 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010-nvcc-cuda11:toolchain
build --spawn_strategy=standalone
build --strategy=Genrule=standalone
build -c opt
build --action_env TF_HEADER_DIR="/usr/local/lib/python3.6/dist-packages/tensorflow/include"
build --action_env TF_SHARED_LIBRARY_DIR="/usr/local/lib/python3.6/dist-packages/tensorflow"
build --action_env TF_SHARED_LIBRARY_NAME="libtensorflow_framework.so.2"
build --action_env TF_NEED_CUDA="1"
build --action_env TF_CUDA_VERSION="11.0"
build --action_env TF_CUDNN_VERSION="8"
build --action_env CUDNN_INSTALL_PATH="/usr/lib/x86_64-linux-gnu"
build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda"
build --config=cuda
test --config=cuda
build --config=manylinux2010cuda11
test --config=manylinux2010cuda11
Note the change of toolchain in the second line to use cuda11; TF_CUDA_VERSION is also set to 11.0 and TF_CUDNN_VERSION to 8.
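If you would rather not hand-edit the generated file every time configure.sh runs, the two version edits can be scripted. This is only a rough sketch: `set_action_env` and `retarget_cuda` are hypothetical helper names, and the sample input (CUDA 10.1, cuDNN 7) is an assumed starting point, not the exact file configure.sh writes. It does not touch the crosstool_top line, which still has to be changed by hand.

```python
import re


def set_action_env(bazelrc_text, name, value):
    """Set `build --action_env NAME="..."` to value, appending the line if absent."""
    pattern = re.compile(
        r'^(build --action_env ' + re.escape(name) + r'=)"[^"]*"$',
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda match: match.group(1) + '"' + value + '"', bazelrc_text
    )
    if count == 0:
        # The variable was not present at all, so add it at the end.
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += 'build --action_env {}="{}"\n'.format(name, value)
    return new_text


def retarget_cuda(bazelrc_text, cuda_version, cudnn_version):
    """Rewrite the CUDA and cuDNN version variables, leaving other lines alone."""
    text = set_action_env(bazelrc_text, "TF_CUDA_VERSION", cuda_version)
    return set_action_env(text, "TF_CUDNN_VERSION", cudnn_version)


# Assumed sample input for illustration only.
sample = (
    'build --action_env TF_NEED_CUDA="1"\n'
    'build --action_env TF_CUDA_VERSION="10.1"\n'
    'build --action_env TF_CUDNN_VERSION="7"\n'
)
print(retarget_cuda(sample, "11.0", "8"))
```

To apply it to a real checkout you would read .bazelrc, pass the text through `retarget_cuda`, and write it back.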
I haven't tested it thoroughly yet, but I managed to build a wheel and it imports correctly in Python.
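A quick sanity check that may be worth running before building: confirm that the installed TensorFlow reports the same CUDA and cuDNN versions as the .bazelrc. The sketch below assumes TF 2.3 or later, where `tf.sysconfig.get_build_info()` is available; `check_build_info` is a hypothetical helper, and the exact-string comparison is a simplification.

```python
def check_build_info(build_info, expected_cuda="11.0", expected_cudnn="8"):
    """Return a list of mismatch messages; an empty list means everything matches."""
    problems = []
    if not build_info.get("is_cuda_build", False):
        problems.append("TensorFlow was not built with CUDA")
    expectations = (
        ("cuda_version", expected_cuda),
        ("cudnn_version", expected_cudnn),
    )
    for key, expected in expectations:
        actual = build_info.get(key)
        if actual != expected:
            problems.append(
                "{}: expected {}, found {}".format(key, expected, actual)
            )
    return problems


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        info = dict(tf.sysconfig.get_build_info())
        for line in check_build_info(info) or ["build info matches"]:
            print(line)
```

If this reports a mismatch, the custom op would likely be compiled against different CUDA libraries than the ones TensorFlow loads at runtime.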
I tried installing the tensorflow:2.4.0-custom-op-gpu-ubuntu16 image to compile an op for TF2.4, but I got the error below. It seems to be looking for a hardcoded CUDA 10.1, even though the TF2.4 in the image is compiled with CUDA 11.0.
Is there any workaround? I'm not even sure where to begin patching the code.
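One way to find where to start patching is to search the checkout for the hardcoded version string. This is a minimal sketch, not a known fix: `find_hardcoded` is a hypothetical helper, and the list of file suffixes is a guess at where build configuration tends to live.

```python
import os


def find_hardcoded(root, needle="10.1", suffixes=(".bazelrc", ".sh", ".bzl")):
    """Return (path, line number, line) for each config line containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                # Skip unreadable files instead of aborting the scan.
                continue
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_hardcoded("."):
        print("{}:{}: {}".format(path, lineno, line))
```

Run it from the root of the custom-op checkout; each hit is a candidate line to compare against the CUDA 11 settings above.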