tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.
https://tensorflow.org/hub
Apache License 2.0

Bug: Building TensorFlow 2.15/2.16 from source is not possible: missing TensorRT #907

Closed junapsantos closed 7 months ago

junapsantos commented 8 months ago

What happened?

I am trying to build TensorFlow from source (with GPU support), because the TensorFlow installed from pip prints "To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX512_FP16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags" and the code takes around 4 minutes to start.

I am using Bazelisk to provide Bazel and then following the instructions from the official page: https://www.tensorflow.org/install/source#configure_the_build

When running ./configure, I specify that I do not want TensorRT.

However, when I try to build TensorFlow I get errors about TensorRT. I do not want to use TensorRT, since it is not compatible with CUDA 12.2.

My versions: Python 3.10.12, CUDA 12.2, NVIDIA driver 535, cuDNN 8.9.7, Ubuntu 22.04
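
For reference, a condensed sketch of the steps I am running (the release branch and the Bazelisk install method shown here are just illustrative; the build command is the exact one from further below):

# Bazelisk fetches the Bazel version pinned in TensorFlow's .bazelversion
# (installed here via npm; any Bazelisk install method should behave the same)
sudo npm install -g @bazel/bazelisk

# Clone TensorFlow and switch to the release branch I want to build
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r2.16

# Interactive configuration: "y" to CUDA, "n" to ROCm and TensorRT (full transcript below)
./configure

# Build the pip package; this is the step that fails with the TensorRT errors
bazel build //tensorflow/tools/pip_package:wheel --repo_env=WHEEL_NAME=tensorflow --config=cuda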

Relevant code

My configure options:
./configure
You have bazel 6.5.0 installed.
Please specify the location of python. [Default is /media/junasantos/7A0A2B166AF42595/Juna/virtual_env/bin/python3]: 

Found possible Python library paths:
  /media/junasantos/7A0A2B166AF42595/Juna/virtual_env/lib/python3.10/site-packages
Please input the desired Python library path to use.  Default is [/media/junasantos/7A0A2B166AF42595/Juna/virtual_env/lib/python3.10/site-packages]

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Found CUDA 12.2 in:
    /usr/local/cuda-12.2/targets/x86_64-linux/lib
    /usr/local/cuda-12.2/targets/x86_64-linux/include
Found cuDNN 8 in:
    /usr/local/cuda-12.2/targets/x86_64-linux/lib
    /usr/local/cuda-12.2/targets/x86_64-linux/include

Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Each capability can be specified as "x.y" or "compute_xy" to include both virtual and binary GPU code, or as "sm_xy" to only include the binary code.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 

Do you want to use clang as CUDA compiler? [Y/n]: y
Clang will be used as CUDA compiler.

Please specify clang path that to be used as host compiler. [Default is /usr/lib/llvm-17/bin/clang]: 

You have Clang 17.0.6 installed.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -Wno-sign-compare]: 

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

The errors appear when I run the following:
bazel build //tensorflow/tools/pip_package:wheel --repo_env=WHEEL_NAME=tensorflow --config=cuda

Relevant log output

WARNING: The following configs were expanded more than once: [cuda_clang, cuda, tensorrt]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
INFO: Reading 'startup' options from /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --windows_enable_symlinks
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=180
INFO: Reading rc options for 'build' from /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --features=-force_no_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility
INFO: Reading rc options for 'build' from /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/media/junasantos/7A0A2B166AF42595/Juna/virtual_env/bin/python3 --action_env PYTHON_LIB_PATH=/media/junasantos/7A0A2B166AF42595/Juna/virtual_env/lib/python3.10/site-packages --python_path=/media/junasantos/7A0A2B166AF42595/Juna/virtual_env/bin/python3 --action_env CUDA_TOOLKIT_PATH=/usr/local/cuda-12.2 --action_env TF_CUDA_COMPUTE_CAPABILITIES=3.5,7.0 --action_env LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:/usr/local/cuda-12.2/lib64:/usr/local/cuda-12.2/lib64 --config=cuda_clang --action_env CLANG_CUDA_COMPILER_PATH=/usr/lib/llvm-17/bin/clang --copt=-Wno-gnu-offsetof-extensions --config=cuda_clang
INFO: Found applicable config definition build:short_logs in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:cuda_clang in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --config=cuda --config=tensorrt --action_env=TF_CUDA_CLANG=1 --@local_config_cuda//:cuda_compiler=clang --repo_env=TF_CUDA_COMPUTE_CAPABILITIES=sm_60,sm_70,sm_80,sm_89,compute_90
INFO: Found applicable config definition build:cuda in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
INFO: Found applicable config definition build:tensorrt in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --repo_env TF_NEED_TENSORRT=1
INFO: Found applicable config definition build:cuda_clang in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --config=cuda --config=tensorrt --action_env=TF_CUDA_CLANG=1 --@local_config_cuda//:cuda_compiler=clang --repo_env=TF_CUDA_COMPUTE_CAPABILITIES=sm_60,sm_70,sm_80,sm_89,compute_90
INFO: Found applicable config definition build:cuda in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
INFO: Found applicable config definition build:tensorrt in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --repo_env TF_NEED_TENSORRT=1
INFO: Found applicable config definition build:cuda in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
INFO: Found applicable config definition build:linux in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --copt=-Wswitch --copt=-Werror=switch --copt=-Wno-error=unused-but-set-variable --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
WARNING: The following configs were expanded more than once: [cuda_clang, cuda, tensorrt]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
INFO: Repository local_config_tensorrt instantiated at:
  /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/WORKSPACE:86:14: in <toplevel>
  /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/tensorflow/workspace2.bzl:928:19: in workspace
  /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/tensorflow/workspace2.bzl:107:23: in _tf_toolchains
Repository rule tensorrt_configure defined at:
  /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/tensorrt/tensorrt_configure.bzl:320:37: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_tensorrt':
   Traceback (most recent call last):
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/tensorrt/tensorrt_configure.bzl", line 300, column 38, in _tensorrt_configure_impl
                _create_local_tensorrt_repository(repository_ctx)
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/tensorrt/tensorrt_configure.bzl", line 159, column 30, in _create_local_tensorrt_repository
                config = find_cuda_config(repository_ctx, ["cuda", "tensorrt"])
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/gpus/cuda_configure.bzl", line 693, column 26, in find_cuda_config
                exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_cuda_config] + cuda_libraries)
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
                fail(
Error in fail: Repository command failed
Could not find any NvInferVersion.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
        'local/cuda/extras/CUPTI/include'
        'targets/x86_64-linux/include'
of:
        '/lib'
        '/lib/i386-linux-gnu'
        '/lib/x86_64-linux-gnu'
        '/lib32'
        '/usr'
        '/usr/lib/x86_64-linux-gnu/libfakeroot'
        '/usr/local/cuda'
        '/usr/local/cuda/targets/x86_64-linux/lib'
ERROR: /media/junasantos/7A0A2B166AF42595/Juna/tensorflow/WORKSPACE:86:14: fetching tensorrt_configure rule //external:local_config_tensorrt: Traceback (most recent call last):
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/tensorrt/tensorrt_configure.bzl", line 300, column 38, in _tensorrt_configure_impl
                _create_local_tensorrt_repository(repository_ctx)
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/tensorrt/tensorrt_configure.bzl", line 159, column 30, in _create_local_tensorrt_repository
                config = find_cuda_config(repository_ctx, ["cuda", "tensorrt"])
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/gpus/cuda_configure.bzl", line 693, column 26, in find_cuda_config
                exec_result = execute(repository_ctx, [python_bin, repository_ctx.attr._find_cuda_config] + cuda_libraries)
        File "/media/junasantos/7A0A2B166AF42595/Juna/tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
                fail(
Error in fail: Repository command failed
Could not find any NvInferVersion.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
        'local/cuda/extras/CUPTI/include'
        'targets/x86_64-linux/include'
of:
        '/lib'
        '/lib/i386-linux-gnu'
        '/lib/x86_64-linux-gnu'
        '/lib32'
        '/usr'
        '/usr/lib/x86_64-linux-gnu/libfakeroot'
        '/usr/local/cuda'
        '/usr/local/cuda/targets/x86_64-linux/lib'
ERROR: Skipping '//tensorflow/tools/pip_package:wheel': no such package '@local_config_tensorrt//': Repository command failed
Could not find any NvInferVersion.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
        'local/cuda/extras/CUPTI/include'
        'targets/x86_64-linux/include'
of:
        '/lib'
        '/lib/i386-linux-gnu'
        '/lib/x86_64-linux-gnu'
        '/lib32'
        '/usr'
        '/usr/lib/x86_64-linux-gnu/libfakeroot'
        '/usr/local/cuda'
        '/usr/local/cuda/targets/x86_64-linux/lib'
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_tensorrt//': Repository command failed
Could not find any NvInferVersion.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
        'local/cuda/extras/CUPTI/include'
        'targets/x86_64-linux/include'
of:
        '/lib'
        '/lib/i386-linux-gnu'
        '/lib/x86_64-linux-gnu'
        '/lib32'
        '/usr'
        '/usr/lib/x86_64-linux-gnu/libfakeroot'
        '/usr/local/cuda'
        '/usr/local/cuda/targets/x86_64-linux/lib'
INFO: Elapsed time: 2.754s
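
As far as I can tell from the log above, TensorRT is still being configured because .tf_configure.bazelrc (written by ./configure) contains --config=cuda_clang, and build:cuda_clang in TensorFlow's .bazelrc expands to --config=tensorrt, which sets TF_NEED_TENSORRT=1 regardless of my answer during ./configure. A quick way to confirm this on a checkout (plain grep, nothing TensorFlow-specific):

# Show that ./configure wrote --config=cuda_clang into the generated rc file
grep -n "cuda_clang" .tf_configure.bazelrc

# Show that build:cuda_clang in the checked-in .bazelrc pulls in --config=tensorrt,
# and that build:tensorrt sets TF_NEED_TENSORRT=1
grep -n "^build:cuda_clang" .bazelrc
grep -n "^build:tensorrt" .bazelrc

# An unverified local workaround would be to remove "--config=tensorrt" from the
# build:cuda_clang line in .bazelrc (or to install the TensorRT headers so that
# NvInferVersion.h can be found), but I have not confirmed either here.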

tensorflow_hub Version

other (please specify)

TensorFlow Version

other (please specify)

Other libraries

My versions: Python 3.10.12, CUDA 12.2, NVIDIA driver 535, cuDNN 8.9.7, Ubuntu 22.04

I can find and use GPUs with the TensorFlow installed via pip install tensorflow.
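
For example, the GPUs are visible to the pip-installed TensorFlow with a quick check like this (the output is specific to my machine):

# Sanity check that the pip-installed TensorFlow can see the GPUs
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"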

Python Version

3.x

OS

Linux

WGierke commented 7 months ago

This repository concerns the tensorflow_hub library, not the tensorflow library. For the latter, please ask at https://github.com/tensorflow/tensorflow.

google-ml-butler[bot] commented 7 months ago

Are you satisfied with the resolution of your issue?