rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package

v2.11 bazel error #define "ROCRAND_VERSION" is either ... #49

Closed. kazdam closed this issue 1 year ago

kazdam commented 1 year ago

Hi, I'm trying to build this AUR package, and the build fails under bazel 5.4.0 (full log below). I have the dependencies already installed, and /opt/rocm/include/rocrand/rocrand_version.h does contain the following define. Any ideas?

#define ROCRAND_VERSION 201009
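
One detail worth checking: the error in the log names /opt/rocm/rocrand/include/rocrand_version.h, which is not the same path as the header inspected above (/opt/rocm/include/rocrand/rocrand_version.h). The snippet below is a minimal sketch for checking both locations. It is not TensorFlow's find_rocm_config script; the function names `parse_rocrand_version` and `check_headers` and the regular expression are hypothetical and only approximate the "present and an integer literal" condition described in the error message.

```python
import re
from pathlib import Path

# Both paths appear in this report: the first is the header inspected by hand,
# the second is the one named in the Bazel error message.
CANDIDATE_HEADERS = [
    Path("/opt/rocm/include/rocrand/rocrand_version.h"),
    Path("/opt/rocm/rocrand/include/rocrand_version.h"),
]

# Matches a line such as "#define ROCRAND_VERSION 201009" (assumed format).
DEFINE_RE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+ROCRAND_VERSION[ \t]+(\d+)\b",
    re.MULTILINE,
)


def parse_rocrand_version(header_text):
    """Return ROCRAND_VERSION as an int, or None if no integer define is found."""
    match = DEFINE_RE.search(header_text)
    return int(match.group(1)) if match else None


def check_headers(paths=CANDIDATE_HEADERS):
    """Map each candidate path to its parsed version or a short status string."""
    results = {}
    for path in paths:
        if not path.is_file():
            results[str(path)] = "missing"
            continue
        version = parse_rocrand_version(path.read_text(errors="replace"))
        results[str(path)] = version if version is not None else "no integer define"
    return results


if __name__ == "__main__":
    for path, status in check_headers().items():
        print(f"{path}: {status}")
```

If the second path is reported as missing while the first one parses cleanly, the header itself is fine and the build is simply looking in a different directory than the one where the rocrand package installed it.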

$ yay tensorflow-rocm
2 aur/tensorflow-rocm 2.11.0-1 (+9 1.10) 
    Library for computation using data flow graphs for scalable machine learning (with ROCM)
1 aur/python-tensorflow-rocm 2.11.0-1 (+9 1.10) 
    Library for computation using data flow graphs for scalable machine learning (with ROCM)
==> Packages to install (eg: 1 2 3, 1-3 or ^4)
==> :: Checking for conflicts...
:: Checking for inner conflicts...
 -> Package conflicts found:
 -> Installing python-tensorflow-rocm will remove: python-tensorflow-opt-cuda (python-tensorflow)
 -> Installing tensorflow-rocm will remove: tensorflow-opt-cuda (tensorflow)
 -> Conflicting packages will have to be confirmed manually
[Aur:1]  tensorflow-rocm-2.11.0-1 (python-tensorflow-rocm)
[Aur Make:1]  tensorflow-rocm-2.11.0-1 (tensorflow-rocm)

:: Remove make dependencies after install? [y/N]   1 tensorflow-rocm (tensorflow-rocm python-tensorflow-rocm) (Build Files Exist)
==> Packages to cleanBuild?
==> [N]one [A]ll [Ab]ort [I]nstalled [No]tInstalled or (1 2 3, 1-3, ^4)
==> :: PKGBUILD up to date, Skipping (1/0): tensorflow-rocm
  1 tensorflow-rocm (tensorflow-rocm python-tensorflow-rocm) (Build Files Exist)
==> Diffs to show?
==> [N]one [A]ll [Ab]ort [I]nstalled [No]tInstalled or (1 2 3, 1-3, ^4)
==> :: (1/1) Parsing SRCINFO: tensorflow-rocm (tensorflow-rocm python-tensorflow-rocm)
==> Making package: tensorflow-rocm 2.11.0-1 (Mon 27 Feb 2023 08:24:51 PM EST)
==> Retrieving sources...
  -> Found tensorflow-rocm-2.11.0.tar.gz
  -> Found bazel_nojdk-5.4.0-linux-x86_64
  -> Found fix-c++17-compat.patch
  -> Found fix-cusolver-version.patch
==> Validating source files with sha512sums...
    tensorflow-rocm-2.11.0.tar.gz ... Passed
    bazel_nojdk-5.4.0-linux-x86_64 ... Passed
    fix-c++17-compat.patch ... Passed
    fix-cusolver-version.patch ... Passed
 -> tensorflow-rocm not satisfied, flushing install queue
==> Making package: tensorflow-rocm 2.11.0-1 (Mon 27 Feb 2023 08:24:54 PM EST)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> Retrieving sources...
  -> Found tensorflow-rocm-2.11.0.tar.gz
  -> Found bazel_nojdk-5.4.0-linux-x86_64
  -> Found fix-c++17-compat.patch
  -> Found fix-cusolver-version.patch
==> Validating source files with sha512sums...
    tensorflow-rocm-2.11.0.tar.gz ... Passed
    bazel_nojdk-5.4.0-linux-x86_64 ... Passed
    fix-c++17-compat.patch ... Passed
    fix-cusolver-version.patch ... Passed
==> Removing existing $srcdir/ directory...
==> Extracting sources...
  -> Extracting tensorflow-rocm-2.11.0.tar.gz with bsdtar
  -> Extracting bazel_nojdk-5.4.0-linux-x86_64 with bsdtar
==> Starting prepare()...
bazel 5.4.0
patching file third_party/gpus/cuda_configure.bzl
Hunk #1 succeeded at 715 (offset 5 lines).
==> Sources are ready.
==> Making package: tensorflow-rocm 2.11.0-1 (Mon 27 Feb 2023 08:25:03 PM EST)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> WARNING: Using existing $srcdir/ tree
==> Removing existing $pkgdir/ directory...
==> Starting build()...
Building with rocm and without non-x86-64 optimizations
You have bazel 6.0.0 installed.
Please specify the location of python. [Default is /usr/bin/python3]: 

Found possible Python library paths:
  /usr/lib/python3.10/site-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3.10/site-packages]
Do you wish to download a fresh release of clang? (Experimental) [y/N]: Clang will not be downloaded.

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=mkl_aarch64    # Build with oneDNN and Compute Library for the Arm Architecture (ACL).
    --config=monolithic     # Config for mostly static monolithic build.
    --config=numa           # Build with NUMA support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
    --config=v1             # Build with TensorFlow 1 API instead of TF 2 API.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=nogcp          # Disable GCP support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
Starting local Bazel server and connecting to it...
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false
INFO: Reading rc options for 'build' from /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/usr/bin/python3 --action_env PYTHON_LIB_PATH=/usr/lib/python3.10/site-packages --python_path=/usr/bin/python3 --config=rocm
INFO: Reading rc options for 'build' from /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc:
  'build' options: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_jitrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/tfrt/common,tensorflow/core/tfrt/eager,tensorflow/core/tfrt/eager/backends/cpu,tensorflow/core/tfrt/eager/backends/gpu,tensorflow/core/tfrt/eager/core_runtime,tensorflow/core/tfrt/eager/cpp_tests/core_runtime,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/core/tfrt/tpu,tensorflow/core/tfrt/utils
INFO: Found applicable config definition build:short_logs in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:rocm in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc: --crosstool_top=@local_config_rocm//crosstool:toolchain --define=using_rocm_hipcc=true --define=tensorflow_mkldnn_contraction_kernel=0 --repo_env TF_NEED_ROCM=1 --copt=-Wno-error=unused-result
INFO: Found applicable config definition build:linux in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-unknown-warning --copt=-Wno-array-parameter --copt=-Wno-stringop-overflow --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --distinct_host_configuration=false --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
Loading: 
Loading: 0 packages loaded
INFO: Repository local_config_rocm instantiated at:
  /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/WORKSPACE:15:14: in <toplevel>
  /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/tensorflow/workspace2.bzl:928:19: in workspace
  /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/tensorflow/workspace2.bzl:99:19: in _tf_toolchains
Repository rule rocm_configure defined at:
  /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl:888:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
   Traceback (most recent call last):
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
        _create_local_rocm_repository(repository_ctx)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
        rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
        config = find_rocm_config(repository_ctx, find_rocm_config_script)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
        exec_result = _exec_find_rocm_config(repository_ctx, script_path)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
        return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
        fail(
Error in fail: Repository command failed
ERROR: #define "ROCRAND_VERSION" is either
  not present in file /opt/rocm/rocrand/include/rocrand_version.h OR
  its value is not an integer literal
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/WORKSPACE:15:14: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
        _create_local_rocm_repository(repository_ctx)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
        rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
        config = find_rocm_config(repository_ctx, find_rocm_config_script)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
        exec_result = _exec_find_rocm_config(repository_ctx, script_path)
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
        return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
    File "/home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
        fail(
Error in fail: Repository command failed
ERROR: #define "ROCRAND_VERSION" is either
  not present in file /opt/rocm/rocrand/include/rocrand_version.h OR
  its value is not an integer literal
ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: #define "ROCRAND_VERSION" is either
  not present in file /opt/rocm/rocrand/include/rocrand_version.h OR
  its value is not an integer literal
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: #define "ROCRAND_VERSION" is either
  not present in file /opt/rocm/rocrand/include/rocrand_version.h OR
  its value is not an integer literal
INFO: Elapsed time: 2.679s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
==> ERROR: A failure occurred in build().
    Aborting...
 -> error making: tensorflow-rocm (tensorflow-rocm python-tensorflow-rocm)

acxz commented 1 year ago

Can reproduce

acxz commented 1 year ago

Resolved with 18115b8