rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package
Build fails with "An error occurred during the fetch of repository 'local_config_rocm'" and "Cannot find rocm library amdhip64" #68

Open prmbittencourt opened 2 months ago

prmbittencourt commented 2 months ago

Trying on a fully updated EndeavourOS machine. GPU is AMD Radeon RX 6750 XT. Stable Diffusion WebUI works with my GPU, so I know that ROCm is installed properly.

INFO: Repository local_config_rocm instantiated at:
  /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/WORKSPACE:84:14: in <toplevel>
  /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/tensorflow/workspace2.bzl:918:19: in workspace
  /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/tensorflow/workspace2.bzl:112:19: in _tf_toolchains
Repository rule rocm_configure defined at:
  /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl:833:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
   Traceback (most recent call last):
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 811, column 38, in _rocm_autoconf_impl
        _create_local_rocm_repository(repository_ctx)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 600, column 27, in _create_local_rocm_repository
        rocm_libs = _find_libs(repository_ctx, rocm_config, hipfft_or_rocfft, miopen_path, rccl_path, bash_bin)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 366, column 34, in _find_libs
        return _select_rocm_lib_paths(repository_ctx, libs_paths, bash_bin)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 328, column 36, in _select_rocm_lib_paths
        auto_configure_fail("Cannot find rocm library %s" % name)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 153, column 9, in auto_configure_fail
        fail("\n%sROCm Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail: 
ROCm Configuration Error: Cannot find rocm library amdhip64
ERROR: /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/WORKSPACE:84:14: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 811, column 38, in _rocm_autoconf_impl
        _create_local_rocm_repository(repository_ctx)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 600, column 27, in _create_local_rocm_repository
        rocm_libs = _find_libs(repository_ctx, rocm_config, hipfft_or_rocfft, miopen_path, rccl_path, bash_bin)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 366, column 34, in _find_libs
        return _select_rocm_lib_paths(repository_ctx, libs_paths, bash_bin)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 328, column 36, in _select_rocm_lib_paths
        auto_configure_fail("Cannot find rocm library %s" % name)
    File "/home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/third_party/gpus/rocm_configure.bzl", line 153, column 9, in auto_configure_fail
        fail("\n%sROCm Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail: 
ROCm Configuration Error: Cannot find rocm library amdhip64
INFO: Repository rules_cc instantiated at:
  /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/WORKSPACE:88:14: in <toplevel>
  /home/paulo/.cache/yay/tensorflow-rocm/src/tensorflow-2.15.0-rocm/tensorflow/workspace1.bzl:19:28: in workspace
  /home/paulo/.cache/bazel/_bazel_paulo/806fd5b1a340c1ea45cad759f2540887/external/rules_cuda/cuda/dependencies.bzl:72:18: in rules_cuda_dependencies
  /home/paulo/.cache/bazel/_bazel_paulo/806fd5b1a340c1ea45cad759f2540887/external/rules_cuda/cuda/dependencies.bzl:35:17: in _rules_cc
Repository rule http_archive defined at:
  /home/paulo/.cache/bazel/_bazel_paulo/806fd5b1a340c1ea45cad759f2540887/external/bazel_tools/tools/build_defs/repo/http.bzl:372:31: in <toplevel>
ERROR: Skipping '//tensorflow:libtensorflow_cc.so': no such package '@local_config_rocm//rocm': 
ROCm Configuration Error: Cannot find rocm library amdhip64
ERROR: no such package '@local_config_rocm//rocm': 
ROCm Configuration Error: Cannot find rocm library amdhip64
INFO: Elapsed time: 2.381s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
    currently loading: tensorflow ... (2 packages)
==> ERROR: A failure occurred in build().
    Aborting...
 -> error making: tensorflow-rocm-exit status 4
 -> Failed to install the following packages. Manual intervention is required:
tensorflow-opt-rocm - exit status 4
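
The failing step in the log above is a search for the shared library amdhip64. A minimal Python sketch for checking whether such a file is visible on disk follows. The search roots (the ROCM_PATH environment variable, then /opt/rocm) and the subdirectories are assumptions made for illustration; they are not taken from rocm_configure.bzl, and the function names are hypothetical.

```python
import os
from pathlib import Path


def candidate_rocm_roots(env=None):
    """Return ROCm roots to search: $ROCM_PATH first (if set), then /opt/rocm."""
    env = os.environ if env is None else env
    roots = []
    if env.get("ROCM_PATH"):
        roots.append(Path(env["ROCM_PATH"]))
    roots.append(Path("/opt/rocm"))  # assumed default install prefix
    return roots


def find_library(name, roots, subdirs=("lib", "lib64", "hip/lib")):
    """Return every lib<name>.so* file found under the given roots."""
    hits = []
    for root in roots:
        for sub in subdirs:  # assumed layout, not read from the build rules
            directory = Path(root) / sub
            if directory.is_dir():
                hits.extend(sorted(directory.glob(f"lib{name}.so*")))
    return hits


if __name__ == "__main__":
    matches = find_library("amdhip64", candidate_rocm_roots())
    if matches:
        for path in matches:
            print(path)
    else:
        print("libamdhip64.so not found under the searched ROCm roots")
```

If this prints a path while the build still fails, the library exists but the configure step is probably looking somewhere else; if it prints nothing, the ROCm install may live under a different prefix than the ones assumed here.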

SotchNam commented 2 months ago

Any updates?

prmbittencourt commented 2 months ago

@SotchNam No changes, I still get the exact same error.

Melon-Bread commented 2 months ago

I have the same issue with my 6700 XT.

uberkael commented 1 month ago

Same here.

prmbittencourt commented 1 month ago

The problem was fixed for me by using a Python 3.10 virtualenv instead of the native Python 3.12.

uberkael commented 1 month ago

That didn't work for me.

7t2 commented 1 month ago

On Arch Linux, I get the exact same error even in a Python 3.10 venv with TF_PYTHON_VERSION=3.10. Was there anything else you did, @prmbittencourt?

The furthest I have gotten so far is by setting TF_PYTHON_VERSION=3.10 outside of any venv, which just leaves me with the local_config_rocm errors.
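
Since the reports here differ on whether a venv or TF_PYTHON_VERSION was used, it can help to confirm which interpreter is actually active. The sketch below compares the running interpreter with the value of TF_PYTHON_VERSION. It only reads the variable named in this thread; how the build itself uses that variable is not shown here, and the helper names are hypothetical.

```python
import os
import sys


def parse_version(text):
    """Turn '3.10' or '3.10.14' into a (major, minor) tuple."""
    parts = text.strip().split(".")
    return int(parts[0]), int(parts[1])


def version_mismatch(requested, running=None):
    """Return a message if requested and running versions differ, else None."""
    running = tuple(sys.version_info[:2]) if running is None else tuple(running)
    wanted = parse_version(requested)
    if wanted != running:
        return (f"TF_PYTHON_VERSION asks for {wanted[0]}.{wanted[1]} "
                f"but this interpreter is {running[0]}.{running[1]}")
    return None


if __name__ == "__main__":
    requested = os.environ.get("TF_PYTHON_VERSION")
    if requested is None:
        print("TF_PYTHON_VERSION is not set")
    else:
        print(version_mismatch(requested) or "versions agree")
```

Running it once inside the venv and once outside shows whether the two setups described above really differ in the interpreter they expose.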

prmbittencourt commented 1 month ago

@7t2 The only thing I did differently was use pyenv to set the Python version in the webui's directory to 3.10.14.