rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package

rocblas version file not found #40

Closed acxz closed 1 year ago

acxz commented 2 years ago
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/tensorflow/runtime/archive/093ed77f7d50f75b376f40a71ea86e08cedb8b80.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
INFO: Repository local_config_rocm instantiated at:
  /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/WORKSPACE:15:14: in <toplevel>
  /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/tensorflow/workspace2.bzl:870:19: in workspace
  /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/tensorflow/workspace2.bzl:102:19: in _tf_toolchains
Repository rule rocm_configure defined at:
  /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl:888:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
   Traceback (most recent call last):
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
        _create_local_rocm_repository(repository_ctx)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
        rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
        config = find_rocm_config(repository_ctx, find_rocm_config_script)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
        exec_result = _exec_find_rocm_config(repository_ctx, script_path)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
        return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
        fail(
Error in fail: Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/WORKSPACE:15:14: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
        _create_local_rocm_repository(repository_ctx)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
        rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
        config = find_rocm_config(repository_ctx, find_rocm_config_script)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
        exec_result = _exec_find_rocm_config(repository_ctx, script_path)
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
        return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
    File "/home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
        fail(
Error in fail: Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
INFO: Repository rules_cc instantiated at:
  /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/WORKSPACE:19:14: in <toplevel>
  /home/acxz/vcs/git/github/rocm-arch/tensorflow-rocm/src/tensorflow-2.9.1-rocm/tensorflow/workspace1.bzl:11:28: in workspace
  /home/acxz/.cache/bazel/_bazel_acxz/f55c36c9b610634c7e1fcf6127ca0721/external/rules_cuda/cuda/dependencies.bzl:72:18: in rules_cuda_dependencies
  /home/acxz/.cache/bazel/_bazel_acxz/f55c36c9b610634c7e1fcf6127ca0721/external/rules_cuda/cuda/dependencies.bzl:35:17: in _rules_cc
Repository rule http_archive defined at:
  /home/acxz/.cache/bazel/_bazel_acxz/f55c36c9b610634c7e1fcf6127ca0721/external/bazel_tools/tools/build_defs/repo/http.bzl:353:31: in <toplevel>
ERROR: Skipping '//tensorflow:libtensorflow.so': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
INFO: Elapsed time: 115.639s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
    currently loading: tensorflow/tools/pip_package ... (2 packages)
acxz commented 2 years ago

patched with bcad9d5

vrbouza commented 1 year ago

As of today (24 Feb. 2023), with the latest package update, I see errors similar to those reported here (concerning rocblas), even though the package itself (rocblas 5.4.3-1) is installed. The errors are:

  /home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl:888:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
   Traceback (most recent call last):
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
                _create_local_rocm_repository(repository_ctx)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
                rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
                config = find_rocm_config(repository_ctx, find_rocm_config_script)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
                exec_result = _exec_find_rocm_config(repository_ctx, script_path)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
                return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
                fail(
Error in fail: Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: /home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/WORKSPACE:15:14: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
                _create_local_rocm_repository(repository_ctx)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
                rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
                config = find_rocm_config(repository_ctx, find_rocm_config_script)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
                exec_result = _exec_find_rocm_config(repository_ctx, script_path)
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
                return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
        File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
                fail(
Error in fail: Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
INFO: Elapsed time: 3,428s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
    currently loading: tensorflow ... (2 packages)
==> ERROR: A failure occurred in build().
    Aborting...
 -> error making: tensorflow-rocm
vrbouza commented 1 year ago

After checking the latest updates and the code in the tensorflow repo, it seems that ROCm reorganised its files, and the rocblas-version.h file (the one that Bazel cannot find) is now in /opt/rocm/include/rocblas/internal instead of /opt/rocm/rocblas/include/internal.
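
To check which layout a given ROCm install uses, something like the following minimal sketch could help. It is not the code TensorFlow runs during configuration. The function name `find_rocblas_version_header` and the `/opt/rocm` default root are assumptions for illustration; the old-layout paths are copied from the error message, and the new-layout path is the one reported above.

```python
from pathlib import Path
from typing import Optional, Sequence

# Relative paths quoted in the "rocblas version file not found" error (old layout).
OLD_LAYOUT = (
    "rocblas/include/rocblas-version.h",
    "rocblas/include/internal/rocblas-version.h",
)
# Relative path reported in this thread for newer ROCm releases.
NEW_LAYOUT = ("include/rocblas/internal/rocblas-version.h",)


def find_rocblas_version_header(
    rocm_root: str, candidates: Sequence[str] = OLD_LAYOUT + NEW_LAYOUT
) -> Optional[Path]:
    """Return the first candidate header that exists under rocm_root, else None.

    Hypothetical helper for illustration only.
    """
    root = Path(rocm_root)
    for rel in candidates:
        path = root / rel
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # Assumes ROCm is installed under /opt/rocm, as in the paths quoted above.
    found = find_rocblas_version_header("/opt/rocm")
    print(found if found else "rocblas-version.h not found in any known layout")
```

If the header only turns up when `NEW_LAYOUT` is included, the install uses the reorganised tree, and a lookup limited to the two old paths will fail as in the log above.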

This does not seem to be fixed in TensorFlow v2.11.0, but it has been fixed in more recent versions; see the two lines from v2.11 and master:

Maybe a more recent TensorFlow is needed to compile it against the latest ROCm versions :/

acxz commented 1 year ago

Nice observations @vrbouza! Updated the package and the issue should be resolved with 18115b8