As of today (24 Feb. 2023), after the latest package update, I see errors similar to those reported here (concerning rocblas), even though the package itself (rocblas 5.4.3-1) is installed. The errors are:
/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl:888:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
Traceback (most recent call last):
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx, find_rocm_config_script)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
exec_result = _exec_find_rocm_config(repository_ctx, script_path)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: /home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/WORKSPACE:15:14: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 869, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 547, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 395, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx, find_rocm_config_script)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 373, column 41, in find_rocm_config
exec_result = _exec_find_rocm_config(repository_ctx, script_path)
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/gpus/rocm_configure.bzl", line 369, column 19, in _exec_find_rocm_config
return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
File "/home/X/.cache/yay/tensorflow-rocm/src/tensorflow-2.11.0-rocm/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: rocblas version file not found in ['rocblas/include/rocblas-version.h', 'rocblas/include/internal/rocblas-version.h']
INFO: Elapsed time: 3,428s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
currently loading: tensorflow ... (2 packages)
==> ERROR: A failure occurred in build().
Aborting...
-> error making: tensorflow-rocm
After checking the latest updates and the code in the tensorflow repo, it seems that ROCm's files were reorganised, and the rocblas-version.h file (the one that bazel cannot find) is now in /opt/rocm/include/rocblas/internal instead of /opt/rocm/rocblas/include/internal.
This does not seem to be fixed in TensorFlow v2.11.0, but it has been solved in more recent versions, see the two lines from v2.11 and master:
Maybe a more recent tensorflow is needed to compile against the latest ROCm versions :/
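
In case it helps anyone check which layout their ROCm install uses, here is a minimal sketch. It is not the code from rocm_configure.bzl or from the TensorFlow find-config script; the function name and the default root of /opt/rocm are my own assumptions. The first two candidate paths are the ones listed in the error message, and the third is the newer location described above.

```python
from pathlib import Path
from typing import Optional

# Locations of rocblas-version.h relative to the ROCm root.
# The first two come from the error message in the build log;
# the third is the newer layout described in this comment.
CANDIDATES = [
    "rocblas/include/rocblas-version.h",
    "rocblas/include/internal/rocblas-version.h",
    "include/rocblas/internal/rocblas-version.h",
]


def find_rocblas_version_header(rocm_root: str = "/opt/rocm") -> Optional[Path]:
    """Return the first candidate path that exists as a file, or None.

    The helper name and the default root are hypothetical, chosen only
    for this illustration.
    """
    root = Path(rocm_root)
    for rel in CANDIDATES:
        candidate = root / rel
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_rocblas_version_header()
    print(found if found else "rocblas-version.h not found in any known location")
```

If this reports only the third path, the install has the newer layout, which would match the two search paths in the error coming up empty.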