elixir-nx / xla

Pre-compiled XLA extension
Apache License 2.0
Build fails with ROCm on Gentoo Linux #81

Closed Eiji7 closed 1 month ago

Eiji7 commented 3 months ago

Hi, I am running Gentoo Linux with the latest updates.

I was fighting with ROCm support and ended up with this package set:

dev-libs/rccl
dev-python/numpy
dev-python/pip
dev-util/roctracer
sci-libs/hipBLAS
sci-libs/hipFFT
sci-libs/hipRAND
sci-libs/hipSOLVER
sci-libs/hipSPARSE
sci-libs/miopen
sys-devel/clang
sys-devel/gcc

with the following USE flags for gcc:

 * Found these USE flags for sys-devel/gcc-13.2.1_p20240210:
 U I
 - - ada                            : Build the ADA language (GNAT) frontend
 - - cet                            : (Restricted to >=sys-devel/gcc-10)
                                      Enable support for control flow hijacking protection. On amd64, this provides Intel
                                      Control Flow Enforcement Technology (CET). On arm64, this provides Branch Target
                                      Identification (BTI) and Pointer Authentication Code (PAC) support. This is only effective
                                      on amd64 or arm64. Only provides benefits on newer CPUs. For Intel, the CPU must be at
                                      least as new as Tiger Lake. For AMD, it must be at least as new as Zen 3. This is harmless
                                      on older CPUs, but provides no benefit either. For ARM64, PAC was introduced in armv8.3-a,
                                      and BTI was introduced in armv8.5-a. When combined with USE=hardened on amd64, GCC will
                                      set -fcf-protection by default when building software. The effect is minimal on systems
                                      which do not support it, other than a possible small increase in codesize for the NOPs.
                                      The generated code is therefore compatible with i686 at the earliest. On arm64, GCC will
                                      set -mbranch-protection=standard by default when building software. 
 - - d                              : Enable support for the D programming language
 - - debug                          : Enables GCC's 'checking' facility via --enable-checking=yes,extra,rtl. This adds checks to
                                      various compiler passes for integrity and input validation. This can help catch possible
                                      miscompilations early as well as latent bugs which could become real problems in future,
                                      but at the cost of slower compile times when using GCC. Unrelated to backtraces. 
 - - default-stack-clash-protection : Build packages with stack clash protection on by default as a hardening measure. This
                                      enables -fstack-clash-protection by default which protects against large memory
                                      allocations allowing stack smashing. May cause slightly increased codesize, but modern
                                      compilers have been adapted to optimize well for this case, as this mitigation is now
                                      quite common. See
                                      https://developers.redhat.com/blog/2020/05/22/stack-clash-mitigation-in-gcc-part-3 and
                                      https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt. 
 - - default-znow                   : Request full relocation on start from ld.so by default. This sets the -z,now (BIND_NOW)
                                      flag by default on all linker invocations. By resolving all dynamic symbols at application
                                      startup, parts of the program can be made read-only as a hardening measure. This is
                                      closely related to RELRO which is also separately enabled by default. In some applications
                                      with many unresolved symbols (heavily plugin based, for example), startup time may be
                                      impacted. 
 - - doc                            : Add extra documentation (API, Javadoc, etc). It is recommended to enable per package
                                      instead of globally
 + + fortran                        : Add support for fortran
 - - go                             : Build the GCC Go language frontend.
 - - graphite                       : Add support for the framework for loop optimizations based on a polyhedral intermediate
                                      representation
 - - hardened                       : Activate default security enhancements for toolchain (gcc, glibc, binutils)
 - - jit                            : Enable libgccjit so other applications can embed gcc for Just-In-Time compilation.
 - - lto                            : Build using Link Time Optimizations (LTO). Note that GCC is always built with support for
                                      building other programs with LTO. This USE flag is for whether GCC itself is built and
                                      optimized with LTO. 
 - - modula2                        : Build the GCC Modula-2 language frontend.
 + + nls                            : Add Native Language Support (using gettext - GNU locale utilities)
 + + objc                           : Build support for the Objective C code language
 + + objc++                         : Build support for the Objective C++ language
 - - objc-gc                        : Build support for the Objective C code language Garbage Collector
 + + openmp                         : Build support for the OpenMP (support parallel computing), requires >=sys-devel/gcc-4.2
                                      built with USE="openmp"
 - - pgo                            : Build GCC using Profile Guided Optimization (PGO). GCC will build itself and then analyze
                                      the just-built binary and then rebuild itself using the data obtained from analysis of
                                      codepaths taken. It does not affect whether GCC itself supports PGO when building other
                                      software. This substantially increases the build time needed for building GCC itself. 
 + + sanitize                       : Build support for various sanitizer functions (ASAN/TSAN/etc...) to find runtime problems
                                      in applications. 
 + + ssp                            : Build packages with stack smashing protection on by default
 - - systemtap                      : enable systemtap static probe points
 - - test                           : Enable dependencies and/or preparations necessary to run tests (usually controlled by
                                      FEATURES=test but can be toggled independently)
 - - valgrind                       : Enable annotations for accuracy. May slow down runtime slightly. Safe to use even if not
                                      currently using dev-debug/valgrind
 - - vanilla                        : Do not add extra patches which change default behaviour; DO NOT USE THIS ON A GLOBAL SCALE
                                      as the severity of the meaning changes drastically
 - - vtv                            : Build support for virtual table verification (a C++ hardening feature). This does not
                                      control whether GCC defaults to using VTV> Note that actually using VTV breaks ABI and
                                      hence the whole system must be built with -fvtable-verify. 
 - - zstd                           : Enable support for ZSTD compression

and these environment variables:

export EXLA_TARGET="rocm"
export ROCM_PATH="/usr"
export XLA_BUILD="true"
export XLA_TARGET="rocm"

Regardless of what I should and can install, I ran into lots of weird problems:

  1. TF_ROCM_AMDGPU_TARGETS is hard-coded to "gfx900,gfx906,gfx908,gfx90a,gfx1030" with no way to override it. Not only does this build support for many GPUs that are rarely all needed, but I also have to edit the xla source code to support newer cards (mine uses gfx1100).

  2. rocm_configure.bzl supports a ROCM_PATH other than /opt/rocm or /opt/rocm-version only in theory. In practice it requires some libraries to live in hip and roctracer sub-directories, which is not the case when ROCm packages are installed in /usr (for example /usr/lib64/libamdhip64.so). The file tries a few path variants, which is nice, but it should not assume a sub-directory. I would not be surprised if every library needed such a sub-directory, but this affects only 2 of the 12 libs.

  3. xla does not specify a list of dependencies. Reading through all of those error messages and still not ending up with a working setup is truly exhausting :face_exhaling:

  4. The only known successful builds use old gcc versions, which is a serious problem on prod machines.

You may have issues with newer and older versions of GCC. XLA builds are known to work with GCC versions between 7.5 and 9.3.
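
To illustrate point 1, here is a minimal sketch of how a target list could be taken from the environment, with the quoted default as a fallback. It is not how the xla package actually builds its Bazel flags. The helper names `amdgpu_targets` and `bazel_action_env` are hypothetical, and only the variable name, the default list and the `--action_env` flag shape come from this report.

```python
import os

# Default list quoted in point 1 of this report.
DEFAULT_TARGETS = "gfx900,gfx906,gfx908,gfx90a,gfx1030"


def _parse(raw):
    """Split a comma-separated list, dropping blanks and duplicates."""
    seen = []
    for target in raw.split(","):
        target = target.strip()
        if target and target not in seen:
            seen.append(target)
    return seen


def amdgpu_targets(env=None):
    """Return the user override if one is set, else the default list."""
    env = os.environ if env is None else env
    override = _parse(env.get("TF_ROCM_AMDGPU_TARGETS", ""))
    return override or _parse(DEFAULT_TARGETS)


def bazel_action_env(env=None):
    """Format the list the way it appears in the bazel command line."""
    joined = ",".join(amdgpu_targets(env))
    return '--action_env=TF_ROCM_AMDGPU_TARGETS="%s"' % joined
```

With `TF_ROCM_AMDGPU_TARGETS=gfx1100` exported, `bazel_action_env()` would produce a flag for that single target, so only the GPU actually installed gets built.
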

Meanwhile, the emerge command returns:

# emerge -pv sys-devel/gcc:9.5.0

These are the packages that would be merged, in order:

Calculating dependencies... done!
Dependency resolution took 0.80 s (backtrack: 0/20).

!!! All ebuilds that could satisfy "sys-devel/gcc:9.5.0" have been masked.
!!! One of the following masked packages is required to complete your request:
- sys-devel/gcc-9.5.0::gentoo (masked by: package.mask)
/var/db/repos/gentoo/profiles/package.mask:
# Sam James <sam@gentoo.org> (2023-11-19)
# GCC 10 and older no longer receive upstream support or fixes for
# bugs. Please switch to a newer GCC version using gcc-config.
# The lowest supported version of GCC is GCC 11.

For more information, see the MASKED PACKAGES section in the emerge
man page or refer to the Gentoo Handbook.

Of course nobody expects support for the 14.0.1_pre* releases of GCC, but requiring a compiler that is 5 major versions back, and excluding even the latest updates of the 9.x branch, is a critical issue for prod machines.

Anyway, I tried GCC 8.5 as well as 13.2.1, with clang 16 and 17, but none of these combinations compiled successfully.
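
For reference, the quoted range can be checked mechanically. The sketch below is only an illustration of the "between 7.5 and 9.3" statement, not part of xla. The function names are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
import re

# Range quoted above: "known to work with GCC versions between 7.5 and 9.3".
# Assumption: both ends are inclusive.
MIN_GCC = (7, 5)
MAX_GCC = (9, 3)


def parse_gcc_version(text):
    """Extract (major, minor) from strings like '8.5.0' or '13.2.1_p20240210'."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized GCC version: %r" % text)
    return int(match.group(1)), int(match.group(2))


def in_known_good_range(text):
    """True when the version falls inside the quoted range."""
    return MIN_GCC <= parse_gcc_version(text) <= MAX_GCC
```

Under these assumptions, 8.5 falls inside the range while 9.5.0 and 13.2.1_p20240210 fall outside it. The version string could come from `gcc -dumpfullversion`, for example.
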

First, the logs from before fixing rocm_configure.bzl:

$ mix compile
==> earmark_parser
Compiling 2 files (.xrl)
Compiling 1 file (.yrl)
Compiling 3 files (.erl)
Compiling 46 files (.ex)
Generated earmark_parser app
==> nimble_parsec
Compiling 4 files (.ex)
Generated nimble_parsec app
==> makeup
Compiling 15 files (.ex)
Generated makeup app
==> makeup_elixir
Compiling 6 files (.ex)
Generated makeup_elixir app
==> makeup_erlang
Compiling 4 files (.ex)
Generated makeup_erlang app
==> ex_doc
Compiling 26 files (.ex)
Generated ex_doc app
==> elixir_make
Compiling 6 files (.ex)
Generated elixir_make app
==> xla
Compiling 2 files (.ex)
Generated xla app
mkdir -p $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb && \
        cd $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb && \
        git init && \
        git remote add origin https://github.com/openxla/xla.git && \
        git fetch --depth 1 origin 771e38178340cbaaef8ff20f44da5407c15092cb && \
        git checkout FETCH_HEAD && \
        rm $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelversion
Initialized empty Git repository in $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.git/
From https://github.com/openxla/xla
 * branch            771e38178340cbaaef8ff20f44da5407c15092cb -> FETCH_HEAD
Note: switching to 'FETCH_HEAD'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 771e381 [XLA:GPU] Check tensor_float_32_execution_enabled() in Triton codegen too
rm -f $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/extension && \
        ln -s "$HOME/xla/extension" $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/extension && \
        cd $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb && \
        bazel build --define "framework_shared_object=false" -c opt   --config=rocm --action_env=HIP_PLATFORM=amd --action_env=TF_ROCM_AMDGPU_TARGETS="gfx1100" //xla/extension:xla_extension && \
        mkdir -p $HOME/.cache/xla/0.6.0/cache/build/ && \
        cp -f $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/bazel-bin/xla/extension/xla_extension.tar.gz $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Reading 'startup' options from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --windows_enable_symlinks
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --features=-force_no_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility
INFO: Found applicable config definition build:short_logs in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:rocm in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --crosstool_top=@local_config_rocm//crosstool:toolchain --define=using_rocm_hipcc=true --define=tensorflow_mkldnn_contraction_kernel=0 --repo_env TF_NEED_ROCM=1 --config=no_tfrt
INFO: Found applicable config definition build:no_tfrt in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/ir,tensorflow/compiler/mlir/tfrt/ir/mlrt,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ifrt,tensorflow/compiler/mlir/tfrt/tests/mlrt,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_jitrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/compiler/mlir/tfrt/transforms/mlrt,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/runtime_fallback/test,tensorflow/core/runtime_fallback/test/gpu,tensorflow/core/runtime_fallback/test/saved_model,tensorflow/core/runtime_fallback/test/testdata,tensorflow/core/tfrt/stubs,tensorflow/core/tfrt/tfrt_session,tensorflow/core/tfrt/mlrt,tensorflow/core/tfrt/mlrt/attribute,tensorflow/core/tfrt/mlrt/kernel,tensorflow/core/tfrt/mlrt/bytecode,tensorflow/core/tfrt/mlrt/interpreter,tensorflow/compiler/mlir/tfrt/translate/mlrt,tensorflow/compiler/mlir/tfrt/translate/mlrt/testdata,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/
core/tfrt/tpu,tensorflow/core/tfrt/utils,tensorflow/core/tfrt/utils/debug,tensorflow/core/tfrt/saved_model/python,tensorflow/core/tfrt/graph_executor/python,tensorflow/core/tfrt/saved_model/utils
INFO: Found applicable config definition build:linux in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --copt=-Wswitch --copt=-Werror=switch --copt=-Wno-error=unused-but-set-variable --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
Loading: 
DEBUG: $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/third_party/repo.bzl:132:14: 
Warning: skipping import of repository 'llvm-raw' because it already exists.
Loading: 0 packages loaded
INFO: Repository local_config_rocm instantiated at:
  $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/WORKSPACE:19:15: in <toplevel>
  $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/workspace2.bzl:90:19: in workspace
  $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/workspace2.bzl:626:19: in workspace
  $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/workspace2.bzl:80:19: in _tf_toolchains
Repository rule rocm_configure defined at:
  $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl:832:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
   Traceback (most recent call last):
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 810, column 38, in _rocm_autoconf_impl
                _create_local_rocm_repository(repository_ctx)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 599, column 27, in _create_local_rocm_repository
                rocm_libs = _find_libs(repository_ctx, rocm_config, hipfft_or_rocfft, miopen_path, rccl_path, bash_bin)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 366, column 34, in _find_libs
                return _select_rocm_lib_paths(repository_ctx, libs_paths, bash_bin)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 328, column 36, in _select_rocm_lib_paths
                auto_configure_fail("Cannot find rocm library %s" % name)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 153, column 9, in auto_configure_fail
                fail("\n%sROCm Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail: 
ROCm Configuration Error: Cannot find rocm library amdhip64
ERROR: $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/WORKSPACE:19:15: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 810, column 38, in _rocm_autoconf_impl
                _create_local_rocm_repository(repository_ctx)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 599, column 27, in _create_local_rocm_repository
                rocm_libs = _find_libs(repository_ctx, rocm_config, hipfft_or_rocfft, miopen_path, rccl_path, bash_bin)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 366, column 34, in _find_libs
                return _select_rocm_lib_paths(repository_ctx, libs_paths, bash_bin)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 328, column 36, in _select_rocm_lib_paths
                auto_configure_fail("Cannot find rocm library %s" % name)
        File "$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/tsl/third_party/gpus/rocm_configure.bzl", line 153, column 9, in auto_configure_fail
                fail("\n%sROCm Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail: 
ROCm Configuration Error: Cannot find rocm library amdhip64
ERROR: Skipping '//xla/extension:xla_extension': no such package '@local_config_rocm//rocm': 
ROCm Configuration Error: Cannot find rocm library amdhip64
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_rocm//rocm': 
ROCm Configuration Error: Cannot find rocm library amdhip64
INFO: Elapsed time: 34.221s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
make: *** [Makefile:26: $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

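The "Cannot find rocm library amdhip64" failure above is a path lookup problem. The sketch below shows a lookup that tries both flat and sub-directory layouts under ROCM_PATH. It is not the actual rocm_configure.bzl logic (which is Starlark). The directory list and the function names are hypothetical, based only on the layouts mentioned in this report.

```python
import os

# Hypothetical layouts: flat (/usr/lib64) first, then the hip and
# roctracer sub-directories mentioned in point 2 of this report.
LIB_DIRS = ("lib64", "lib", "hip/lib", "roctracer/lib")


def candidate_paths(rocm_path, name):
    """List every location to try for lib<name>.so under rocm_path."""
    filename = "lib%s.so" % name
    return [os.path.join(rocm_path, d, filename) for d in LIB_DIRS]


def find_rocm_lib(rocm_path, name, exists=os.path.exists):
    """Return the first existing candidate, or fail with a similar message."""
    for path in candidate_paths(rocm_path, name):
        if exists(path):
            return path
    raise FileNotFoundError("Cannot find rocm library %s" % name)
```

With ROCM_PATH="/usr" and the library installed as /usr/lib64/libamdhip64.so, `find_rocm_lib("/usr", "amdhip64")` would return that path rather than requiring a hip sub-directory. The `exists` parameter is there only so the lookup can be exercised without touching the file system.
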
After applying the fix mentioned above:

$ mix compile
rm -f $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/extension && \
        ln -s "$HOME/xla/extension" $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/extension && \
        cd $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb && \
        bazel build --define "framework_shared_object=false" -c opt   --config=rocm --action_env=HIP_PLATFORM=amd --action_env=TF_ROCM_AMDGPU_TARGETS="gfx1100" //xla/extension:xla_extension && \
        mkdir -p $HOME/.cache/xla/0.6.0/cache/build/ && \
        cp -f $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/bazel-bin/xla/extension/xla_extension.tar.gz $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz
INFO: Reading 'startup' options from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --windows_enable_symlinks
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --features=-force_no_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility
INFO: Found applicable config definition build:short_logs in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:rocm in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --crosstool_top=@local_config_rocm//crosstool:toolchain --define=using_rocm_hipcc=true --define=tensorflow_mkldnn_contraction_kernel=0 --repo_env TF_NEED_ROCM=1 --config=no_tfrt
INFO: Found applicable config definition build:no_tfrt in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/ir,tensorflow/compiler/mlir/tfrt/ir/mlrt,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ifrt,tensorflow/compiler/mlir/tfrt/tests/mlrt,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_jitrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/compiler/mlir/tfrt/transforms/mlrt,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/runtime_fallback/test,tensorflow/core/runtime_fallback/test/gpu,tensorflow/core/runtime_fallback/test/saved_model,tensorflow/core/runtime_fallback/test/testdata,tensorflow/core/tfrt/stubs,tensorflow/core/tfrt/tfrt_session,tensorflow/core/tfrt/mlrt,tensorflow/core/tfrt/mlrt/attribute,tensorflow/core/tfrt/mlrt/kernel,tensorflow/core/tfrt/mlrt/bytecode,tensorflow/core/tfrt/mlrt/interpreter,tensorflow/compiler/mlir/tfrt/translate/mlrt,tensorflow/compiler/mlir/tfrt/translate/mlrt/testdata,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/
core/tfrt/tpu,tensorflow/core/tfrt/utils,tensorflow/core/tfrt/utils/debug,tensorflow/core/tfrt/saved_model/python,tensorflow/core/tfrt/graph_executor/python,tensorflow/core/tfrt/saved_model/utils
INFO: Found applicable config definition build:linux in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --copt=-Wswitch --copt=-Werror=switch --copt=-Wno-error=unused-but-set-variable --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
Loading: 
Loading: 
Loading: 0 packages loaded
Analyzing: target //xla/extension:xla_extension (1 packages loaded, 0 targets configured)
Analyzing: target //xla/extension:xla_extension (35 packages loaded, 14 targets configured)
Analyzing: target //xla/extension:xla_extension (68 packages loaded, 217 targets configured)
Analyzing: target //xla/extension:xla_extension (163 packages loaded, 11268 targets configured)
Analyzing: target //xla/extension:xla_extension (165 packages loaded, 11806 targets configured)
Analyzing: target //xla/extension:xla_extension (178 packages loaded, 12976 targets configured)
Analyzing: target //xla/extension:xla_extension (179 packages loaded, 13496 targets configured)
Analyzing: target //xla/extension:xla_extension (179 packages loaded, 13496 targets configured)
Analyzing: target //xla/extension:xla_extension (180 packages loaded, 14375 targets configured)
Analyzing: target //xla/extension:xla_extension (181 packages loaded, 15160 targets configured)
INFO: Analyzed target //xla/extension:xla_extension (182 packages loaded, 15807 targets configured).
INFO: Found 1 target...
[0 / 71] [Prepa] BazelWorkspaceStatusAction stable-status.txt ... (3 actions, 2 running)
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/llvm-project/llvm/BUILD.bazel:191:11: Compiling llvm/lib/Demangle/Demangle.cpp [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @llvm-project//llvm:Demangle) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 76 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling gzwrite.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling inftrees.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/llvm-project/llvm/BUILD.bazel:191:11: Compiling llvm/lib/Demangle/RustDemangle.cpp [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @llvm-project//llvm:Demangle) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 76 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling inflate.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling inffast.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling adler32.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling trees.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling crc32.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling deflate.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling zutil.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling gzclose.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling infback.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling compress.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling gzlib.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/zlib/BUILD.bazel:5:11: Compiling uncompr.c [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @zlib//:zlib) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 37 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
Target //xla/extension:xla_extension failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/llvm-project/mlir/BUILD.bazel:9224:10 Middleman _middlemen/@llvm-project_S_Smlir_Cmlir-tblgen-BazelCppSemantics_build_arch_k8-opt-exec-50AE0418 failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @llvm-project//llvm:Demangle) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 76 arguments skipped)
INFO: Elapsed time: 29.007s, Critical Path: 0.09s
INFO: 65 processes: 63 internal, 2 local.
FAILED: Build did NOT complete successfully
make: *** [Makefile:26: $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

Somehow it does not detect gcc properly. Surprisingly, its specific location is not in the PATH variable by default:

# same with 9.5.0 version
export PATH="$PATH:/usr/libexec/gcc/x86_64-pc-linux-gnu/13/"
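Before editing PATH, it may help to ask gcc itself where it expects its internal compiler binaries. The following is a minimal sketch, assuming the `gcc` on PATH is the same one the Bazel wrapper ends up calling; `report_gcc_helpers` is a hypothetical helper name, not part of XLA or Bazel.

```shell
# Sketch: report where gcc expects cc1 and cc1plus to live.
# report_gcc_helpers is a hypothetical helper name used for illustration only.
report_gcc_helpers() {
  if ! command -v gcc >/dev/null 2>&1; then
    echo "gcc not found in PATH"
    return 0
  fi
  for helper in cc1 cc1plus; do
    # -print-prog-name prints a full path when gcc can locate the helper,
    # and only the bare name when it cannot.
    printf '%s -> %s\n' "$helper" "$(gcc -print-prog-name="$helper")"
  done
}

report_gcc_helpers
```

If a helper is printed as a bare name rather than a full path, gcc cannot locate it on its own, which would be consistent with the `cannot execute ‘cc1plus’` errors above.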

The final result is:

$ mix compile
rm -f $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/extension && \
        ln -s "$HOME/xla/extension" $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/extension && \
        cd $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb && \
        bazel build --define "framework_shared_object=false" -c opt   --config=rocm --action_env=HIP_PLATFORM=amd --action_env=TF_ROCM_AMDGPU_TARGETS="gfx1100" //xla/extension:xla_extension && \
        mkdir -p $HOME/.cache/xla/0.6.0/cache/build/ && \
        cp -f $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/bazel-bin/xla/extension/xla_extension.tar.gz $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz
INFO: Reading 'startup' options from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --windows_enable_symlinks
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --features=-force_no_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility
INFO: Found applicable config definition build:short_logs in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:rocm in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --crosstool_top=@local_config_rocm//crosstool:toolchain --define=using_rocm_hipcc=true --define=tensorflow_mkldnn_contraction_kernel=0 --repo_env TF_NEED_ROCM=1 --config=no_tfrt
INFO: Found applicable config definition build:no_tfrt in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/ir,tensorflow/compiler/mlir/tfrt/ir/mlrt,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ifrt,tensorflow/compiler/mlir/tfrt/tests/mlrt,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_jitrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/compiler/mlir/tfrt/transforms/mlrt,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/runtime_fallback/test,tensorflow/core/runtime_fallback/test/gpu,tensorflow/core/runtime_fallback/test/saved_model,tensorflow/core/runtime_fallback/test/testdata,tensorflow/core/tfrt/stubs,tensorflow/core/tfrt/tfrt_session,tensorflow/core/tfrt/mlrt,tensorflow/core/tfrt/mlrt/attribute,tensorflow/core/tfrt/mlrt/kernel,tensorflow/core/tfrt/mlrt/bytecode,tensorflow/core/tfrt/mlrt/interpreter,tensorflow/compiler/mlir/tfrt/translate/mlrt,tensorflow/compiler/mlir/tfrt/translate/mlrt/testdata,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/
core/tfrt/tpu,tensorflow/core/tfrt/utils,tensorflow/core/tfrt/utils/debug,tensorflow/core/tfrt/saved_model/python,tensorflow/core/tfrt/graph_executor/python,tensorflow/core/tfrt/saved_model/utils
INFO: Found applicable config definition build:linux in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --copt=-Wswitch --copt=-Werror=switch --copt=-Wno-error=unused-but-set-variable --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file $HOME/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
Loading: 
Loading: 
Loading: 0 packages loaded
Analyzing: target //xla/extension:xla_extension (0 packages loaded, 0 targets configured)
INFO: Analyzed target //xla/extension:xla_extension (1 packages loaded, 2 targets configured).
INFO: Found 1 target...
[0 / 4] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/llvm-project/llvm/BUILD.bazel:191:11: Compiling llvm/lib/Demangle/MicrosoftDemangleNodes.cpp [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @llvm-project//llvm:Demangle) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 76 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
In file included from external/llvm-project/llvm/lib/Demangle/MicrosoftDemangleNodes.cpp:13:
external/llvm-project/llvm/include/llvm/Demangle/MicrosoftDemangleNodes.h:16:10: fatal error: array: No such file or directory
   16 | #include <array>
      |          ^~~~~~~
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/llvm-project/llvm/BUILD.bazel:360:11: Compiling llvm/lib/TableGen/Parser.cpp [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @llvm-project//llvm:TableGen) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 81 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
In file included from external/llvm-project/llvm/include/llvm/ADT/STLExtras.h:20,
                 from external/llvm-project/llvm/include/llvm/TableGen/Parser.h:16,
                 from external/llvm-project/llvm/lib/TableGen/Parser.cpp:9:
external/llvm-project/llvm/include/llvm/ADT/ADL.h:12:10: fatal error: type_traits: No such file or directory
   12 | #include <type_traits>
      |          ^~~~~~~~~~~~~
compilation terminated.
ERROR: $HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/external/llvm-project/llvm/BUILD.bazel:191:11: Compiling llvm/lib/Demangle/RustDemangle.cpp [for tool] failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @llvm-project//llvm:Demangle) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 76 arguments skipped)
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:159: SyntaxWarning: invalid escape sequence '\.'
  re.search('\.cpp$|\.cc$|\.c$|\.cxx$|\.C$', f)]
$HOME/.cache/bazel/_bazel_eiji/24d5313c51521fe9b48548641c43cdae/execroot/xla/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
In file included from external/llvm-project/llvm/lib/Demangle/RustDemangle.cpp:14:
external/llvm-project/llvm/include/llvm/Demangle/Demangle.h:12:10: fatal error: cstddef: No such file or directory
   12 | #include <cstddef>
      |          ^~~~~~~~~
compilation terminated.
Target //xla/extension:xla_extension failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.269s, Critical Path: 0.07s
INFO: 47 processes: 44 internal, 3 local.
FAILED: Build did NOT complete successfully
make: *** [Makefile:26: $HOME/.cache/xla/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

However, the header files already exist within the gcc installation: /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13. What surprised me is the large number of Loading: lines without any other information. In the last build attempt the number of such lines decreased to just 2. Maybe there are still 2 things that are not found or installed?
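One way to check whether that header directory is actually on the compiler's search path is to list the directories gcc reports in verbose mode. This is a minimal sketch, assuming `gcc` is the compiler the wrapper invokes; `list_cxx_include_dirs` is a hypothetical helper that only filters the `-v` output it receives on stdin.

```shell
# Sketch: print the directories a compiler searches for C++ headers.
# list_cxx_include_dirs is a hypothetical helper; it reads `-v` output on stdin.
list_cxx_include_dirs() {
  # Keep the block between the two marker lines, then drop the markers
  # themselves and strip the leading indentation.
  sed -n '/^#include <...> search starts here:/,/^End of search list\./p' |
    sed -e '1d' -e '$d' -e 's/^ *//'
}

# Run against the local gcc when it is available.
if command -v gcc >/dev/null 2>&1; then
  gcc -x c++ -E -v /dev/null 2>&1 | list_cxx_include_dirs
fi
```

If the g++-v13 directory is missing from the printed list, errors such as `fatal error: array: No such file or directory` would be expected even though the headers are installed.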

So far I have been unmasking unsupported packages, compiling a few configurations of gcc and clang, and even editing source files. I'm a bit tired today, and it would be a big relief if somebody could help me with this environment setup. Have I missed something? Are new AMD GPUs even supported? Or maybe there are other problems in the source files? Should I maybe try some unreleased branches?

Here is some information about my setup:

$ asdf current
bazel           6.1.2           $HOME/.tool-versions
elixir          ref:v1.16.2     $HOME/.tool-versions
erlang          26.2.3          $HOME/.tool-versions
imagemagick     7.1.1-29        $HOME/.tool-versions
java            temurin-21.0.2+13.0.LTS $HOME/.tool-versions
nodejs          21.6.2          $HOME/.tool-versions
php             8.3.4           $HOME/.tool-versions
postgres        16.2            $HOME/.tool-versions
ruby            3.3.0           $HOME/.tool-versions
rust            1.76.0          $HOME/.tool-versions
sqlite          3.45.2          $HOME/.tool-versions

josevalim commented 3 months ago

There are other issues about compiling ROCm which you can investigate. Unfortunately, those issues are really coming from Bazel, so there may not be much we can do from this project.

Eiji7 commented 3 months ago

@josevalim For now I don't have any ideas, but I can work on my setup if you have some. I saw that not many people use ROCm here, so I can do testing if you can guide me on what to do next.

jonatanklosko commented 3 months ago

Yeah, it's really Bazel and XLA. ROCm is definitely not as prioritized and widely used, so there seem to be more issues with getting the build environment right. I would try building the binary within Docker, see https://github.com/elixir-nx/xla/issues/63#issuecomment-1817744344.

jalberto commented 1 month ago

@jonatanklosko may I ask how you are able to build the binary in Docker?

I am trying to reproduce it on a Linux machine using the provided Dockerfile and I get a ton of errors. I am able to solve some, but I reach a point where it seems I need to start modifying code in the libraries, not only the environment.

jonatanklosko commented 1 month ago

@jalberto interesting, the build itself doesn't require an actual GPU, so the Docker build should be reproducible. What kind of errors are you getting?

jalberto commented 1 month ago

@jonatanklosko I tried in a clean env with a fresh clone of the repo. I also removed the build and .cache dirs before each run and used build/build.sh rocm:

1st error, easy to solve:

[2/2] STEP 19/21: COPY Makefile Makefile.win ./
Error: building at STEP "COPY Makefile Makefile.win ./": checking on sources under "/home/ja/Projects/Misc/tmp/xla": copier: stat: "/Makefile.win": no such file or directory

After that fix, we are on the right path: Successfully tagged localhost/xla-rocm:latest

After a while:

[1,954 / 6,477] Compiling xla/service/gpu/runtime3/custom_call_thunk.cc; 4s local ... (16 actions, 15 running)
ERROR: /root/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/service/gpu/BUILD:1158:23: Compiling xla/service/gpu/cub_sort_kernel.cu.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/service/gpu:cub_sort_kernel_f64) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 100 arguments skipped)
Warning: HIP_PLATFORM=hcc is deprecated. Please use HIP_PLATFORM=amd.
clang++: warning: argument unused during compilation: '-fcuda-flush-denormals-to-zero' [-Wunused-command-line-argument]
Warning: HIP_PLATFORM=hcc is deprecated. Please use HIP_PLATFORM=amd.
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr43 = V_MOV_B32_dpp undef $vgpr43(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr4 = V_MOV_B32_dpp undef $vgpr4(tied-def 0), killed $vgpr3, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr3 = V_MOV_B32_dpp undef $vgpr3(tied-def 0), $vgpr2, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr47 = V_MOV_B32_dpp undef $vgpr47(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr44 = V_MOV_B32_dpp undef $vgpr44(tied-def 0), $vgpr43, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr47 = V_MOV_B32_dpp undef $vgpr47(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr44 = V_MOV_B32_dpp undef $vgpr44(tied-def 0), $vgpr43, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr43 = V_MOV_B32_dpp undef $vgpr43(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr47 = V_MOV_B32_dpp undef $vgpr47(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr44 = V_MOV_B32_dpp undef $vgpr44(tied-def 0), $vgpr43, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr47 = V_MOV_B32_dpp undef $vgpr47(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr44 = V_MOV_B32_dpp undef $vgpr44(tied-def 0), $vgpr43, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1036.
Target //xla/extension:xla_extension failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 150.258s, Critical Path: 42.81s
INFO: 1972 processes: 451 internal, 1521 local.
FAILED: Build did NOT complete successfully
make: *** [Makefile:26: /build/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

Then I changed HIP_PLATFORM as indicated in the warning, and the build progresses a bit further, until:

[3,920 / 6,478] Compiling xla/mlir_hlo/mhlo/transforms/legalize_to_linalg/legalize_to_linalg.cc; 17s local ... (16 actions, 15 running)
ERROR: /root/.cache/xla_extension/xla-771e38178340cbaaef8ff20f44da5407c15092cb/xla/service/gpu/BUILD:1158:23: Compiling xla/service/gpu/cub_sort_kernel.cu.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/service/gpu:cub_sort_kernel_f32) external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 100 arguments skipped)
clang++: warning: argument unused during compilation: '-fcuda-flush-denormals-to-zero' [-Wunused-command-line-argument]
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr42 = V_MOV_B32_dpp undef $vgpr42(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr4 = V_MOV_B32_dpp undef $vgpr4(tied-def 0), killed $vgpr3, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr3 = V_MOV_B32_dpp undef $vgpr3(tied-def 0), $vgpr2, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr105 = V_MOV_B32_dpp undef $vgpr105(tied-def 0), $vgpr103, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr102 = V_MOV_B32_dpp undef $vgpr102(tied-def 0), $vgpr100, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr105 = V_MOV_B32_dpp undef $vgpr105(tied-def 0), $vgpr103, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr102 = V_MOV_B32_dpp undef $vgpr102(tied-def 0), $vgpr100, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr42 = V_MOV_B32_dpp undef $vgpr42(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr105 = V_MOV_B32_dpp undef $vgpr105(tied-def 0), $vgpr103, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr102 = V_MOV_B32_dpp undef $vgpr102(tied-def 0), $vgpr100, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr105 = V_MOV_B32_dpp undef $vgpr105(tied-def 0), $vgpr103, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr102 = V_MOV_B32_dpp undef $vgpr102(tied-def 0), $vgpr100, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1036.
[3,925 / 6,478] Compiling xla/mlir_hlo/mhlo/transforms/legalize_to_linalg/legalize_to_linalg.cc; 18s local ... (15 actions, 14 running)
Target //xla/extension:xla_extension failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 597.807s, Critical Path: 85.17s
INFO: 3940 processes: 454 internal, 3486 local.
FAILED: Build did NOT complete successfully
make: *** [Makefile:26: /build/0.6.0/cache/build/xla_extension-x86_64-linux-gnu-rocm.tar.gz] Error 1
** (Mix) Could not compile with "make" (exit status: 2).
You need to have gcc and make installed. If you are using
Ubuntu or any other Debian-based system, install the packages
"build-essential". Also install "erlang-dev" package if not
included in your Erlang/OTP version. If you're on Fedora, run
"dnf group install 'Development Tools'".

jonatanklosko commented 1 month ago

@jalberto ah yeah, the first error is because I removed the file and forgot to update the Dockerfile; I've just fixed it on main. The build error is very confusing. I suspected the base image may have changed, but it hasn't. I can't think of anything else that could've changed since I built using that image :<
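
To make the two failing logs above easier to compare, the compiler output can be condensed into the distinct error messages and the GPU target clang was compiling for. The following is only a sketch: `summarize_build_log` and `SAMPLE` are hypothetical names introduced here, and the parsing assumes the line shapes seen in the logs in this thread (`error: ...` lines and the `N errors generated when compiling for <target>.` summary).

```python
import re
from collections import Counter

# Matches clang's per-target summary line as it appears in the logs above,
# e.g. "12 errors generated when compiling for gfx1036."
SUMMARY_RE = re.compile(r"^(\d+) errors? generated when compiling for (\w+)\.$")


def summarize_build_log(text):
    """Return (Counter of distinct error messages, {target: last reported count})."""
    errors = Counter()
    targets = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Only lowercase "error: " lines are compiler diagnostics; Bazel's own
        # "ERROR: ..." lines are deliberately not counted here.
        if line.startswith("error: "):
            errors[line[len("error: "):]] += 1
            continue
        match = SUMMARY_RE.match(line)
        if match:
            targets[match.group(2)] = int(match.group(1))
    return errors, targets


# A short excerpt copied from the log above, used only as a demonstration.
SAMPLE = """\
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr43 = V_MOV_B32_dpp undef $vgpr43(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1036.
"""

errors, targets = summarize_build_log(SAMPLE)
for message, count in errors.items():
    print(count, message)
print(targets)
```

Run against the full logs, this should show whether every diagnostic is the same DPP broadcast error and whether the only target involved is gfx1036, which is the detail that differs from a build on a machine without that GPU.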

jonatanklosko commented 1 month ago

@jalberto I've just run build/build.sh rocm on a fresh AWS amd64 instance with Ubuntu 20.04 and it ran without failure. I'm wondering if the issue could be that you are building on a machine with an actual GPU and the build somehow runs additional logic/checks because of that, but I'm really just guessing.

jalberto commented 1 month ago

@jonatanklosko that could be, but I am not mounting any device, so the container has no access to /dev/dri

I will keep experimenting; maybe it is my system, but the main reason to use containers for the build is to isolate it from the host, so this is very odd.
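
One way to double-check that no GPU device node leaks into the build container is a small check like the one below, run inside the container. It is a sketch under assumptions: `visible_gpu_nodes` is a hypothetical helper, and the default paths are the device nodes AMD GPUs are normally exposed through on Linux (`/dev/kfd` for ROCm compute, `/dev/dri` for render nodes). A negative result only shows the nodes are absent; it does not prove the build cannot pick up the host GPU architecture some other way.

```python
import os

# Device nodes through which AMD GPUs are normally exposed on Linux:
# /dev/kfd is the ROCm compute interface, /dev/dri holds the DRM render nodes.
DEFAULT_NODES = ("/dev/kfd", "/dev/dri")


def visible_gpu_nodes(paths=DEFAULT_NODES):
    """Map each path to True if it exists in the current filesystem."""
    return {path: os.path.exists(path) for path in paths}


# Prints a dict such as {"/dev/kfd": ..., "/dev/dri": ...}; the values depend
# on the environment this is run in.
print(visible_gpu_nodes())
```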

jonatanklosko commented 1 month ago

@Eiji7 you can try the new release and use ROCm 6.0, see https://github.com/elixir-nx/xla/issues/82#issuecomment-2124230058.

Eiji7 commented 1 month ago

@jonatanklosko Oh, that's definitely interesting, however I would need to wait for Gentoo maintainers first since version 6 is masked because of runtime issues, see:

# Patrick Lauer patrick@gentoo.org (2023-12-23)
# ROCm-6 builds but has runtime issues for me

Source: gentoo/gentoo@563b5ab

jonatanklosko commented 1 month ago

Yeah, it looks like the latest XLA requires 6.0+, so I think this ship has sailed on this side.

I don't think there's anything else we can do for 5.7, so I'm going to close this in favour of #82. Feel free to drop more comments if anything changes!

polvalente commented 1 month ago

For what it's worth, IREE might be able to provide a way out. We're focusing on Metal support, but we just might get ROCm "for free".