Open kuhar opened 7 months ago
is it a complete install? my windows SDK has it:
Yes, I even used the cursed amdgpu-pro installer.
➜ ls /opt/rocm/bin
amdclang amdclang-cpp hipcc hipcc_cmake_linker_helper hipconfig.pl hipdemangleatp hipfc hipvars.pm roc-obj-extract rocm_agent_enumerator
amdclang++ amdflang hipcc.bin hipconfig hipconvertinplace-perl.sh hipexamine-perl.sh hipify-clang offload-arch roc-obj-ls rocminfo
amdclang-cl amdlld hipcc.pl hipconfig.bin hipconvertinplace.sh hipexamine.sh hipify-perl roc-obj rocm-smi
I think @sogartar faced something similar? Can you try with my build script here https://gist.github.com/raikonenfnu/7d2843107929b161b12e56c057e8735d to see if the issue persists?
@raikonenfnu can you first confirm where the clang-offload-bundler binary should be? Do you have it under /opt/rocm like Ben or installed system-wide?
We may need to check for this during the cmake configuration step.
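A rough shell sketch of what such a check might look for, assuming the tool sits under <root>/llvm/bin or <root>/bin, with PATH as a fallback. find_rocm_tool is a hypothetical helper name, not part of IREE, and a real configure-time check would presumably be written in CMake rather than shell:

```shell
# Hypothetical helper (not an IREE API): print the first location of a tool
# under a ROCm root, falling back to whatever is on PATH.
# Returns non-zero and prints nothing if the tool cannot be found.
find_rocm_tool() {
  tool="$1"
  root="$2"
  for dir in "$root/llvm/bin" "$root/bin"; do
    if [ -x "$dir/$tool" ]; then
      printf '%s\n' "$dir/$tool"
      return 0
    fi
  done
  command -v "$tool"
}
# Example: find_rocm_tool clang-offload-bundler /opt/rocm
```

Searching the ROCm root before PATH is a design choice here: it keeps the answer tied to the installation being configured rather than to whatever happens to be installed system-wide.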
I only have it on /opt/rocm/llvm/bin/
not system wide. IIRC the clang commands to generate the bitcode should not need clang-offload-bundler at all.
I also do not have clang-offload-bundler on my env and was able to compile.
Oh wait you are talking about macrokernel not microkernel, so my previous assumption/comments might be correct here. The previous comments were more about microkernel. I need to check a bit more about samples macrokernel.
I think it may be the --rocm-path option? I was able to compile hsaco/co with https://github.com/raikonenfnu/macroHipKernel/blob/main/generate_hsaco.sh#L2-L4
Perhaps missing a -nogpulib option?
@kuhar I was able to repro your issue on my system as well. But if I specify export IREE_ROCM_PATH=/opt/rocm, then my error would be:
(EDIT: Deleted log from using -nogpulib)
(EDIT: this one actually works if we point to where clang-offload-bundler lives, which is /opt/rocm/llvm/bin)
Seems like if we append rocm llvm path for this it will compile OK:
PATH=$PATH:/opt/rocm/llvm/bin /home/stanley/nod/iree-build-notrace/llvm-project/bin/clang-19 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/opt/rocm -fuse-cuid=none -O3 /home/stanley/nod/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/stanley/nod/iree-build-notrace/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
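For anyone applying the same workaround in a shell profile, here is a small sketch that appends the directory only once, so re-sourcing the profile does not keep growing PATH. path_append is a hypothetical helper name, and /opt/rocm/llvm/bin is simply the location reported in this thread:

```shell
# Hypothetical helper: append a directory to PATH only if it is not already
# present, so repeated calls leave PATH unchanged.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;     # append at the end, keeping system tools first
  esac
}
# Example: path_append /opt/rocm/llvm/bin && export PATH
```

Appending rather than prepending matches the command above and keeps the system toolchain ahead of the ROCm-bundled LLVM tools.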
Thanks, with export PATH="$PATH:/opt/rocm/llvm/bin" set, it makes more progress and then errors out with:
➜ ninja all
[0/2] Re-checking globbed directories...
[57/332] Generating rocm_executable_cache_test.bin from executable_cache_test.mlir
FAILED: runtime/plugins/hal/drivers/rocm/cts/rocm_executable_cache_test.bin /home/jakub/iree/build/relass/runtime/plugins/hal/drivers/rocm/cts/rocm_executable_cache_test.bin
cd /home/jakub/iree/build/relass/runtime/plugins/hal/drivers/rocm/cts && /home/jakub/iree/build/relass/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --compile-mode=hal-executable --iree-hal-target-backends=rocm --iree-rocm-target-chip=gfx908 /home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir -o rocm_executable_cache_test.bin --iree-hal-executable-object-search-path=\"/home/jakub/iree/build/relass\"
/home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir:15:1: error: cannot find ROCM bitcode files. Check your installation consistency and in the worst case, set --iree-rocm-bc-dir= to a path on your system.
hal.executable.source public @executable {
^
/home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir:15:1: error: failed to serialize executable for target backend rocm
hal.executable.source public @executable {
^
/home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir:15:1: error: failed to serialize executables
hal.executable.source public @executable {
^
[58/332] Generating rocm_command_buffer_dispatch_test.bin from command_buffer_dispatch_test.mlir
I set IREE_ROCM_PATH both as the cmake variable and as an exported env var. What am I missing @raikonenfnu?
Separately from solving this, why do we even build this test data in the all target? I'd assume it should only be a dependency for iree-test-deps, no?
OK it does work after switching from the rocm installation from the amdgpu-pro installer to https://github.com/nod-ai/TheRock/releases/tag/nightly-staging-20240328.41, setting -DIREE_ROCM_PATH, and doing a clean build.
The last remaining issue is the following error:
➜ ninja iree-test-deps
[0/2] Re-checking globbed directories...
[1266/1266] Generating kernels_gfx1100.co
FAILED: samples/custom_dispatch/hip/kernels/kernels_gfx1100.co /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
cd /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels && /home/jakub/iree/build/relass/llvm-project/bin/clang-19 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/home/jakub/bin/therock -fuse-cuid=none -O3 /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
In file included from /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu:7:
In file included from /home/jakub/bin/therock/include/hip/hip_runtime.h:62:
In file included from /home/jakub/bin/therock/include/hip/amd_detail/amd_hip_runtime.h:432:
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:194:27: error: use of undeclared identifier 'max'; did you mean 'fmax'?
194 | double __logbw = _LOGBd(_fmaxd(_ABSd(__c), _ABSd(__d)));
| ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:45:16: note: expanded from macro '_fmaxd'
45 | #define _fmaxd max
| ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_math_forward_declares.h:73:19: note: 'fmax' declared here
73 | __DEVICE__ double fmax(double, double);
| ^
In file included from /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu:7:
In file included from /home/jakub/bin/therock/include/hip/hip_runtime.h:62:
In file included from /home/jakub/bin/therock/include/hip/amd_detail/amd_hip_runtime.h:432:
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:227:26: error: use of undeclared identifier 'max'; did you mean 'fmax'?
227 | float __logbw = _LOGBf(_fmaxf(_ABSf(__c), _ABSf(__d)));
| ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:46:16: note: expanded from macro '_fmaxf'
46 | #define _fmaxf max
| ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_math_forward_declares.h:74:18: note: 'fmax' declared here
74 | __DEVICE__ float fmax(float, float);
| ^
2 errors generated when compiling for gfx1100.
ninja: build stopped: subcommand failed
@raikonenfnu @antiagainst should we disable these rocm kernels and make them experimental? They don't seem to work out of the box on a typical linux installation but are included in the main ninja targets all (sic!) and iree-test-deps.
Ping. This still doesn't build for me. After manually patching the cuda kernel, I'm hitting an issue with another tool missing from path:
➜ ninja all iree-test-deps
[0/2] Re-checking globbed directories...
[638/2136] Generating kernels_gfx1100.co
FAILED: samples/custom_dispatch/hip/kernels/kernels_gfx1100.co /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
cd /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels && /home/jakub/iree/build/relass/llvm-project/bin/clang-19 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/home/jakub/bin/therock -fuse-cuid=none -O3 /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
/home/jakub/bin/therock/bin/clang-offload-bundler: error: unable to find 'llvm-objcopy' in path
clang-19: error: amdgcn-link command failed with exit code 1 (use -v to see invocation)
[641/2136] Building CXX object tracy/CMakeFiles/IREETracyProfiler.dir/__/__/__/third_party/tracy/profiler/src/main.cpp.o
ninja: build stopped: subcommand failed.
Seems like this needs a very specific system-wide installation.
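Until the build checks for this itself, a quick preflight can at least name the tools that are not resolvable before ninja gets that far. This is only a sketch: missing_tools is a hypothetical helper, and the two tool names in the example are just the ones that came up as missing in this thread, not a complete list of what the build needs:

```shell
# Hypothetical preflight: print each named tool that cannot be resolved on
# PATH. Returns non-zero if at least one tool is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool"
      status=1
    fi
  done
  return "$status"
}
# Example: missing_tools clang-offload-bundler llvm-objcopy
```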
I'm hitting the same issue..
I tried -DIREE_ROCM_PATH=/opt/rocm/llvm/bin, but the cmake output says that the hip runtime cannot be found:
-- hip runtime cannot be found in /opt/rocm/llvm/bin.
Please try setting IREE_ROCM_PATH to rocm directory.
Ukernels will not be compiled.
I thought that was fine, so I went with export PATH="$PATH:/opt/rocm/llvm/bin". Then it still complains: cannot find ROCM bitcode files. Check your installation consistency and in the worst case, set --iree-rocm-bc-dir= to a path on your system.
Then I tried the -DIREE_ROCM_PATH=/opt/rocm config. The hip runtime issue is gone from the cmake output. Without setting the env var, the Executable "clang-offload-bundler" doesn't exist! error showed up.
If I go with the env config (i.e., export PATH="$PATH:/opt/rocm/llvm/bin"), it starts complaining error: cannot find ROCM bitcode files again.
@kuhar @raikonenfnu What is the actual cmake flag and env var that you're using?
the rocm path should be to rocm - like /opt/rocm/ - not the llvm bin dir (that may still not work, but I'm pretty sure trying to specify llvm/bin/ won't work)
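A quick way to tell the two kinds of directory apart, sketched under one assumption: a ROCm root contains include/hip/hip_runtime.h (that relative path appears in the include chain of the build log earlier in this thread). What IREE's CMake actually probes for is not shown here, and is_rocm_root is a hypothetical name:

```shell
# Hypothetical sanity check: treat a directory as a ROCm root only if it
# contains the HIP runtime header. A toolchain bin directory such as
# <root>/llvm/bin does not, so it is rejected.
is_rocm_root() {
  [ -f "$1/include/hip/hip_runtime.h" ]
}
# Example: is_rocm_root /opt/rocm && echo "looks like a ROCm root"
```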
The cmake flag seems to be off. I explicitly return /opt/rocm in the code, like
After doing it, it complains: AMD bitcode module is required by this module but was not found at /opt/rocm/ocml.bc. I found that the file is located at /opt/rocm/lib/llvm/lib/clang/17/lib/amdgcn/bitcode/ocml.bc. So I created a symbolic link (i.e., /opt/rocm/ocml.bc) pointing to it. Then it compiles. However, I don't see any e2e tests related to rocm. There is no hip or rocm in the log of ctest -R tests/e2e. Are we able to run e2e tests for the rocm backend?
It looks like we only test compilation but not execution for rocm backend? @ScottTodd is my understanding correct?
The "rocm" driver is experimental and scheduled to be deleted. The "hip" driver is stable and is tested.
I see, I can run tests now! Thanks for the pointer!
❯ ctest -R tests/e2e/stablehlo_ops/check_hip
Test project /home/nod/iree/build
Start 1672: iree/tests/e2e/stablehlo_ops/check_hip_stream_abs.mlir
1/61 Test #1672: iree/tests/e2e/stablehlo_ops/check_hip_stream_abs.mlir ..................... Passed 0.96 sec
Start 1673: iree/tests/e2e/stablehlo_ops/check_hip_stream_add.mlir
2/61 Test #1673: iree/tests/e2e/stablehlo_ops/check_hip_stream_add.mlir ..................... Passed 0.25 sec
Start 1674: iree/tests/e2e/stablehlo_ops/check_hip_stream_batch_norm_inference.mlir
3/61 Test #1674: iree/tests/e2e/stablehlo_ops/check_hip_stream_batch_norm_inference.mlir .... Passed 0.23 sec
Start 1675: iree/tests/e2e/stablehlo_ops/check_hip_stream_bitcast_convert.mlir
4/61 Test #1675: iree/tests/e2e/stablehlo_ops/check_hip_stream_bitcast_convert.mlir ......... Passed 0.23 sec
Start 1676: iree/tests/e2e/stablehlo_ops/check_hip_stream_broadcast.mlir
5/61 Test #1676: iree/tests/e2e/stablehlo_ops/check_hip_stream_broadcast.mlir ............... Passed 0.24 sec
Start 1677: iree/tests/e2e/stablehlo_ops/check_hip_stream_broadcast_add.mlir
6/61 Test #1677: iree/tests/e2e/stablehlo_ops/check_hip_stream_broadcast_add.mlir ........... Passed 0.23 sec
Start 1678: iree/tests/e2e/stablehlo_ops/check_hip_stream_broadcast_in_dim.mlir
7/61 Test #1678: iree/tests/e2e/stablehlo_ops/check_hip_stream_broadcast_in_dim.mlir ........ Passed 0.25 sec
Start 1679: iree/tests/e2e/stablehlo_ops/check_hip_stream_clamp.mlir
8/61 Test #1679: iree/tests/e2e/stablehlo_ops/check_hip_stream_clamp.mlir ................... Passed 0.28 sec
Start 1680: iree/tests/e2e/stablehlo_ops/check_hip_stream_compare.mlir
9/61 Test #1680: iree/tests/e2e/stablehlo_ops/check_hip_stream_compare.mlir ................. Passed 0.56 sec
Start 1681: iree/tests/e2e/stablehlo_ops/check_hip_stream_complex.mlir
10/61 Test #1681: iree/tests/e2e/stablehlo_ops/check_hip_stream_complex.mlir ................. Passed 0.25 sec
Start 1682: iree/tests/e2e/stablehlo_ops/check_hip_stream_concatenate.mlir
11/61 Test #1682: iree/tests/e2e/stablehlo_ops/check_hip_stream_concatenate.mlir ............. Passed 0.26 sec
Start 1683: iree/tests/e2e/stablehlo_ops/check_hip_stream_constant.mlir
12/61 Test #1683: iree/tests/e2e/stablehlo_ops/check_hip_stream_constant.mlir ................ Passed 0.23 sec
Start 1684: iree/tests/e2e/stablehlo_ops/check_hip_stream_convert.mlir
13/61 Test #1684: iree/tests/e2e/stablehlo_ops/check_hip_stream_convert.mlir ................. Passed 0.34 sec
Start 1685: iree/tests/e2e/stablehlo_ops/check_hip_stream_convolution.mlir
@hanhanW how did you fix this?
I don't have a clean way. It needs a local patch. What I did is:
sudo apt install rocm-llvm-dev
For the IREE side, put return /opt/rocm directly in
Then I hit an error about missing ocml.bc. After running locate ocml.bc (or cd /opt/rocm/; find . | grep -i ocml.bc), I found the location of the file. And I just added a symbolic link for ocml.bc. E.g.,
sudo -s
cd /opt/rocm
ln -s ./lib/llvm/lib/clang/17/lib/amdgcn/bitcode/ocml.bc ocml.bc
Then you should be able to compile and run tests, e.g., ctest -R tests/e2e/stablehlo_ops/check_hip.
If the cmake flag is fixed, I will no longer need the local patch I guess.
update: we also need export PATH="$PATH:/opt/rocm/llvm/bin" to make it work.
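Since the clang version in that bitcode path (17 here) changes between ROCm releases, the lookup can be scripted instead of hardcoded. A minimal sketch, assuming ocml.bc exists as a regular file somewhere under the ROCm root; find_rocm_bitcode_dir is a hypothetical helper name:

```shell
# Hypothetical helper: print the directory that holds ocml.bc under a ROCm
# root. -H follows the root itself if it is a symlink, and -type f skips a
# manually created ocml.bc symlink like the one above.
# Returns non-zero if no bitcode file is found.
find_rocm_bitcode_dir() {
  root="$1"
  bc=$(find -H "$root" -type f -name ocml.bc 2>/dev/null | head -n 1)
  [ -n "$bc" ] || return 1
  dirname "$bc"
}
# Example: find_rocm_bitcode_dir /opt/rocm
```

The printed directory could then be tried with the --iree-rocm-bc-dir= flag that the compiler error message suggests, as an alternative to the symlink. Whether that flag alone resolves the error on this setup was not confirmed in the thread.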
Error:
My rocm installation is under /opt/rocm, the version is 5.7.1.