ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocBLAS/en/latest/

[Bug]: Failed to build rocblas-5.6.0 with Tensile from source #1356

Closed. JiaJiDuan closed this issue 11 months ago

JiaJiDuan commented 11 months ago

Describe the bug

When I built rocblas-5.6.0 from source, "TensileCreateLibrary" reported the following error:

Reading logic files: Launching 24 threads for 108 tasks...
Reading logic files: Done.
[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||] 100% (0.3 secs elapsed)
Writing Custom CMake
Writing Kernels...
Generating kernels: Launching 24 threads...
Generating kernels: Done.
* Compiling source kernels: Launching 24 threads...
Compiling source kernels: Done.
Kernel Building elapsed time = 0.8 secs
Traceback (most recent call last):
  File "/home/marco/workspace/rocBLAS/build_local_tensile/virtualenv/lib/python3.8/site-packages/Tensile/bin/TensileCreateLibrary", line 43, in <module>
    TensileCreateLibrary()
  File "/home/marco/workspace/rocBLAS/build_local_tensile/virtualenv/lib/python3.8/site-packages/Tensile/TensileCreateLibrary.py", line 1400, in TensileCreateLibrary
    theMasterLibrary = list(masterLibraries.values())[0]
IndexError: list index out of range

To Reproduce

Precise version of rocBLAS installed or rocBLAS commit hash if building from source. Steps to reproduce the behavior:

In the rocblas-5.6.0 source tree:

  1. cmake -S . -B build_local_tensile -G Ninja -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_C_COMPILER=hipcc -DCMAKE_Fortran_COMPILER=gfortran -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$ROCM_PATH -DCMAKE_PREFIX_PATH=$ROCM_PATH -DAMDGPU_TARGETS=gfx1031 -DTensile_TEST_LOCAL_PATH=/home/marco/workspace/Tensile-rocm-5.6.0
  2. cmake --build build_local_tensile

The failing command is:

cd /home/marco/workspace/rocBLAS/build_local_tensile/library/src && /home/marco/workspace/rocBLAS/build_local_tensile/virtualenv/lib/python3.8/site-packages/Tensile/bin/TensileCreateLibrary --merge-files --separate-architectures --lazy-library-loading --no-short-file-names --no-library-print-debug --code-object-version=default --cxx-compiler=hipcc --library-format=msgpack --architecture=gfx1031 /home/marco/workspace/rocBLAS/library/src/blas3/Tensile/Logic/asm_full /home/marco/workspace/rocBLAS/build_local_tensile/Tensile HIP

See make.log for the details.

Expected behavior

Building rocblas-5.6.0 with Tensile succeeds.

Log-files

Add full logfiles to help explain your problem. make.log

Environment

Hardware description:
CPU: AMD Ryzen 9 5900X 12-Core Processor
GPU: AMD Radeon RX 6700 XT

Make sure that ROCm is correctly installed and to capture detailed environment information run the following command:

printf '=== environment\n' > environment.txt &&
printf '\n\n=== date\n' >> environment.txt && date >> environment.txt &&
printf '\n\n=== Linux Kernel\n' >> environment.txt && uname -a  >> environment.txt &&
printf '\n\n=== rocm-smi' >> environment.txt && rocm-smi  >> environment.txt &&
printf '\n\n' >> environment.txt && hipconfig  >> environment.txt &&
printf '\n\n=== rocminfo\n' >> environment.txt && rocminfo  >> environment.txt &&
printf '\n\n=== lspci VGA\n' >> environment.txt && lspci | grep -i vga >> environment.txt

Attached: environment.txt

Additional context

I suspect the problem is in the function generateLogicDataAndSolutions. In this function, my masterLibraries.keys() contains only fallback, so the final return value ends up None or empty. I don't know whether this is the expected behavior or how to fix it.
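
A minimal Python sketch of how a masterLibraries that contains only fallback could lead to the IndexError in the traceback. It assumes, for illustration only, that the fallback entry is folded into each architecture-specific entry and then dropped. merge_fallback and first_library are hypothetical names, not Tensile's actual functions, and plain lists stand in for the real library objects.

```python
def merge_fallback(master_libraries):
    """Fold the 'fallback' entry into every other entry, then drop it.

    ASSUMPTION: illustrative stand-in, not Tensile's real implementation.
    """
    merged = {arch: list(kernels) for arch, kernels in master_libraries.items()}
    fallback = merged.pop("fallback", None)
    if fallback is not None:
        for kernels in merged.values():
            kernels.extend(fallback)
    return merged


def first_library(master_libraries):
    """Same expression as the failing line in the traceback."""
    return list(master_libraries.values())[0]


# With an architecture-specific entry, the fallback is merged in and a library remains.
with_arch = merge_fallback({"gfx1030": ["kernel_a"], "fallback": ["kernel_f"]})
print(first_library(with_arch))

# With only 'fallback', nothing remains, so indexing the empty list fails.
only_fallback = merge_fallback({"fallback": ["kernel_f"]})
try:
    first_library(only_fallback)
except IndexError as error:
    print("IndexError:", error)
```

Under that assumption, the error would mean that no logic files matched the requested architecture, rather than that kernel compilation itself failed.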

daineAMD commented 11 months ago

Hi @JiaJiDuan,

It looks like you're using a 6700XT which is not supported by ROCm. You can see the official list of supported GPUs in the ROCm Documentation.

That being said, some users have had success using a gfx1031 as you are using. You can browse some discussion in ROCm/Tensile#1936. Essentially a workaround exists by setting the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 to use the gfx1030 instruction set.

Along with this, there is PR ROCm/rocBLAS#1251 which introduces some Tensile kernels to enable gfx1031 in rocBLAS. This PR hasn't been tested and its future remains unclear, but there is some discussion there regarding this topic. Since you're running on Linux you should be fine with using HSA_OVERRIDE_GFX_VERSION=10.3.0 without needing the changes there.
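
As a rough sketch of applying that environment variable to a single process instead of the whole shell session, something like the following Python wrapper could be used. The helper names env_with_gfx_override and run_with_override are hypothetical, and the child command here is only a stand-in that echoes the variable; substitute the actual ROCm application.

```python
import os
import subprocess
import sys


def env_with_gfx_override(version="10.3.0", base_env=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = version
    return env


def run_with_override(command, version="10.3.0"):
    """Run command (a list of arguments) with the override applied."""
    return subprocess.run(command, env=env_with_gfx_override(version), check=True)


if __name__ == "__main__":
    # Stand-in child process that only echoes the variable it received.
    run_with_override(
        [sys.executable, "-c",
         "import os; print(os.environ['HSA_OVERRIDE_GFX_VERSION'])"]
    )
```

Copying the environment rather than modifying os.environ keeps the override scoped to the launched process.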

Hope this helps, Daine

JiaJiDuan commented 11 months ago

Hi @daineAMD, thank you very much for your answer.
I will try this method. But the rocFFT I built is already set to -DAMDGPU_TARGETS=gfx1031. If I want to use all math libraries correctly in this way, should I set -DAMDGPU_TARGETS=gfx1030 for all math libraries?

Essentially a workaround exists by setting the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 to use the gfx1030 instruction set.

Looking forward to your reply. Thanks again.

cgmb commented 11 months ago

I will try this method. But the rocFFT I built is already set to -DAMDGPU_TARGETS=gfx1031. If I want to use all math libraries correctly in this way, should I set -DAMDGPU_TARGETS=gfx1030 for all math libraries?

Yes, you'd want to use -DAMDGPU_TARGETS=gfx1030 for every library. I think rocFFT would use run-time compilation to rebuild at runtime for gfx1030 anyway, but specifying -DAMDGPU_TARGETS=gfx1030 would ensure the kernels are all cached.
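
A small sketch of what "the same target for every library" could look like when scripted. It only reuses CMake options that appear in the configure command from the original report; the source and build directory names are hypothetical, and whether each library accepts exactly this option set is an assumption to check against its own build documentation.

```python
import shlex


def configure_command(source_dir, build_dir, target="gfx1030"):
    """Build a cmake configure command line for one library.

    ASSUMPTION: options are copied from the rocBLAS command in this issue.
    """
    return [
        "cmake", "-S", source_dir, "-B", build_dir, "-G", "Ninja",
        "-DCMAKE_CXX_COMPILER=hipcc",
        "-DCMAKE_C_COMPILER=hipcc",
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DAMDGPU_TARGETS={target}",
    ]


# Hypothetical checkout directories; every library gets the same target.
for library in ("rocBLAS", "rocFFT"):
    print(shlex.join(configure_command(library, f"{library}/build")))
```

The printed lines can be reviewed and pasted into a shell, which avoids running anything before the options have been confirmed for each library.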

JiaJiDuan commented 11 months ago

Yes, you'd want to use -DAMDGPU_TARGETS=gfx1030 for every library. I think rocFFT would use run-time compilation to rebuild at runtime for gfx1030 anyway, but specifying -DAMDGPU_TARGETS=gfx1030 would ensure the kernels are all cached.

Thanks for your reply. I have another question. Will there be official support plans for gfx1031 in the future?
I did not see support for the RX 7600 or other RDNA3 architecture GPUs on the Linux platform in the documentation. Will it be supported in the future?

cgmb commented 11 months ago

Will there be official support plans for gfx1031 in the future? I did not see support for RX7600 or other RDNA3 architecture GPUs on the Linux platform in the documentation. Will it be supported in the future?

The official support list is determined centrally for all ROCm libraries. The list is found in https://github.com/RadeonOpenCompute/ROCm, so questions about the official support list are probably best raised there. I'm not sure if you'll get an answer or not, but that's where you'd be most likely to receive one.

If you have any specific technical questions about rocBLAS on the RX 7600 (gfx1102) or gfx1031, we can discuss those in issues on the rocBLAS repo. I'm only directing you elsewhere on those questions because official support is a matter of overall ROCm project policy, not just a technical question that an individual engineer can answer.

JiaJiDuan commented 11 months ago

If you have any specific technical questions about rocBLAS on the RX 7600 (gfx1102) or gfx1031, we can discuss those in issues on the rocBLAS repo. I'm only directing you elsewhere on those questions because official support is a matter of overall ROCm project policy, not just a technical question that an individual engineer can answer.

Thank you for your answer. I will close this issue.