ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocBLAS/en/latest/

TensileLibrary.dat not found #1331

Closed by jinz2014 1 year ago

jinz2014 commented 1 year ago

Install method on Ubuntu 22.04: sudo apt
ROCm Version: 5.5.1
Error: TensileLibrary.dat not found

Could you please provide instructions on how to generate TensileLibrary.dat for gfx101x devices?

In /opt/rocm/lib/rocblas/library/, I find files for gfx90x, gfx1030 and gfx110x devices, plus fallback files.
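To check which architectures an installed rocBLAS ships Tensile files for, the directory listing can be reduced to a list of architecture names. This is a minimal sketch: the function names `list_tensile_archs` and `has_tensile_arch` are hypothetical, and it assumes that architecture names appear in the file names as `gfx` followed by hexadecimal digits.

```shell
# List the GPU architectures named in the files of a rocBLAS library directory.
# Assumption: architectures appear in file names as "gfx" plus hex digits.
# The default directory is the one mentioned in this thread.
list_tensile_archs() {
    ls "${1:-/opt/rocm/lib/rocblas/library}" 2>/dev/null \
        | grep -oE 'gfx[0-9a-f]+' \
        | sort -u
}

# Succeed if the given architecture ($1) has files in the directory ($2).
has_tensile_arch() {
    list_tensile_archs "$2" | grep -x "$1" >/dev/null
}
```

For example, `has_tensile_arch gfx1010 /opt/rocm/lib/rocblas/library || echo "no gfx1010 files"` reports whether any file name in that directory mentions gfx1010.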

Thanks.

jinz2014 commented 1 year ago

Will it cause Segmentation fault (core dumped) when executing a HIP program that calls rocBLAS APIs on a gfx101x device?

cgmb commented 1 year ago

I'm not a rocBLAS developer, but when building rocBLAS from source, you can add gfx1010, gfx1011 or gfx1012 to the AMDGPU_TARGETS CMake option. This is the ideal way to enable support. However, these architectures are not officially supported and are not tested. There were a number of failures in the test suite last time I tried.
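As a rough illustration of the source build, the helper below only prints a CMake configure command with the requested targets; it does not run anything. `AMDGPU_TARGETS` is the option named above. The function name `rocblas_configure_cmd`, the `build` directory, and the compiler path `/opt/rocm/bin/hipcc` are assumptions for a default ROCm installation, and a real build may need further options.

```shell
# Print a CMake configure command for rocBLAS with the given GPU targets.
# Only AMDGPU_TARGETS comes from the thread; the compiler path and the
# build directory are assumptions for a default /opt/rocm installation.
rocblas_configure_cmd() {
    # Join all arguments with ';', the CMake list separator.
    targets=$(IFS=';'; printf '%s' "$*")
    printf 'cmake -S . -B build -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc "-DAMDGPU_TARGETS=%s"\n' "$targets"
}
```

Running `rocblas_configure_cmd gfx1010 gfx1012` from the top of a rocBLAS source checkout prints a command that can be reviewed and then run by hand.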

When using AMD's binary packages for rocBLAS, the library is built for gfx1010 and gfx1012. As far as I know, there is no way to build the necessary Tensile files for gfx1011 individually. While the Tensile error is the first error you encountered, there are also kernels embedded within the rocBLAS shared library itself. You would therefore likely encounter hipErrorNoBinaryForGpu errors (which will lead to a segfault) unless you rebuilt the whole library with gfx1011 enabled.

If you are on a gfx101x device and want to use the pre-built AMD binaries, I would recommend setting the environment variable HSA_OVERRIDE_GFX_VERSION=10.1.0. It is not officially supported, but the gfx1011, gfx1012 and gfx1013 ISAs are all supersets of gfx1010. In theory, it should work. This method has not been tested, but then again, neither has any other method, as RDNA 1 GPUs are not officially supported in rocBLAS.
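One way to apply the override without exporting it for the whole shell session is a small wrapper that sets the variable for a single command. This is a sketch only: the function name `run_as_gfx1010` is hypothetical, and `./my_hip_app` in the usage note stands in for your own program.

```shell
# Run a command with HSA_OVERRIDE_GFX_VERSION=10.1.0 (gfx1010) set only
# for that process. The value is the one suggested in this thread.
run_as_gfx1010() {
    env HSA_OVERRIDE_GFX_VERSION=10.1.0 "$@"
}
```

Usage would look like `run_as_gfx1010 ./my_hip_app`. Because the variable is passed through `env`, it does not leak into the calling shell.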

Whatever method you use, I would strongly recommend running the rocBLAS test suite to validate the library on your chosen architecture.

cgmb commented 1 year ago

> Will it cause Segmentation fault (core dumped) when executing a HIP program that calls rocBLAS APIs on a gfx101x device?

Yes, on gfx1011 and gfx1013. You can probably find more information about your crash by setting the environment variable AMD_LOG_LEVEL=3. For more information, see the HIP Debugging Guide.
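To keep the extra logging for later inspection, the program can be run with the variable set and its stderr redirected to a file. This is a minimal sketch: the function name `run_with_hip_log` is hypothetical, and it assumes the runtime writes its log messages to stderr.

```shell
# Run a command with AMD_LOG_LEVEL=3 and save its stderr to a log file.
# $1 is the log file path; the remaining arguments are the command.
# Assumption: the HIP runtime writes its log output to stderr.
run_with_hip_log() {
    log_file="$1"
    shift
    env AMD_LOG_LEVEL=3 "$@" 2>"$log_file"
}
```

For example, `run_with_hip_log hip.log ./my_hip_app` (with `./my_hip_app` standing in for the crashing program) leaves the log in `hip.log` and returns the program's exit status.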

jinz2014 commented 1 year ago

Thank you for your answers.