ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocBLAS/en/latest/

[Feature]: Tensile logic sources are needed for older GPUs (navi21, navi22, navi23, navi24) for DIY builds #1408

Closed LYC878484 closed 3 weeks ago

LYC878484 commented 3 months ago

Is your feature request related to a problem? Please describe.

I need to build rocBLAS for the AMD 6500 XT, which is Navi 24. However, the path library\src\blas3\Tensile\Logic\asm_full does not contain any logic files for navi24. If I simply use the navi21 files, that may introduce unknown problems.
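One rough way to confirm this in a source checkout is to count the entries under the Logic directory whose names mention a given architecture. This is only a sketch: the function name `count_logic_entries` is made up for illustration, and it assumes the logic files or folders carry the architecture name (for example navi21) somewhere in their names.

```shell
# Count files or directories under a Logic directory whose names mention
# a given architecture. Hypothetical helper, not part of rocBLAS.
count_logic_entries() {
    # $1: Logic directory, $2: architecture name such as navi21 or navi24
    find "$1" -iname "*$2*" 2>/dev/null | wc -l | tr -d ' '
}

# Example, run from the root of a rocBLAS checkout:
# count_logic_entries library/src/blas3/Tensile/Logic/asm_full navi24
```

A result of 0 for navi24 alongside a non-zero result for navi21 would match the situation described above.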

Describe the solution you'd like

So, could the navi2x-related code be committed? Please, please, please, for the children of poor families.

Describe alternatives you've considered

Entry-level cards may not be worth spending much time to maintain, but anyone interested will do the work themselves.

Additional context

Add any other context or screenshots about the feature request here.

Library context

Software version
rocblas v0.0

The table information above can be queried with:

Ubuntu/Debian:
dpkg -s rocblas | grep Version
CentOS/RHEL:
rpm -qa | grep rocblas
SLES:
zypper se -s | grep rocblas

NaveenElumalaiAMD commented 3 months ago

Hi @LYC878484 ,

The AMD 6500 XT is not officially supported by AMD. Please refer to the system requirements: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html

Unofficially, some people work around this by setting the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0. This is not supported by AMD, but some have had success with it.
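As a minimal sketch of that workaround on Linux, the variable can be set for a single command instead of the whole session. The wrapper name `run_with_gfx_override` and the application name in the comment are placeholders, not anything provided by ROCm.

```shell
# Run one command with the override set, leaving the parent shell untouched.
# Hypothetical wrapper; only the variable name and value come from this thread.
run_with_gfx_override() {
    env HSA_OVERRIDE_GFX_VERSION=10.3.0 "$@"
}

# Example (placeholder application name):
# run_with_gfx_override ./my_rocblas_app
```

Scoping the variable to one command keeps it from leaking into other programs started from the same shell.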

Naveen

cgmb commented 3 months ago

> if i simply use navi21 maybe Introducing some unknown problems

Navi 24 is fully tested on Debian. They are just loading Navi 21 code objects on Navi 24 hardware. You can view the test results at https://ci.rocm.debian.net/packages/r/rocblas/
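For context, a version string such as 10.3.0 corresponds to the gfx1030 target (Navi 21), while Navi 24 itself is gfx1034. The sketch below shows that naming relationship. The helper name `gfx_target_from_version` is made up here, and printing the minor and stepping fields in hex is an assumption based on target names such as gfx90a.

```shell
# Convert a "major.minor.stepping" version into a gfx target name.
# Hypothetical helper; runs in a subshell so the IFS change stays local.
gfx_target_from_version() (
    IFS=.
    set -- $1                       # split "10.3.0" into 10, 3 and 0
    printf 'gfx%d%x%x\n' "$1" "$2" "$3"
)

gfx_target_from_version 10.3.0   # prints gfx1030
```

Under this scheme Navi 22 and Navi 23 would be 10.3.1 and 10.3.2, that is gfx1031 and gfx1032.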

The Debian packages are not officially supported by AMD, but you may nevertheless find their test results to be informative.

wfjsw commented 3 months ago

Hey there,

I'm coordinating an effort to build Tensile logics for these old GPUs. Let me know if you'd like to join. Linux is required for the process though the final result can be used on Windows.

cgmb commented 2 months ago

> I'm coordinating an effort to build Tensile logics for these old GPUs. Let me know if you'd like to join. Linux is required for the process though the final result can be used on Windows.

I'm not able to join, but if you achieve better performance than just running the Navi 21 kernels on Navi 22, 23 or 24, please do report back. I'm also interested in tuning for Navi 10, 12 or 14.

wfjsw commented 2 months ago

> I'm not able to join, but if you achieve better performance than just running the Navi 21 kernels on Navi 22, 23 or 24, please do report back. I'm also interested in tuning for Navi 10, 12 or 14.

Currently I have logic files for Navi 22 and 23 on hand, but I have no idea how they perform compared with the Navi 21 kernels, since I don't own such cards.

cgmb commented 2 months ago

I mean if you achieve better performance on Navi 22 or Navi 23 hardware using your custom logic files as compared to using HSA_OVERRIDE_GFX_VERSION=10.3.0.

wfjsw commented 2 months ago

I don't think HSA_OVERRIDE_GFX_VERSION works on Windows. It is still looking for the library file regardless of that variable. See https://github.com/ollama/ollama/issues/3107

cgmb commented 2 months ago

That is true. I'm mostly curious whether your work would be beneficial to users on Linux as well. If the performance is better than just using the override, it might be useful to them too.

NaveenElumalaiAMD commented 3 weeks ago

Hi @LYC878484 , I will go ahead and close this issue. Feel free to reopen this ticket or open a new one if you have any further questions. Thanks.