JuliaGPU / AMDGPU.jl

AMD GPU (ROCm) programming in Julia

rocWMMA support #560

Open radudiaconu0 opened 10 months ago

radudiaconu0 commented 10 months ago

Could you implement rocWMMA support for use with Navi 3 GPUs? From what I understood, it uses the AI accelerators present in them for faster matrix multiplication. I guess this could make them faster for DL tasks.

https://github.com/ROCmSoftwarePlatform/rocWMMA

vchuravy commented 10 months ago

One could, but it would probably require quite a bit of work by someone.

CUDA.jl WMMA support was the work of a full-time master's student.

radudiaconu0 commented 10 months ago

> One could, but it would probably require quite a bit of work by someone.
>
> CUDA.jl WMMA support was the work of a full-time master's student.

Well, it has to be done at some point. Why should only NVIDIA get all the goodies? :P

pxl-th commented 10 months ago

For matrix multiplication we are already using rocBLAS, so adding WMMA support won't affect its performance.
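A minimal sketch of what this means in practice (assuming a working ROCm install and an AMD GPU; the matrix sizes are arbitrary examples): plain `*` on `ROCArray`s already dispatches to rocBLAS `gemm`, so a user writing ordinary Julia linear algebra gets the tuned BLAS path without any WMMA-level code.

```julia
using AMDGPU  # AMDGPU.jl wraps rocBLAS for BLAS-level operations

# Upload two Float32 matrices to the GPU.
A = ROCArray(rand(Float32, 256, 512))
B = ROCArray(rand(Float32, 512, 128))

# The `*` method for ROCArrays routes through rocBLAS gemm,
# so this already uses AMD's tuned matmul kernels.
C = A * B

# Copy the result back to the host for inspection.
C_host = Array(C)
@assert size(C_host) == (256, 128)
```

rocWMMA, by contrast, would expose the matrix cores inside custom kernels, which is a separate (and much larger) piece of work than the rocBLAS-backed path shown here.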

pxl-th commented 10 months ago

And at the moment, matrix multiplication is not the bottleneck in DL applications with AMDGPU; timely memory freeing is.

vchuravy commented 10 months ago

> Well, it has to be done at some point. Why should only NVIDIA get all the goodies? :P

Are you volunteering?

radudiaconu0 commented 10 months ago

> > Well, it has to be done at some point. Why should only NVIDIA get all the goodies? :P
>
> Are you volunteering?

I would like to try.