@2440020096 opened this issue 5 months ago
Thanks @2440020096 for reporting the issue. The team has just completed the Navi31 support, and with this feature request we'll start to look into Navi21 support. We'll update when we have plan to share. cc @joviliast @zhanglx13
Thanks for reporting. I cannot assign this to Illia (due to github id issue) so I assigned myself. Will update here when we have some results.
This is fantastic news that Navi31 support is complete. Is there any documentation or release notes available on this? I'd love to read more about RDNA3 support and see how the remaining components, such as memory-efficient flash attention, can fall into place for RDNA3.
When running 03-matrix-multiply, the performance is much lower than rocBLAS.
Additionally, when trying to run 06-fused-attention, it fails with this error:
Hardware: RX 6800 XT / Navi21 / gfx1030
PyTorch version: 2.4.0.dev20240405+rocm6.0
Triton version: 3.0.0+0a22a91d04, installed via the nightly PyTorch ROCm 6.0 wheel
OS: Ubuntu 22.04.3 LTS
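For anyone else filing a similar report, here is a small stdlib-only sketch for collecting the version details above (the `report_env` helper is hypothetical, not part of Triton or this repro; it degrades gracefully when `torch` or `triton` is not installed):

```python
import importlib
import platform


def report_env():
    """Collect OS and package versions relevant to a Triton-on-ROCm bug report."""
    lines = [f"OS: {platform.platform()}"]
    for mod_name in ("torch", "triton"):
        try:
            mod = importlib.import_module(mod_name)
            lines.append(f"{mod_name}: {getattr(mod, '__version__', 'unknown')}")
        except ImportError:
            lines.append(f"{mod_name}: not installed")
    return "\n".join(lines)


print(report_env())
```

The GPU name and gfx target (e.g. gfx1030) would still need to be read from `rocminfo` or `torch.cuda.get_device_properties(0)` on a machine with a working ROCm install.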
Paging @sunway513