Closed riaqn closed 2 years ago
If you are on Ubuntu, you can use apt to compare the version information of the rocblas packages.
Note the 8328dcce~dirty part in the output below. It appears because this package was compiled locally with a patch that has not been committed.
work@ae8ab6747fa6:/opt/rocm/rocblas$ apt search rocblas
Sorting... Done
Full Text Search... Done
rocblas/Ubuntu 2.39.0.40301-59 amd64 [upgradable from: 2.39.0-8328dcce~dirty]
rocBLAS is AMD's library for BLAS on ROCm. It is implemented in HIP and optimized for AMD GPUs.
rocblas4.3.1/Ubuntu 2.39.0.40301-59 amd64
rocBLAS is AMD's library for BLAS on ROCm. It is implemented in HIP and optimized for AMD GPUs.
I am not familiar with Arch, but maybe you can find a way to compare the versions of the rocblas packages there.
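The comparison above can also be scripted. The following is a minimal sketch, not something from the thread itself: it splits a Debian-style version string such as the two shown in the apt output and flags a locally built package. It assumes that a `~dirty` suffix in the revision marks a build from an uncommitted tree, as in the output above. The function names are hypothetical; `dpkg-query -W -f='${Version}'` is the standard dpkg way to print an installed package's version.

```python
import subprocess


def split_version(version):
    """Split 'UPSTREAM-REVISION' at the last hyphen (Debian convention)."""
    upstream, sep, revision = version.rpartition("-")
    if not sep:  # no hyphen: the whole string is the upstream version
        return version, ""
    return upstream, revision


def is_local_build(version):
    """Assumption: a '~dirty' revision suffix marks an uncommitted local build."""
    _, revision = split_version(version)
    return revision.endswith("~dirty")


def installed_version(package="rocblas"):
    """Ask dpkg for the installed version (Debian/Ubuntu only)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Version}", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

On Ubuntu, `is_local_build(installed_version())` would report whether the installed rocblas is the patched local build. On Arch, `pacman -Q` with the package name prints the installed version, though the package name and version format there may differ.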
BTW, I uploaded rocblas-2.39 and pytorch-1.9 builds for gfx803. You can give them a try: https://github.com/xuhuisheng/rocm-gfx803/releases/tag/rocm43
There are some differences between the official rocblas and the gfx803-patched rocblas.
Go to the directory /opt/rocm-4.3.1/rocblas/lib/library
The official rocblas ships more files there because it supports multiple GPUs.
-rw-r--r-- 1 root root 22036088 Aug 21 17:51 Kernels.so-000-gfx1010.hsaco
-rw-r--r-- 1 root root 21278328 Aug 21 17:51 Kernels.so-000-gfx1011.hsaco
-rw-r--r-- 1 root root 21278328 Aug 21 17:51 Kernels.so-000-gfx1012.hsaco
-rw-r--r-- 1 root root 20883320 Aug 21 17:51 Kernels.so-000-gfx1030.hsaco
-rw-r--r-- 1 root root 21766128 Aug 21 17:51 Kernels.so-000-gfx803.hsaco
-rw-r--r-- 1 root root 22330912 Aug 21 17:51 Kernels.so-000-gfx900.hsaco
-rw-r--r-- 1 root root 20614864 Aug 21 17:51 Kernels.so-000-gfx906-xnack-.hsaco
-rw-r--r-- 1 root root 20592048 Aug 21 17:51 Kernels.so-000-gfx908-xnack-.hsaco
-rw-r--r-- 1 root root 20716072 Aug 21 17:51 Kernels.so-000-gfx90a-xnack+.hsaco
-rw-r--r-- 1 root root 20703784 Aug 21 17:51 Kernels.so-000-gfx90a-xnack-.hsaco
-rw-r--r-- 1 root root 230109962 Aug 21 17:51 TensileLibrary.dat
-rw-r--r-- 1 root root 112401360 Aug 21 17:51 TensileLibrary_gfx1030.co
-rw-r--r-- 1 root root 3875552 Aug 21 17:51 TensileLibrary_gfx803.co
-rw-r--r-- 1 root root 49228184 Aug 21 17:51 TensileLibrary_gfx900.co
-rw-r--r-- 1 root root 102949336 Aug 21 17:51 TensileLibrary_gfx906.co
-rw-r--r-- 1 root root 304173904 Aug 21 17:51 TensileLibrary_gfx908.co
-rw-r--r-- 1 root root 233813920 Aug 21 17:51 TensileLibrary_gfx90a.co
-rw-r--r-- 1 root root 1349 Aug 21 16:18 TensileManifest.txt
The gfx803-patched rocblas has only the gfx803-related fatbin files.
-rw-r--r-- 1 root root 7722904 May 26 13:49 Kernels.so-000-gfx803.hsaco
-rw-r--r-- 1 root root 3942507 May 26 13:49 TensileLibrary.yaml
-rw-r--r-- 1 root root 152 May 26 13:44 TensileManifest.txt
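To tell the two builds apart without reading the listings by eye, one option is to collect the GPU target names that appear in the file names. This is a rough sketch under the assumption that targets always appear as `gfx` followed by hexadecimal digits, as in the listings above. The helper names are hypothetical and are not part of rocBLAS or Tensile.

```python
import os
import re

# Matches GPU target names such as gfx803, gfx90a, gfx1030 inside a file name.
_GFX = re.compile(r"gfx[0-9a-f]+")


def architectures(file_names):
    """Return the sorted, de-duplicated GPU targets named in the given files."""
    found = set()
    for name in file_names:
        found.update(_GFX.findall(name))
    return sorted(found)


def architectures_in_dir(path):
    """Same as architectures(), applied to the entries of a directory."""
    return architectures(os.listdir(path))
```

For example, `architectures_in_dir("/opt/rocm-4.3.1/rocblas/lib/library")` should list several targets for the official package and only gfx803 for the patched one.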
Never mind! It turns out my own code was wrong. The official quick start program (https://www.tensorflow.org/tutorials/quickstart/beginner) runs successfully! Thank you very much!
As a side note: is deleting library/src/blas3/Tensile/Logic/asm_full/r9nano_*.yaml necessary?
You can try gfx803 without the patch. With ROCm-4.3.1 on text classification, I got a memory access error.
It has been 2 months since the last post, so I will close this issue. Please reopen it if there are any updates.
Environment
What is the expected behavior
tensorflow should work as expected
What actually happens
NaN loss when using tensorflow
How to reproduce
library/src/blas3/Tensile/Logic/asm_full/r9nano_*.yaml