xuhuisheng / rocm-gfx803


Installing rocblas 5.1.1 deb might break system updates or get uninstalled #14

Open durdin85 opened 2 years ago

durdin85 commented 2 years ago

Hi, AMD seems to have updated ROCm 5.1.1 to build 50101 a while ago, so APT no longer considers the dirty version newer and it gets uninstalled with every system update. The manual installation therefore has to be repeated each time.

It is also possible to pin the dirty version, but then all system updates are effectively blocked, as there are unmet dependencies:

Building dependency tree... Done
Reading state information... Done
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
 rocblas-dev : Depends: rocblas (>= 2.43.0.50101) but 2.43.0-490c4140~dirty is installed
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).

Trying the suggested fix results in removal of the entire ROCm installation.

I am not sure how APT handles the dependencies, but probably the easiest way is to increase the rocblas version to something like 2.43.0.99999? By the way, the latest ROCm is now 5.1.3. Would it be possible to bump the version to that one? Maybe that would work as well.
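For anyone wondering why APT rejects the dirty build: Debian splits a version at the last hyphen into an upstream part and a revision, and compares the upstream part first. Here that means `2.43.0` is compared against `2.43.0.50101`, and the shorter one sorts lower. Below is a minimal Python sketch of that comparison, written from the ordering rules in Debian Policy rather than taken from dpkg itself. The function names are made up for illustration, and the third version string is only read off the .deb filename posted later in this thread, so treat it as an assumption.

```python
def _order(c: str) -> int:
    """Sort weight of one character in the non-digit part of a version."""
    if c == "" or c.isdigit():
        return 0              # end of string (or start of a digit run)
    if c.isalpha():
        return ord(c)         # letters sort before other punctuation
    if c == "~":
        return -1             # tilde sorts before everything, even the end
    return ord(c) + 256       # '.', '+', '-' and friends sort after letters


def _compare_fragment(a: str, b: str) -> int:
    """Compare two upstream or revision strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Step 1: compare the non-digit prefixes character by character.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Step 2: compare the following digit runs numerically (empty run = 0).
        si = i
        while i < len(a) and a[i].isdigit():
            i += 1
        sj = j
        while j < len(b) and b[j].isdigit():
            j += 1
        na, nb = int(a[si:i] or "0"), int(b[sj:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def split_version(version: str):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a: str, b: str) -> int:
    """Return -1 if a sorts before b, 0 if equal, 1 if a sorts after b."""
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_fragment(ua, ub) or _compare_fragment(ra, rb)


if __name__ == "__main__":
    required = "2.43.0.50101"
    for candidate in ("2.43.0-490c4140~dirty",
                      "2.43.0.99999",
                      "2.43.0.50103-66-f0273f26.dirty"):
        print(candidate, compare_versions(candidate, required))
```

Under these rules the installed dirty build compares as -1 against the required `2.43.0.50101`, while `2.43.0.99999` and the version taken from the later filename both compare as 1, so either would satisfy `rocblas-dev`. On a real system, `dpkg --compare-versions` gives the authoritative answer.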

durdin85 commented 2 years ago

I've just tried my TensorFlow code and also a benchmark with the stock AMD rocblas 5.1.1, and it works, so I believe there's no need to install the patched rocblas on 5.1.1 any more. Looking at the kernels, the AMD deb does include gfx803 files. If you run Python in a venv and add the library to the path with export LD_LIBRARY_PATH=/opt/rocm/lib, there might be no need to tamper with system files any more.
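If you want to repeat the kernel check on your own machine, here is a rough Python sketch that lists files whose name mentions gfx803 under a directory. It assumes the stock package installs its libraries somewhere below /opt/rocm/lib (the path from the export above); the exact subdirectory and file naming are not confirmed here, and `find_arch_files` is just a helper name made up for this example.

```python
from pathlib import Path


def find_arch_files(root, arch="gfx803"):
    """Return sorted paths of files under root whose name contains arch."""
    root = Path(root)
    if not root.is_dir():
        # Nothing installed at this location (or the path is wrong).
        return []
    return sorted(p for p in root.rglob("*") if p.is_file() and arch in p.name)


if __name__ == "__main__":
    # /opt/rocm/lib is an assumption taken from the LD_LIBRARY_PATH above.
    matches = find_arch_files("/opt/rocm/lib")
    if matches:
        for path in matches:
            print(path)
    else:
        print("no gfx803 files found")
```

An empty result only means no file names matched. It does not prove the kernels are missing, since they could be packaged under a different name or location.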

xuhuisheng commented 2 years ago

@durdin85 It seems the R9 Fury and R9 Nano did not have issues. I have not tested them, so I cannot confirm it.

I think 2.43.999 is a good idea, I will give it a try.

xuhuisheng commented 2 years ago

@durdin85 I added the patch version to the deb package version, just like the official package. Please give it a try: https://github.com/xuhuisheng/rocm-gfx803/releases/download/rocm513/rocblas_2.43.0.50103-66-f0273f26.dirty_amd64.deb

Also, my RX580 is Polaris. Before patching, there were still failures: https://github.com/ROCmSoftwarePlatform/rocBLAS/issues/1218