xuhuisheng / rocm-build

build scripts for ROCm
Apache License 2.0

question about rock-dkms script #27

Closed zjin-lcf closed 2 years ago

zjin-lcf commented 2 years ago

Running the script 62.rock-dkms.sh generated a .deb file. Does the user then need to run dpkg -i rock-dkms_5.1-36_all.deb?

That command fails with the following error. Thanks for any guidance.

dpkg: regarding rock-dkms_5.1-36_all.deb containing rock-dkms, pre-dependency problem:
 rock-dkms pre-depends on rock-dkms-firmware (= 1:5.1-36)
  rock-dkms-firmware is not installed.

dpkg: error processing archive rock-dkms_5.1-36_all.deb (--install):
 pre-dependency problem - not installing rock-dkms
Errors were encountered while processing:
 rock-dkms_5.1-36_all.deb

xuhuisheng commented 2 years ago

I haven't tested rock-dkms for a while. After ROCm 4.3, ROCm switched to amdgpu instead of rock-dkms, so I can't confirm whether rock-dkms still works as before. The firmware is simply downloaded from https://repo.radeon.com/rocm/apt/4.3/pool/main/r/rock-dkms/. In fact, there is no rock-dkms in ROCm 5.1, so maybe I should just remove 62.rock-dkms.sh.

The corresponding amdgpu version is here: https://repo.radeon.com/amdgpu/22.10.1/ubuntu/pool/main/a/amdgpu-dkms/
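
For context on the pre-dependency error earlier in the thread: dpkg will not unpack a package until everything named in its Pre-Depends field is already installed, so rock-dkms-firmware (= 1:5.1-36) would have to be installed before rock-dkms. A minimal sketch for listing what a .deb pre-depends on follows. The helper name predep_names is hypothetical, and the sketch assumes dpkg-deb is available and the .deb sits in the current directory.

```shell
# Hypothetical helper: print the package names from a Pre-Depends field value,
# one per line, dropping version constraints such as "(= 1:5.1-36)".
# For alternatives written as "a | b", only the first name is kept.
predep_names() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:](].*$//'
}

# File name taken from the thread; adjust to whatever the build produced.
DEB="rock-dkms_5.1-36_all.deb"

# Only query the archive if dpkg-deb exists and the file is present.
if command -v dpkg-deb >/dev/null 2>&1 && [ -f "$DEB" ]; then
    predep_names "$(dpkg-deb -f "$DEB" Pre-Depends)"
fi
```

Each name printed this way would need to be installed, at the exact version the field demands, before dpkg -i accepts the main package.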

zjin-lcf commented 2 years ago

I tried to build the ROCm components (version 5.1.x) from source with your scripts on Ubuntu 22.04. The target device is gfx1012. Running a binary built with the HIP compiler prints the message below. Could you advise? Thanks.

"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)

The kernel version is 5.15 and the already installed drivers are:

rc | amdgpu-install | 21.40.2.40502-1350682 | all | AMDGPU driver repository and installer
ii | libdrm-amdgpu1:amd64 | 2.4.110-1ubuntu1 | amd64 | Userspace interface to amdgpu-specific kernel DRM services -- runtime
ii | xserver-xorg-video-amdgpu | 22.0.0-1build1 | amd64 | X.Org X server -- AMDGPU display driver

xuhuisheng commented 2 years ago

Actually, the current version of the rocm-build scripts cannot build ROCm properly on Ubuntu 22.04. See this post: https://github.com/RadeonOpenCompute/ROCm/issues/1713#issuecomment-1107796180

The Linux kernel already supports AMD GPUs (otherwise we could not use AMD GPUs on Linux at all), so don't worry about that. If you hit hipErrorNoBinaryForGpu, it usually means ROCm was not compiled with the correct AMDGPU_TARGETS.

However, it is not easy to find which ROCm component has the AMDGPU_TARGETS problem. My suggestion is to run my check scripts to find out which component needs to be rebuilt. For reference: https://github.com/xuhuisheng/rocm-build/tree/master/check
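
As a first sanity check before rebuilding anything, it can help to confirm that the device architecture (gfx1012 in this thread) actually appears in the AMDGPU_TARGETS value used for the build. The sketch below is only an illustration: the helper name has_amdgpu_target is hypothetical, the default target list is a made-up example, and it assumes AMDGPU_TARGETS is a semicolon-separated list as in CMake.

```shell
# Hypothetical helper: succeed if target $2 appears in the
# semicolon-separated list $1. Wrapping both sides in ";" makes the
# match exact, so "gfx1012" does not match "gfx10120".
has_amdgpu_target() {
    case ";$1;" in
        *";$2;"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed values: the build-time target list (example default only)
# and the architecture of the installed GPU.
AMDGPU_TARGETS="${AMDGPU_TARGETS:-gfx900;gfx906}"
GPU_ARCH="gfx1012"

if has_amdgpu_target "$AMDGPU_TARGETS" "$GPU_ARCH"; then
    echo "$GPU_ARCH is covered by AMDGPU_TARGETS"
else
    echo "$GPU_ARCH is missing from AMDGPU_TARGETS=$AMDGPU_TARGETS"
fi
```

If the architecture is missing, every component that embeds GPU code objects would need to be rebuilt with the corrected list; the check scripts linked above are the way to find which ones.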

zjin-lcf commented 2 years ago

Thanks again.