patricklauer closed this issue 6 years ago
Tried with both amdgpu and radeon as kernel drivers; both fail in the same way. Both ROCR-Runtime and ROCT-Thunk-Interface are version 1.4.
Kaveri is not officially supported by the ROCm platform. ROCm's primary focus is server-based computing, and we recommend AMD Ryzen CPUs, Haswell or newer Intel Core i3, i5, and i7 CPUs, and Intel Xeon E3 and E5 CPUs. For GPUs, we recommend our GFX8 parts, based on Fiji and Polaris.
Note that Kaveri was only used by the AMD HSA development team as a development vehicle to bring up part of the base stack before HSA 1.0 enabled devices were available. Kaveri has a number of architectural limitations. A big one is how the GPU and CPU are interconnected if you try to use the coherent interconnect.
On Linux kernel 4.9 support: ROCm is just now moving to Linux kernel 4.9; support should be part of the next release.
Please do not mix the old Radeon driver and the ROCm driver; they are not compatible. We need the new base Linux stack from the AMDGPU driver.
Thanks
I'm running into the same issue on an Intel Core i7-6700K / Radeon R9 Nano (Fiji), with Ubuntu 16.04. I got a working ROCm stack using the AMD ROCm apt repositories, but want to build from source.
Initializing the hsa runtime succeeded.
Checking finalizer 1.0 extension support succeeded.
Generating function table for finalizer succeeded.
Getting a gpu agent succeeded.
Querying the agent name succeeded.
The agent name is gfx803.
Querying the agent maximum queue size succeeded.
The maximum queue size is 131072.
Creating the queue succeeded.
"Obtaining machine model" succeeded.
"Getting agent profile" succeeded.
Create the program failed with status 0x100b.
0x100b is HSA_STATUS_ERROR_NOT_INITIALIZED
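For reference, the lookup from a numeric status to its name can be sketched as below. This is a minimal illustration, not part of the HSA runtime: the table is a deliberately partial transcription of the `hsa_status_t` enum in `hsa.h`, and the helper name `status_name` is hypothetical. Check the values against the header shipped with your ROCR-Runtime build.

```python
# Partial, hand-transcribed subset of the hsa_status_t enum from hsa.h.
# Assumption: values match the installed header; verify before relying on them.
HSA_STATUS_NAMES = {
    0x0: "HSA_STATUS_SUCCESS",
    0x1000: "HSA_STATUS_ERROR",
    0x1001: "HSA_STATUS_ERROR_INVALID_ARGUMENT",
    0x100B: "HSA_STATUS_ERROR_NOT_INITIALIZED",
}


def status_name(code: int) -> str:
    """Return the symbolic name for a status code, or a placeholder if unknown."""
    return HSA_STATUS_NAMES.get(code, f"unknown status 0x{code:x}")


if __name__ == "__main__":
    # The value reported by vector_copy in the log above.
    print(status_name(0x100B))
```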
Any suggestions?
Further info:
#512 uname -a
Linux nano 4.9.0-kfd+ #1 SMP Tue Jun 20 10:33:36 CDT 2017 x86_64 x86_64 x86_64 GNU/Linux
#513 lsmod | grep amd
amdkfd 225280 1
amd_iommu_v2 20480 1 amdkfd
amdgpu 2437120 48
i2c_algo_bit 16384 1 amdgpu
ttm 102400 1 amdgpu
drm_kms_helper 155648 1 amdgpu
drm 360448 6 amdgpu,ttm,drm_kms_helper
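A listing like the one above can be checked mechanically. The sketch below is only a heuristic: it assumes that amdgpu and amdkfd should both be loaded (as in this listing) and that radeon should not be loaded next to them (per the earlier advice not to mix the drivers). The function names are hypothetical, and the input is the text printed by `lsmod`.

```python
def loaded_modules(lsmod_text: str) -> set:
    """Collect module names from lsmod output (first column of each line)."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header, if present.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def check_stack(mods: set) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for required in ("amdgpu", "amdkfd"):  # assumption: both are needed
        if required not in mods:
            problems.append(f"{required} is not loaded")
    if "radeon" in mods:  # assumption: the old driver conflicts with the stack
        problems.append("radeon is loaded alongside the ROCm stack")
    return problems


SAMPLE = """\
amdkfd 225280 1
amd_iommu_v2 20480 1 amdkfd
amdgpu 2437120 48
i2c_algo_bit 16384 1 amdgpu
ttm 102400 1 amdgpu
drm_kms_helper 155648 1 amdgpu
drm 360448 6 amdgpu,ttm,drm_kms_helper
"""

if __name__ == "__main__":
    print(check_stack(loaded_modules(SAMPLE)))
```

On a live system, the same functions could be fed the output of `lsmod` captured with `subprocess.run(["lsmod"], capture_output=True, text=True).stdout`.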
Found the issue: there's a call to core::ExtensionEntryPoints::LoadFinalizer with the argument library_name set to libhsa-ext-finalize64.so.1. There's no such library on my machine. On my other system (ROCm installed from AMD *.deb repositories), the lib belongs to hsa-ext-rocr-dev. What is the corresponding source package? Apparently, I would have to build/install that prior to running the sample vector_copy.
Turns out the finalizer is a closed source component. Installing it with "sudo apt install hsa-ext-rocr-dev" made vector_copy succeed.
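A quick way to check whether the finalizer library can be found before running vector_copy is to ask the dynamic loader directly. The sketch below uses Python's standard `ctypes` module; the helper name `can_load` is hypothetical. The result depends on the loader search path (for example `LD_LIBRARY_PATH`) of the process running the check, so it may differ from what the HSA runtime itself sees.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open library_name."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    name = "libhsa-ext-finalize64.so.1"  # library name from the comment above
    print(name, "is loadable" if can_load(name) else "is NOT loadable")
```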
I had the same issue as @rwvo and solved it with the same solution. Thank you for your analysis!
Additional comment: we need to add a ROCm apt repository to install hsa-ext-rocr-dev. It can be done by following these instructions.
@gstoner is there a plan to release the source for libhsa-ext-finalize64.so?
ref: https://github.com/RadeonOpenCompute/ROCR-Runtime/issues/33
@Dekken Regarding libhsa-ext-finalize64.so: we replaced this compiler with the new native open-source LLVM compiler. It was a proprietary compiler that we could not release as source.
Trying to make HSA/ROC work on an A10-7700K. Building and installing ROCK and ROCT works. With a stock 4.10 kernel, initializing the HSA runtime fails. Using the patched ROCK kernel, things fail a bit later:
strace says: