t0b3 opened this issue 1 year ago (status: Open)
rocm version: 5.3.2
mesa: 22.2.3-1
rocm-clinfo
mesa: CommandLine Error: Option 'h' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
fish: Job 1, 'rocm-clinfo' terminated by signal SIGABRT (Abort)
rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 9 7950X 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 7950X 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 4500
BDFID: 0
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 131023320(0x7cf41d8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 131023320(0x7cf41d8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 131023320(0x7cf41d8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1030
Uuid: GPU-8f3c72db82948540
Marketing Name: AMD Radeon RX 6900 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 4096(0x1000) KB
L3: 131072(0x20000) KB
Chip ID: 29615(0x73af)
ASIC Revision: 1(0x1)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2720
BDFID: 768
Internal Node ID: 1
Compute Unit: 80
SIMDs per CU: 2
Shader Engines: 8
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16760832(0xffc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1030
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*******
Agent 3
*******
Name: gfx1036
Uuid: GPU-XX
Marketing Name: AMD Radeon Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 2
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 256(0x100) KB
Chip ID: 5710(0x164e)
ASIC Revision: 1(0x1)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2200
BDFID: 5120
Internal Node ID: 2
Compute Unit: 2
SIMDs per CU: 2
Shader Engines: 1
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 524288(0x80000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1036
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
It fails here: https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime/blob/b0abe308d0a4c9dbf33ef3a925b4128d93d4a799/tools/clinfo/clinfo.cpp#L75
full stacktrace:
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff7aafee3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ffff7a5faa6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff7a497fc in __GI_abort () at abort.c:79
#4 0x00007fffed76d589 in llvm::report_fatal_error (Reason=..., GenCrashDiag=true) at /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/ErrorHandling.cpp:123
#5 0x00007fffed76d3ca in llvm::report_fatal_error (Reason=<optimized out>, GenCrashDiag=<optimized out>) at /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/ErrorHandling.cpp:83
#6 0x00007fffed750d83 in (anonymous namespace)::CommandLineParser::addOption (this=0x555555586600, O=<optimized out>, SC=0x555555586770) at /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/CommandLine.cpp:243
#7 0x00007fffed73cbe7 in (anonymous namespace)::CommandLineParser::addOption (this=0x205dea, O=0x205dea, ProcessDefaultOption=<optimized out>) at /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/CommandLine.cpp:263
#8 0x00007fffed73add3 in llvm::cl::Option::addArgument (this=0x7fffe0ce3980 <SectionHeadersShorter>) at /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/CommandLine.cpp:447
#9 0x00007fffe04120c9 in llvm::cl::alias::alias<char [8], llvm::cl::desc, llvm::cl::aliasopt> (this=<optimized out>, this=<optimized out>) at /usr/include/llvm/Support/CommandLine.h:1893
#10 0x00007fffe0412c48 in __static_initialization_and_destruction_0 (__priority=65535, __initialize_p=1) at /usr/src/debug/rocm-compilersupport-5.3.0-1.fc37.x86_64/lib/comgr/src/comgr-objdump.cpp:180
#11 0x00007fffe0412332 in _sub_I_65535_0.0 () from /lib64/libamd_comgr.so.2
#12 0x00007ffff7fcccde in call_init (env=0x7fffffffdf18, argv=0x7fffffffdf08, argc=1, l=<optimized out>) at dl-init.c:70
#13 call_init (l=<optimized out>, argc=1, argv=0x7fffffffdf08, env=0x7fffffffdf18) at dl-init.c:26
#14 0x00007ffff7fccdcc in _dl_init (main_map=0x555555835980, argc=1, argv=0x7fffffffdf08, env=0x7fffffffdf18) at dl-init.c:117
#15 0x00007ffff7b71ec4 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-8.fc37.x86_64/elf/dl-error-skeleton.c:182
#16 0x00007ffff7fd3736 in dl_open_worker (a=a@entry=0x7fffffffd060) at dl-open.c:808
#17 0x00007ffff7b71e6e in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-8.fc37.x86_64/elf/dl-error-skeleton.c:208
#18 0x00007ffff7fd3acc in _dl_open (file=0x7fffec1bbc41 "libamd_comgr.so.2", mode=<optimized out>, caller_dlopen=0x7fffec1252a2 <amd::Os::loadLibrary(char const*)+178>, nsid=<optimized out>, argc=1, argv=0x7fffffffdf08, env=0x7fffffffdf18) at dl-open.c:884
#19 0x00007ffff7aaa25c in dlopen_doit (a=a@entry=0x7fffffffd2d0) at dlopen.c:56
#20 0x00007ffff7b71e6e in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd230, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-8.fc37.x86_64/elf/dl-error-skeleton.c:208
#21 0x00007ffff7b71f23 in __GI__dl_catch_error (objname=0x7fffffffd288, errstring=0x7fffffffd290, mallocedp=0x7fffffffd287, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-8.fc37.x86_64/elf/dl-error-skeleton.c:227
#22 0x00007ffff7aa9d2f in _dlerror_run (operate=operate@entry=0x7ffff7aaa200 <dlopen_doit>, args=args@entry=0x7fffffffd2d0) at dlerror.c:138
#23 0x00007ffff7aaa311 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at dlopen.c:71
#24 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81
#25 0x00007fffec1252a2 in amd::Os::loadLibrary_ (filename=0x7fffec1bbc41 "libamd_comgr.so.2") at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/os/os_posix.cpp:177
#26 amd::Os::loadLibrary_ (filename=0x7fffec1bbc41 "libamd_comgr.so.2") at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/os/os_posix.cpp:177
#27 amd::Os::loadLibrary (libraryname=0x7fffec1bbc41 "libamd_comgr.so.2") at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/os/os.cpp:76
#28 0x00007fffec1679fc in amd::Comgr::LoadLib () at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/comgrctx.cpp:37
#29 0x00007ffff7ab30a7 in __pthread_once_slow (once_control=0x7fffec1f73f4 <amd::Comgr::initialized>, init_routine=0x7ffff7cda820 <std::__once_proxy()>) at pthread_once.c:116
#30 0x00007fffec118f59 in __gthread_once (__func=<optimized out>, __once=0x7fffec1f73f4 <amd::Comgr::initialized>) at /usr/include/c++/12/x86_64-redhat-linux/bits/gthr-default.h:700
#31 std::call_once<bool (&)()> (__once=..., __f=<optimized out>) at /usr/include/c++/12/mutex:859
#32 amd::Device::ValidateComgr (this=0x5555558351f0) at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/device.cpp:533
#33 amd::Device::ValidateComgr (this=0x5555558351f0) at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/device.cpp:529
#34 0x00007fffec1aa89f in roc::Device::create (this=0x5555558351f0) at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/rocm/rocdevice.cpp:639
#35 roc::Device::init () at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/rocm/rocdevice.cpp:489
#36 amd::Device::init () at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/device.cpp:454
#37 amd::Runtime::init() [clone .isra.0] () at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/platform/runtime.cpp:75
#38 0x00007fffec103035 in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<clIcdGetPlatformIDsKHR::{lambda()#1}>(std::once_flag&, clIcdGetPlatformIDsKHR::{lambda()#1}&&)::{lambda()#1}>(clIcdGetPlatformIDsKHR::{lambda()#1}&)::{lambda()#1}::_FUN() () at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/amdocl/cl_icd.cpp:224
#39 0x00007ffff7ab30a7 in __pthread_once_slow (once_control=0x7fffec1f4b34 <clIcdGetPlatformIDsKHR::initOnce>, init_routine=0x7ffff7cda820 <std::__once_proxy()>) at pthread_once.c:116
#40 0x00007fffec102eaf in __gthread_once (__func=<optimized out>, __once=0x7fffec1f4b34 <clIcdGetPlatformIDsKHR::initOnce>) at /usr/include/c++/12/x86_64-redhat-linux/bits/gthr-default.h:700
#41 std::call_once<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> > (__once=..., __f=...) at /usr/include/c++/12/mutex:859
#42 clIcdGetPlatformIDsKHR (num_entries=<optimized out>, platforms=0x0, num_platforms=0x7fffffffd828) at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/amdocl/cl_icd.cpp:274
#43 0x00007ffff7f85df9 in _find_and_check_platforms (num_icds=<optimized out>) at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:469
#44 __initClIcd () at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:773
#45 _initClIcd_real () at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:824
#46 0x00007ffff7f87e14 in _initClIcd () at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:853
#47 clGetPlatformIDs (num_entries=0, platforms=0x0, num_platforms=0x7fffffffd954) at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:1018
#48 0x000055555555e547 in cl::Platform::get (platforms=0x7fffffffdad0) at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/tools/clinfo/../../khronos/headers/opencl2.2/CL/../CL/cl2.hpp:2486
#49 0x0000555555556f58 in main (argc=argc@entry=1, argv=argv@entry=0x7fffffffdf08) at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/tools/clinfo/clinfo.cpp:75
#50 0x00007ffff7a4a510 in __libc_start_call_main (main=main@entry=0x555555556e00 <main(int, char**)>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdf08) at ../sysdeps/nptl/libc_start_call_main.h:58
#51 0x00007ffff7a4a5c9 in __libc_start_main_impl (main=0x555555556e00 <main(int, char**)>, argc=1, argv=0x7fffffffdf08, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdef8) at ../csu/libc-start.c:381
#52 0x000055555555ca65 in _start ()
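The backtrace is long, so it can help to reduce it to the debuginfo packages it passes through. The following is only a sketch: the function name `packages_in_backtrace` and the `SAMPLE` text are made up for illustration, the sample frames are abbreviated copies of frames #6, #10 and #28 with their arguments replaced by `(...)`, and the regular expression assumes the Fedora-style `/usr/src/debug/<package>/` source prefix seen in the frames above.

```python
import re

# Matches the debuginfo source prefix seen in the frames above, e.g.
# /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/CommandLine.cpp
# (assumption: other distributions may lay out debug sources differently).
DEBUG_SRC = re.compile(r"/usr/src/debug/([^/\s]+)/")


def packages_in_backtrace(backtrace: str) -> list[str]:
    """Return debuginfo package names in order of first appearance."""
    seen: list[str] = []
    for line in backtrace.splitlines():
        if not line.lstrip().startswith("#"):
            continue  # skip anything that is not a gdb frame line
        for pkg in DEBUG_SRC.findall(line):
            if pkg not in seen:
                seen.append(pkg)
    return seen


# Abbreviated copies of frames #6, #10 and #28; arguments are elided.
SAMPLE = """\
#6 0x00007fffed750d83 in (anonymous namespace)::CommandLineParser::addOption (...) at /usr/src/debug/llvm-15.0.4-1.fc37.x86_64/lib/Support/CommandLine.cpp:243
#10 0x00007fffe0412c48 in __static_initialization_and_destruction_0 (...) at /usr/src/debug/rocm-compilersupport-5.3.0-1.fc37.x86_64/lib/comgr/src/comgr-objdump.cpp:180
#28 0x00007fffec1679fc in amd::Comgr::LoadLib () at /usr/src/debug/rocm-opencl-5.3.2-1.fc37.x86_64/ROCclr-rocm-5.3.2/device/comgrctx.cpp:37
"""

if __name__ == "__main__":
    for pkg in packages_in_backtrace(SAMPLE):
        print(pkg)
```

Read from the innermost frame outwards, the sample shows the abort being raised in the system LLVM, reached from a static initializer in comgr, which in turn is loaded by the ROCm OpenCL runtime.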
observed behaviour
clIcdGetPlatformIDsKHR() crashes if multiple LLVM-based implementations are present (using a radeonsi GPU)
output
backtrace
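
To illustrate the kind of failure the error message describes, here is a toy model of an option table that rejects a second registration of the same name. It is not LLVM's API: `OptionRegistry`, `DuplicateOptionError` and `simulate` are hypothetical names, and the sketch assumes that both LLVM-based libraries end up registering into one shared table. The real code aborts the process through `llvm::report_fatal_error` (frames #4 to #6), while this sketch raises an exception so it can be run safely.

```python
class DuplicateOptionError(RuntimeError):
    """Raised by the toy registry; the real LLVM code aborts instead."""


class OptionRegistry:
    """Toy stand-in for a process-wide option table (not LLVM's real API)."""

    def __init__(self):
        self._owners = {}  # option name -> who registered it first

    def add_option(self, name, owner):
        if name in self._owners:
            raise DuplicateOptionError(
                f"Option '{name}' registered more than once! "
                f"(first by {self._owners[name]}, again by {owner})"
            )
        self._owners[name] = owner


def simulate():
    """Register 'h' twice, as two libraries sharing one table might."""
    registry = OptionRegistry()
    registry.add_option("h", "first LLVM-based library")
    try:
        registry.add_option("h", "second LLVM-based library")
    except DuplicateOptionError as exc:
        return str(exc)
    return "no conflict"


if __name__ == "__main__":
    print(simulate())
```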