I've been trying to get Leela Zero (LZ) working with a Radeon RX 580, but it segfaults during clEnqueueWriteBuffer.
Quick gdb output:
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.1 Mesa 19.1.1
Platform profile: FULL_PROFILE
Platform name: Clover
Platform vendor: Mesa
Device ID: 0
Device name: Radeon RX 580 Series (POLARIS10, DRM 3.30.0, 5.1.15-arch1-1-ARCH, LLVM 8.0.0)
Device type: GPU
Device vendor: AMD
Device driver: 19.1.1
Device speed: 1366 MHz
Device cores: 36 CU
Device score: 1111
Selected platform: Clover
Selected device: Radeon RX 580 Series (POLARIS10, DRM 3.30.0, 5.1.15-arch1-1-ARCH, LLVM 8.0.0)
with OpenCL 1.1 capability.
Half precision compute support: Yes.
Tensor Core support: No.
OpenCL: using fp16/half or tensor core compute support.
Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
Thread 1 "tests" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) backtrace
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff6d616a2 in ?? () from /usr/lib/libMesaOpenCL.so.1
#2 0x00007ffff6d52e3f in ?? () from /usr/lib/libMesaOpenCL.so.1
#3 0x00007ffff6d53a13 in ?? () from /usr/lib/libMesaOpenCL.so.1
#4 0x00007ffff6d54291 in ?? () from /usr/lib/libMesaOpenCL.so.1
#5 0x00007ffff6d511a5 in ?? () from /usr/lib/libMesaOpenCL.so.1
#6 0x00007ffff6d4ecde in ?? () from /usr/lib/libMesaOpenCL.so.1
#7 0x00007ffff7ecaa4e in clEnqueueWriteBuffer () from /usr/lib/libOpenCL.so.1
#8 0x00005555556212e8 in cl::CommandQueue::enqueueWriteBuffer (blocking=0, offset=0, events=0x0, event=0x0,
ptr=<optimized out>, size=147456, buffer=..., this=<synthetic pointer>)
at /home/sandy/devel/leela-zero/src/CL/cl2.hpp:7166
#9 Tuner<half_float::half>::tune_sgemm[abi:cxx11](int, int, int, int, int) (this=0x7fffffffdb00, m=8, n=25, k=8,
batch_size=36, runs=<optimized out>) at /home/sandy/devel/leela-zero/src/Tuner.cpp:491
#10 0x00005555556227ad in Tuner<half_float::half>::load_sgemm_tuners[abi:cxx11](int, int, int, int) (
this=0x7fffffffdb00, m=8, n=25, k=8, batch_size=36) at /usr/include/c++/9.1.0/ext/new_allocator.h:89
#11 0x00005555556358a6 in OpenCL<half_float::half>::initialize (this=0x555555769a00, channels=8, batch_size=1)
at /home/sandy/devel/leela-zero/src/Tuner.cpp:722
#12 0x0000555555635f05 in OpenCLScheduler<half_float::half>::initialize (this=0x555555769920, channels=8)
at /usr/include/c++/9.1.0/bits/unique_ptr.h:357
#13 0x000055555564b76b in Network::init_net (this=0x7ffff6e04010, channels=8, pipe=...)
at /usr/include/c++/9.1.0/bits/unique_ptr.h:357
#14 0x00005555556536d2 in Network::select_precision (this=0x7ffff6e04010, channels=8)
at /usr/include/c++/9.1.0/bits/move.h:74
#15 0x0000555555653ee2 in Network::initialize (this=0x7ffff6e04010, playouts=<optimized out>, weightsfile=...)
at /home/sandy/devel/leela-zero/src/Network.cpp:573
#16 0x0000555555684979 in LeelaEnv::SetUp (this=<optimized out>) at /home/sandy/devel/leela-zero/src/tests/gtests.cpp:87
#17 0x000055555569af9c in testing::internal::SetUpEnvironment(testing::Environment*) ()
--Type <RET> for more, q to quit, c to continue without paging--
::__normal_iterator<testing::Environment* const*, std::vector<testing::Environment*, std::allocator<testing::Environment*> > >, void (*)(testing::Environment*)))(testing::Environment*) ()
#19 0x00005555556b1931 in void testing::internal::ForEach<std::vector<testing::Environment*, std::allocator<testing::Environment*> >, void (*)(testing::Environment*)>(std::vector<testing::Environment*, std::allocator<testing::Environment*> > const&, void (*)(testing::Environment*)) ()
#20 0x000055555569b207 in testing::internal::UnitTestImpl::RunAllTests() ()
#21 0x00005555556b78c4 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#22 0x00005555556b11e5 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#23 0x0000555555699d2a in testing::UnitTest::Run() ()
#24 0x000055555568824b in RUN_ALL_TESTS() ()
#25 0x00005555556881d9 in main ()
From a quick look I can't see anything obviously wrong, but I'm not very familiar with debugging C++.
I haven't tested the GPU setup much, so I wondered whether this could be a problem with my OpenCL environment, but KataGo runs fine. Any idea whether this is an LZ issue or something else?