Closed lukaszlew closed 4 years ago
The OpenCL build has a similar error:
KataGo v1.3.2
terminate called after throwing an instance of 'StringError'
  what():  OpenCL error at /home/lew/devel/KataGo/cpp/neuralnet/openclhelpers.cpp, func err, line 188, error CL_PLATFORM_NOT_FOUND_KHR
Aborted
Seems related, but I'm not sure.
Do you have a GPU, and are its drivers up to date? Your errors suggest that both versions of KataGo - CUDA and OpenCL - are failing to find a GPU on your system.
The error messages themselves point to this: failing at "cudaSetDevice(gpuIdxForThisThread)" suggests CUDA is failing when it tries to select which GPU index to use, and "CL_PLATFORM_NOT_FOUND_KHR" suggests OpenCL cannot find any platform on your computer that provides OpenCL or supports accelerated computation.
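As a rough illustration of that reading of the logs, here is a minimal Python sketch that maps the two markers quoted above to a likely cause. The `HINTS` table and the `diagnose` function are hypothetical names made up for this example; they are not part of KataGo, and the hint text is only the guess described in this thread.

```python
from typing import List

# Hypothetical lookup table: substrings seen in the logs in this thread,
# mapped to the likely cause suggested above. Not part of KataGo.
HINTS = {
    "cudaSetDevice": "CUDA failed while selecting a GPU; check the GPU driver and CUDA setup.",
    "CL_PLATFORM_NOT_FOUND_KHR": "OpenCL found no platform; check that an OpenCL driver is installed.",
}


def diagnose(log_text: str) -> List[str]:
    """Return the hints whose marker substring appears in the log text."""
    return [hint for marker, hint in HINTS.items() if marker in log_text]


if __name__ == "__main__":
    sample = "OpenCL error at openclhelpers.cpp, func err, line 188, error CL_PLATFORM_NOT_FOUND_KHR"
    for hint in diagnose(sample):
        print(hint)
```

This only pattern-matches text, so it cannot confirm the cause; it just makes the guess explicit.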
Any updates on this? Is it resolved, or have you given up at this point? As mentioned above, unless you have other evidence, it seems the problem might just be that the GPU drivers are old or incorrect so that the GPU cannot be detected.
Going ahead and closing. If you have more info and are sure that you do have a working GPU and/or that you have CUDA or OpenCL working but it still doesn't run, feel free to reply back or open a new issue.
Same error here. I have a working NVIDIA GPU with the proprietary driver:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 960     Off  | 00000000:01:00.0  On |                  N/A |
|  0%   42C    P0    29W / 120W |   1553MiB /  4040MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       857      G   /usr/lib/Xorg                     532MiB |
|    0   N/A  N/A      1611      G   /usr/bin/kwin_x11                  23MiB |
|    0   N/A  N/A      1794      G   /usr/bin/plasmashell              155MiB |
|    0   N/A  N/A     17214      G   ...gl=desktop --shared-files      727MiB |
|    0   N/A  N/A     21551      G   /usr/bin/krunner                   10MiB |
|    0   N/A  N/A     37360      G   /usr/bin/python3                   91MiB |
+-----------------------------------------------------------------------------+
After some testing, I found that it is an NVIDIA driver issue: 450.xx with CUDA 11 is incompatible. Downgrading to 440.xx with CUDA 10.2 solves the problem, and it solves the OpenCL issue too. @lukaszlew
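For anyone comparing their own setup against the versions mentioned here, a small Python sketch like the following can pull the driver and CUDA versions out of the nvidia-smi header. The function name `parse_nvidia_smi_header` is made up for this example, and it assumes the header uses the "Driver Version:" and "CUDA Version:" labels shown in the output above. Note that the CUDA version nvidia-smi reports is the highest version the driver supports, which is not necessarily the toolkit KataGo was built against.

```python
import re
from typing import Optional, Tuple


def parse_nvidia_smi_header(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Extract (driver_version, cuda_version) from nvidia-smi text.

    Returns None for a field whose label is not present in the text.
    """
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return (
        driver.group(1) if driver else None,
        cuda.group(1) if cuda else None,
    )


# Header line copied from the nvidia-smi output posted in this thread.
header = "| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0 |"
print(parse_nvidia_smi_header(header))  # ('450.80.02', '11.0')
```

To check a live system, the full nvidia-smi output could be captured and passed in as text instead of the sample line.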
Running "sudo apt-get install mesa-opencl-icd" removed the CL_PLATFORM_NOT_FOUND_KHR error. I had this issue on Ubuntu 20.04 with a Mesa DRI Intel HD Graphics 4000 card.
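A quick way to see whether any OpenCL driver is registered at all: on Linux the OpenCL ICD loader conventionally looks for "*.icd" files under /etc/OpenCL/vendors, each naming a driver library, and packages such as mesa-opencl-icd add one. The sketch below just lists those files. The function name `list_opencl_icds` is made up for this example, and the default path is an assumption about a typical Linux install that may differ on your system.

```python
from pathlib import Path
from typing import Dict


def list_opencl_icds(vendors_dir: str = "/etc/OpenCL/vendors") -> Dict[str, str]:
    """Map each *.icd file name in vendors_dir to the library it names.

    Returns an empty dict if the directory does not exist, which would be
    consistent with the loader finding no platform.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    return {
        icd.name: icd.read_text().strip()
        for icd in sorted(directory.glob("*.icd"))
    }
```

Calling `list_opencl_icds()` and getting an empty result would fit the CL_PLATFORM_NOT_FOUND_KHR error; a non-empty result only shows that a driver is registered, not that it exposes a usable device.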
However, it is still failing with: "no OpenCL devices were found.. bugy drivers". The system is fully up to date, and from what I found in my search, GPU driver updates have been automatic since 17.04.
I think this might be an issue with CUDA drivers. Do you know how to debug it?
~/devel/KataGo/cpp$ /home/lew/devel/KataGo/cpp/katago gtp -model /home/lew/devel/KataGo/cpp/models/b20c256-s447913472-d241840887/model.txt.gz -config /home/lew/devel/KataGo/cpp/configs/lew.cfg
KataGo v1.3.2
Loaded model /home/lew/devel/KataGo/cpp/models/b20c256-s447913472-d241840887/model.txt.gz
GTP ready, beginning main protocol loop
terminate called after throwing an instance of 'StringError'
  what():  CUDA Error, for createComputeHandle file /home/lew/devel/KataGo/cpp/neuralnet/cudabackend.cpp, func cudaSetDevice(gpuIdxForThisThread), line 2706, error unknown error
Aborted