Can you check that the drivers you have are the latest from NVIDIA and not the ones delivered by Windows Update?
Absolutely, I install only original NVIDIA drivers from the NVIDIA website, now through GeForce Experience.
Is it possible that nobody has a clue?? I'm getting this also with Lc0 v0.22. I have erased all NVIDIA drivers, rebooted the system and reinstalled them, but nothing helped... My graphics card is an NVIDIA GeForce 780 Ti OC.
Old?? It's the latest Lc0 release!! And I can't switch to Windows 10. I want to point out that Lc0 CUDA ran smoothly until a crash that happened when I tried to stream the game with OBS, which was using the graphics card for encoding.
What is the actual newest Lc0 then?
The latest version is v0.22.0: https://github.com/LeelaChessZero/lc0/releases/tag/v0.22.0
That is just the one I was writing about.
Log file:
============= Log started. =============
0811 07:35:00.840475 4492 c:\projects\lc0\src\main.cc:37] Lc0 started.
0811 07:35:00.840544 4492 c:\projects\lc0\src\main.cc:38]
0811 07:35:00.840741 4492 c:\projects\lc0\src\main.cc:39] | | |
0811 07:35:00.840937 4492 c:\projects\lc0\src\main.cc:40] | | |_| v0.22.0 built Aug 5 2019
0811 07:35:00.844105 4492 c:\projects\lc0\src\utils\commandline.cc:45] Command line: lc0.exe --logfile=log.txt
0811 07:35:04.902445 4492 c:\projects\lc0\src\chess\uciloop.cc:131] >> go nodes 100
0811 07:35:04.903053 4492 c:\projects\lc0\src\neural\loader.cc:206] Found pb network file: ./1a167a875c3d9e242f663f30ba877b5b046dcb0c193b79fd43ddacbf8b5b17ed.gz
0811 07:35:05.532228 4492 c:\projects\lc0\src\neural\factory.cc:84] Creating backend [cudnn]...
0811 07:35:05.703535 4492 c:\projects\lc0\src\utils\exception.h:39] Exception: CUDA error: unknown error (c:\projects\lc0\src\neural\cuda\network_cudnn.cc:203)
0811 07:35:05.726708 4492 c:\projects\lc0\src\chess\uciloop.cc:218] << error CUDA error: unknown error (c:\projects\lc0\src\neural\cuda\network_cudnn.cc:203)
Usually the reason for that is a mismatch between the CUDA .dlls and the CUDA driver (e.g. 10.0.xxx vs 10.1.xxx). Updating the NVIDIA drivers and rebooting should help.
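To make the kind of mismatch described above concrete, here is a minimal sketch of the comparison. It relies on CUDA's integer version encoding (1000 * major + 10 * minor, so 10010 means 10.1), which is what `cudaRuntimeGetVersion` and `cudaDriverGetVersion` return. The function names are hypothetical and not part of Lc0, and the compatibility rule (the driver must support at least the runtime's version) is a simplifying assumption.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    major, rest = divmod(encoded, 1000)
    return major, rest // 10


def driver_supports_runtime(driver_encoded: int, runtime_encoded: int) -> bool:
    """Assumed rule: the driver must support at least the runtime's CUDA version."""
    return decode_cuda_version(driver_encoded) >= decode_cuda_version(runtime_encoded)


if __name__ == "__main__":
    # A 10.1 driver with 10.0 runtime DLLs should be fine under this rule.
    print(driver_supports_runtime(10010, 10000))
    # A 10.0 driver with 10.1 runtime DLLs is the mismatch case.
    print(driver_supports_runtime(10000, 10010))
```

Under this rule, a driver that reports 10.1 should accept DLLs built against 10.0, so a failure in that configuration points to something other than a plain version gap.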
As I've said many times, I have updated the drivers every single time. Yet:
c:\lc0>lc0.exe
| | | | | |_| v0.22.0 built Aug 5 2019
go nodes 100
Found pb network file: ./1.gz
Creating backend [cudnn]...
error CUDA error: unknown error (c:\projects\lc0\src\neural\cuda\network_cudnn.cc:203)
@massimilianogoi can you try with https://ci.appveyor.com/api/buildjobs/cbesqskj5kgvte3p/artifacts/lc0-windows-gpu-nvidia-cuda.zip, it is a build with cuda 9.2 dlls.
@borg323 always the same error:
c:\lc0>lc0.exe
| | | | | |_| v0.23.0-dev+git.6d7c1e3 built Sep 3 2019
go nodes 100
Found pb network file: ./b2ec465d0fb5b5eb39d2e1e3f74041a5d2fc92d413b71aa7ea0b6fb082ccba9c.gz
Creating backend [cudnn]...
error CUDA error: unknown error (..\src/neural/cuda/network_cudnn.cc:203)
@massimilianogoi this one has improved error reporting: https://ci.appveyor.com/api/buildjobs/npaav8gj954w7f8t/artifacts/build%2Flc0.exe. It is only the exe, you will need the dlls from the release zip (not the cuda 9.2 ones from the previous test).
@borg323
| | | | | |_| v0.23.0-dev+git.8210e2c built Sep 9 2019
go nodes 100
Found pb network file: ./6e404a13dab65d9b06822575e9b0a96c2984ba207b31e3fbe5e26c3 163474499
Creating backend [cudnn]...
CUDA Runtime version: 566418.33.6
WARNING: CUDA Runtime version mismatch, was compiled with version 10.0.0
Cudnn version: 7.4.2
Latest version of CUDA supported by the driver: 10.1.0
I just registered for the NVIDIA developer program... only to see that they have built cuDNN for a great many Linux variants, but on Windows only for Windows 7 and Windows 10... -_-
Apparently the problem is that I have Windows 8.1, then...
It's a pity, since Lc0 worked until a few months ago... Anyway, I would make the build with the improved error reporting the official one.
The cuda dlls included with lc0 are the windows 10 ones. Maybe a system update changed something that affects compatibility recently. Assuming the cuda you installed was the windows 8.1 version, can you replace cudart.dll and cublas.dll in the lc0 directory with the ones from the cuda installation and try again?
@massimilianogoi can you try https://ci.appveyor.com/api/buildjobs/cue9tlx2qpxqcmoh/artifacts/build%2Flc0.exe? It has some additional diagnostics to see why you get the strange cuda version.
C:\lc0>lc0.exe
| | | | | |_| v0.23.0-dev+git.e0f705f built Sep 17 2019
go nodes 100
Found pb network file: ./6e404a13dab65d9b06822575e9b0a96c2984ba207b31e3fbe5e26c3 163474499
Creating backend [cudnn]...
CUDA Runtime version: 60470.53.6
WARNING: CUDA Runtime version mismatch, was compiled with version 10.0.0
Cudnn version: 7.4.2
Latest version of CUDA supported by the driver: 10.1.0
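The reported runtime version in the output above is far outside anything the driver claims to support, which suggests the value itself is bogus and not just a newer release. As a rough sketch, the two version lines can be pulled out of the diagnostic text and compared. The helper names and the "runtime newer than the driver supports" heuristic are assumptions made for illustration, not Lc0 behaviour; the line formats are copied from the output in this thread.

```python
import re
from typing import Dict, Tuple

# Line formats as they appear in the diagnostic build's output in this thread.
PATTERNS = {
    "runtime": re.compile(r"CUDA Runtime version: (\d+)\.(\d+)\.(\d+)"),
    "driver": re.compile(
        r"Latest version of CUDA supported by the driver: (\d+)\.(\d+)\.(\d+)"
    ),
}


def extract_versions(output: str) -> Dict[str, Tuple[int, ...]]:
    """Return the dotted versions found in the output as integer tuples."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(output)
        if match:
            found[name] = tuple(int(part) for part in match.groups())
    return found


def runtime_looks_invalid(versions: Dict[str, Tuple[int, ...]]) -> bool:
    """Heuristic (assumption): a runtime newer than the driver supports is suspect."""
    if "runtime" not in versions or "driver" not in versions:
        return False
    return versions["runtime"] > versions["driver"]


SAMPLE = """Creating backend [cudnn]...
CUDA Runtime version: 60470.53.6
WARNING: CUDA Runtime version mismatch, was compiled with version 10.0.0
Cudnn version: 7.4.2
Latest version of CUDA supported by the driver: 10.1.0"""

if __name__ == "__main__":
    versions = extract_versions(SAMPLE)
    print(versions, runtime_looks_invalid(versions))
```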
Trying to substitute the CUDA dll file gives me a type mismatch error.
Can you tell me which CUDA version is installed? Is it 10.1.243 for Windows 8.1? I can try to make a build with the exact same version.
Thanks. Here is the full string:
NVCUDA.DLL 26.21.14.3615 NVIDIA CUDA 10.1.0 driver
@massimilianogoi did you ever resolve the issue? If not, can you please try again with the current versions?
Closing for now as there's no activity on this issue, feel free to reopen if there are any updates.
When I try to launch the program and send the command go nodes 100 it hangs with this error:
error CUDA error: unknown error (c:\projects\lc0\src\neural\cuda\network_cudnn.cc:203)
I have an NVIDIA GeForce 780 Ti OC. I updated to the newest drivers and installed the CUDA toolkit, but nothing helped...
The program ran smoothly (CUDA flavour) until a couple of weeks ago...