LeelaChessZero / lc0

The rewritten engine, originally for tensorflow. Now all other backends have been ported here.
GNU General Public License v3.0

move to cuda 11.8 with install script #2009

Open borg323 opened 6 months ago

borg323 commented 6 months ago

It turns out the following analysis is not correct: the latest version seems to work after compiling both the common and the fp16-only cuda code with -arch=all-major.

Marked as draft, since #2015 has the bits to fix the NaN issues (a bit more refined) without the cuda version update. I will turn this into a plain cuda version update PR at a later stage.

The recent rc1 issues seem to go away if we compile the cuda fp16 code with -arch=all-major, so I added it to meson.build. It is added unconditionally, since the default is still to use -arch=native and the alternative is an attempt to do the equivalent of -arch=all-major for cuda versions that don't support it. This requires at least cuda version 11.5, but as we tested with 11.8 I used that version.

Updates cuda to 11.8 for the appveyor cuda builds (cudnn unchanged). Given the huge size of the dlls, I added an install script based on the directml one.

While testing I also found a bug in the directml install script: probably some recent windows security change makes executables in the same directory unavailable when the script is run by double clicking. I removed the lc0.exe check, so the script now installs the dlls directly in its own directory.
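To illustrate the version dependence described above, here is a minimal Python sketch of choosing nvcc architecture flags by cuda version. It is not the logic in meson.build: the function name `nvcc_arch_flags` and the fallback architecture list are hypothetical, chosen only for illustration. The one thing taken from the description is that -arch=all-major needs at least cuda 11.5.

```python
from typing import List, Tuple

# Hypothetical fallback list of major compute capabilities (an assumption,
# not taken from lc0's meson.build). A real build would also have to drop
# entries that the installed toolkit does not support.
FALLBACK_MAJOR_ARCHS = ("50", "60", "70", "80")


def nvcc_arch_flags(cuda_version: Tuple[int, int]) -> List[str]:
    """Return nvcc architecture flags for a (major, minor) cuda version."""
    if cuda_version >= (11, 5):
        # -arch=all-major is available from cuda 11.5 onwards.
        return ["-arch=all-major"]
    # Older toolkits: approximate all-major with one -gencode per major arch.
    return [
        f"-gencode=arch=compute_{sm},code=sm_{sm}"
        for sm in FALLBACK_MAJOR_ARCHS
    ]
```

For example, `nvcc_arch_flags((11, 8))` yields the single flag -arch=all-major, while `nvcc_arch_flags((11, 4))` yields one -gencode flag per entry in the fallback list.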