Turns out the following analysis is not correct - the latest version seems to work after compiling both the common and the fp16-only CUDA code with `-arch=all-major`.
Made this a draft, as #2015 has the bits to fix the NaN issues (a bit more refined) without the CUDA version update. Will make this into just a CUDA version update PR at a later stage.
The recent rc1 issues seem to go away if we compile the CUDA fp16 code with `-arch=all-major`, so I added it to `meson.build`. It is added unconditionally, since the default is still to use `-arch=native` and the alternative is an attempt to do the equivalent of `-arch=all-major` for CUDA versions that don't support it. This requires at least CUDA 11.5, but as we tested with 11.8, that is the version I used.
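To illustrate the version dependency described above, here is a minimal Python sketch of the selection logic, not the actual `meson.build` change. The function name `fp16_arch_flags` and the `fallback_majors` list are hypothetical; the only facts taken from this PR are that `-arch=all-major` needs at least CUDA 11.5 and that older versions need an explicit approximation.

```python
def fp16_arch_flags(cuda_version, fallback_majors=(5, 6, 7, 8)):
    """Return nvcc architecture flags for a (major, minor) CUDA version.

    fallback_majors is an illustrative assumption: the set of major
    architectures a given older toolkit supports varies by release.
    """
    # -arch=all-major is only understood by CUDA 11.5 and later.
    if cuda_version >= (11, 5):
        return ["-arch=all-major"]
    # Older toolkits: approximate all-major with one -gencode per major arch.
    return [
        f"-gencode=arch=compute_{major}0,code=sm_{major}0"
        for major in fallback_majors
    ]
```

With the 11.8 toolkit used for testing here, `fp16_arch_flags((11, 8))` takes the first branch and returns the single `-arch=all-major` flag.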
Updates CUDA to 11.8 for the AppVeyor CUDA builds (cuDNN unchanged). Given the huge size of the DLLs, I also added an install script based on the DirectML one.
While testing I also found a bug in the DirectML install script: probably due to some recent Windows security change, executables in the same directory are unavailable when the script is run by double-clicking. I removed the `lc0.exe` check, so the script now installs the DLLs directly into its own directory.