lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

About CUDA version and TRT version of 1.14.0 #870

Open hope366 opened 10 months ago

hope366 commented 10 months ago

Thank you for your efforts in developing KataGo, and for letting me use it for free.

The CUDA version of 1.14.0 worked with the following three libraries, which I had been using since before 1.13.0.

Until now, when I used the CUDA version, I placed the following three files in the same folder in addition to the three above. If the first three are enough for it to work, are the last three unnecessary?

1.14.0-trt did not work with TensorRT-8.5.2.2, the version I had been using as its dependency. After I installed TensorRT-8.6.1.6, it worked. Compared to 1.13.1-trt, 1.14.0-trt is about 10% faster.

lightvector commented 10 months ago

Thanks for testing!

If it works with an older CUDA you can still use the older CUDA, but I've switched my testing to be on CUDA 12 going forward, so officially that's the only one that will be recommended.

Also, when I upgraded my own machines from CUDA 11.4 to CUDA 12.1, I also got like a 10% improvement on the CUDA backend too, so there may be some benefits to upgrading as well even if an older version works.

hope366 commented 10 months ago

> Also, when I upgraded my own machines from CUDA 11.4 to CUDA 12.1, I also got like a 10% improvement on the CUDA backend too, so there may be some benefits to upgrading as well even if an older version works.

Well, that's good news. I would like to upgrade to CUDA 12.1 and see how the speed changes.

hope366 commented 10 months ago

I have installed CUDA and cuDNN as recommended on the release page.

The following files were placed in the katago-cuda folder.

[Image: "Untitled" (screenshot of the six files in the katago-cuda folder)]

I ran a benchmark to compare it with my previous environment, and the speed improvement was only around 3%. The CUDA and cuDNN libraries I had been using until now were taken from Megapack, and they may not be that old. That may be why I didn't see a significant speed improvement.

Are the six files shown in the image above necessary and sufficient to run the newly released katago-cuda?

hope366 commented 10 months ago

I said that the speedup was 3%, but it changes depending on the number of threads used, ranging from -1% to +11%. In most cases, the increase was between 3% and 5%.
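
As a rough illustration of how per-thread speedup percentages like these can be computed from two benchmark runs, here is a minimal Python sketch. The visits/s figures and the `speedup_percent` helper are hypothetical placeholders, not measurements from this thread; substitute the numbers from your own benchmark output.

```python
def speedup_percent(old_rate, new_rate):
    """Relative change of new_rate versus old_rate, in percent."""
    return (new_rate - old_rate) / old_rate * 100.0


# Hypothetical visits/s keyed by thread count (placeholders, not real results).
old_run = {8: 1000.0, 16: 1500.0, 32: 1800.0}
new_run = {8: 990.0, 16: 1560.0, 32: 1998.0}

# Percentage change for each thread count present in the old run.
changes = {t: speedup_percent(old_run[t], new_run[t]) for t in sorted(old_run)}

for threads, pct in changes.items():
    print(f"{threads:>3} threads: {pct:+.1f}%")
```

Comparing at each thread count separately, rather than averaging everything into one number, makes it easier to see where an upgrade helps and where it does not.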