lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

v1.13.1 TensorRT katago fails to use or generate trtcache when running genconfig #849

Open · Gaogao417 opened this issue 1 year ago

Gaogao417 commented 1 year ago

Commands:

katago_tensorRT\katago.exe genconfig -model "weights\kata1-b40c256-s11840935168-d2898845681.bin.gz" -output "katago_configs\v130_40b.cfg"

Outputs:

========================================================================= RULES

What rules should KataGo use by default for play and analysis? (chinese, japanese, korean, tromp-taylor, aga, chinese-ogs, new-zealand, bga, stone-scoring, aga-button): chinese

========================================================================= SEARCH LIMITS

When playing games, KataGo will always obey the time controls given by the GUI/tournament/match/online server. But you can specify an additional limit to make KataGo move much faster. This does NOT affect analysis/review, only affects playing games. Add a limit? (y/n) (default n):
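For reference, answering "y" here would write a hard search limit into the generated config; I left it blank, so none was added. A minimal sketch of what such a limit looks like in the .cfg, with purely illustrative values:

# Limit KataGo's search when playing games (does not affect analysis)
maxVisits = 500     # cap on visits per move
# maxTime = 10      # or alternatively, a cap in seconds per move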

NOTE: No limits configured for KataGo. KataGo will obey time controls provided by the GUI or server or match script but if they don't specify any, when playing games KataGo may think forever without moving. (press enter to continue)

When playing games, KataGo can optionally ponder during the opponent's turn. This gives faster/stronger play in real games but should NOT be enabled if you are running tests with fixed limits (pondering may exceed those limits), or to avoid stealing the opponent's compute time when testing two bots on the same machine. Enable pondering? (y/n, default n):
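For reference, enabling pondering flips one key in the generated config (it shows up later in the dump as ponderingEnabled = false). A sketch, with maxTimePondering as an optional cap that I believe the config also supports:

ponderingEnabled = true
# maxTimePondering = 60   # optional limit on pondering time in seconds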

========================================================================= GPUS AND RAM

Finding available GPU-like devices... Found GPU device 0: NVIDIA GeForce RTX 4060 Laptop GPU

Specify devices/GPUs to use (for example "0,1,2" to use devices 0, 1, and 2). Leave blank for a default SINGLE-GPU config: 0
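For reference, the device choice ends up in the config as trtDeviceToUseThread0 (visible in the dump further down). A hedged sketch of what a two-GPU setup would look like, assuming the usual TensorRT-backend key names:

numNNServerThreadsPerModel = 2   # one neural-net thread per GPU
trtDeviceToUseThread0 = 0        # first GPU
trtDeviceToUseThread1 = 1        # second GPU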

By default, KataGo will cache up to about 3GB of positions in memory (RAM), in addition to whatever the current search is using. Specify a different max in GB or leave blank for default:
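For reference, this prompt controls nnCacheSizePowerOfTwo in the generated config (the dump below shows the default of 20, i.e. 2^20 ≈ 1M cached positions; at roughly 3KB per entry, which is my rough estimate, that is where the "about 3GB" comes from). A sketch of how you might halve the cache:

nnCacheSizePowerOfTwo = 19       # 2^19 entries, about half the default RAM use
nnMutexPoolSizePowerOfTwo = 16   # can usually be left at its default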

========================================================================= PERFORMANCE TUNING

Specify number of visits to use test/tune performance with, leave blank for default based on GPU speed. Use large number for more accurate results, small if your GPU is old and this is taking forever:

Specify number of seconds/move to optimize performance for (default 5), leave blank for default:
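For reference, the same tuning can be re-run later without going through genconfig again, using the benchmark subcommand against an already-generated config. A sketch using the same paths as above (the -v flag for visits is from memory and may differ):

katago_tensorRT\katago.exe benchmark -model "weights\kata1-b40c256-s11840935168-d2898845681.bin.gz" -config "katago_configs\v130_40b.cfg" -v 800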

2023-10-17 15:23:57+0800: Running with following config: allowResignation = true friendlyPassOk = true hasButton = false koRule = SIMPLE lagBuffer = 1.0 logAllGTPCommunication = true logDir = gtp_logs logSearchInfo = true logToStderr = false multiStoneSuicideLegal = false nnCacheSizePowerOfTwo = 20 nnMutexPoolSizePowerOfTwo = 16 numNNServerThreadsPerModel = 1 numSearchThreads = 6 ponderingEnabled = false resignConsecTurns = 3 resignThreshold = -0.90 scoringRule = AREA searchFactorAfterOnePass = 0.50 searchFactorAfterTwoPass = 0.25 searchFactorWhenWinning = 0.40 searchFactorWhenWinningThreshold = 0.95 taxRule = NONE trtDeviceToUseThread0 = 0 whiteHandicapBonus = N

2023-10-17 15:23:57+0800: Loading model and initializing benchmark...

Running quick initial benchmark at 16 threads!
2023-10-17 15:23:57+0800: nnRandSeed0 = 286467653869866396
2023-10-17 15:23:57+0800: After dedups: nnModelFile0 = weights\kata1-b40c256-s11840935168-d2898845681.bin.gz useFP16 auto useNHWC auto
2023-10-17 15:23:57+0800: Initializing neural net buffer to be size 19 * 19 exactly
2023-10-17 15:23:59+0800: TensorRT backend thread 0: Found GPU NVIDIA GeForce RTX 4060 Laptop GPU memory 8585216000 compute capability major 8 minor 9
2023-10-17 15:23:59+0800: TensorRT backend thread 0: Initializing (may take a long time)

Harald-Han commented 11 months ago

I'm experiencing the same problem. I've confirmed that TensorRT is installed properly by running one of the sample programs included in its distribution zip file. FYI, the CUDA version of katago, v1.13.2, works fine on the same PC.

lightvector commented 11 months ago

There's a chance that your TensorRT and CUDA versions are too recent. I suspect the NVIDIA libraries are not always backwards compatible, and the precompiled exes are using older versions that you can see here: https://github.com/lightvector/KataGo/releases/tag/v1.13.1
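For anyone hitting this, a quick way to compare against that list (standard NVIDIA tools, assuming they are on your PATH):

nvidia-smi        (driver version and the highest CUDA version the driver supports)
nvcc --version    (installed CUDA toolkit version)

and check the file version of nvinfer.dll in your TensorRT directory against the TensorRT version listed on the v1.13.1 release page.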

Within the next few weeks there should be a minor release that updates a few other things, and I can upgrade the CUDA and TensorRT versions used at that point too.