popojan / goban

3D igo/baduk/weiqi/go game board and GUI for GnuGo and other GTP engines ray traced by GLSL shader
https://hraj.si/goban
GNU General Public License v3.0

18b katago weights can't be loaded. #45

Open luosonggu opened 3 months ago

luosonggu commented 3 months ago

I have tried both the CUDA and the OpenCL engine, and both produce these messages:

[2024-08-26 19:24:07.457] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.463] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.466] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:15.135] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.137] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.139] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.141] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.143] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.146] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.148] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.151] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.153] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.155] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.158] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.161] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.163] [multi_sink] [error] setting boardsize failed

popojan commented 3 months ago

Thank you for reporting. The error message is indeed not very informative. I can reproduce this error when katago cannot find the weights file, which is not distributed with the application. Take, for example, this sample config, which is disabled by default:

{
  "name": "Katago #kata9x9 b18",
  "path": "./engine/katago",
  "command": "katago",
  "parameters": "gtp -model ./engine/katago/kata9x9-b18c384nbt-20231025.bin.gz -config ./engine/katago/default_gtp.cfg",
  "enabled": 0,
  "kibitz": 0,
  "messages": [
    {
      "regex": "^:\\s+T.*--\\s*([A-Z0-9]+)",
      "output": "$1",
      "var": "$primaryMove"
    },
    {
      "regex": "^$primaryMove.*(W\\s+[^\\s]+).*\\(\\s*([^\\s]+\\s+L)",
      "output": "$1 $2"
    },
    {
      "regex": "Controller:",
      "output": " "
    }
  ]
},

The file kata9x9-b18c384nbt-20231025.bin.gz is expected in the same directory as the katago executable. You can freely edit this config file, as briefly described in wiki#bots. You might also want to edit default_gtp.cfg to enable logging to stderr, because goban parses those logs with the predefined regexes to show katago's move ranking.
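To illustrate how the two "messages" regexes above chain together, here is a minimal Python sketch. It is not goban's actual code: the `$primaryMove` substitution is an assumption about how the `var` field is used, and the two sample lines are hypothetical, only shaped like katago's search summary on stderr.

```python
import re

# Patterns copied from the "messages" section of the sample config.
ROOT_PATTERN = r"^:\s+T.*--\s*([A-Z0-9]+)"
MOVE_PATTERN = r"^$primaryMove.*(W\s+[^\s]+).*\(\s*([^\s]+\s+L)"


def summarize(lines):
    """Return the "$1 $2" output for the engine's preferred move, or None."""
    primary_move = None
    for line in lines:
        if primary_move is None:
            match = re.search(ROOT_PATTERN, line)
            if match:
                # Stored the way "var": "$primaryMove" suggests.
                primary_move = match.group(1)
            continue
        # Assumption: the stored variable is substituted into later patterns.
        pattern = MOVE_PATTERN.replace("$primaryMove", re.escape(primary_move))
        match = re.search(pattern, line)
        if match:
            return f"{match.group(1)} {match.group(2)}"
    return None


# Hypothetical stderr lines, for illustration only.
SAMPLE = [
    ": T 9.03c W 8.05c S 0.98c ( +1.4 L +1.3) N 100 -- Q16 D4",
    "Q16 : T 8.95c W 8.01c S 0.94c ( +1.3 L +1.2) N 50 -- Q16 D4",
]

print(summarize(SAMPLE))
```

The first pattern picks the first move after `--` on the root line. The second then looks for the line that starts with that move and extracts its `W` field and the token in front of `L`. If katago does not log to stderr, neither pattern has anything to match.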

If all this seems correct, please try to run katago as a standalone app using the same parameters as in the config file, or copy & paste the engine configuration here. Thank you.

luosonggu commented 3 months ago

I've found the reason. My graphics card is an NVIDIA GT 730 with driver version 474. I tested another machine with a 1080 Ti card, and goban runs fine there. Can you check the code and make goban support old cards? In other GUIs such as Sabaki, the GT 730 can load 18b weights.

popojan commented 3 months ago

I see, might be a bug, could you please suggest:

I may need to reproduce the problem.

luosonggu commented 3 months ago

To reproduce the problem, you need an old NVIDIA card (driver version 474 or below). I have tried many 18b weights and some weights trained by lionfenfen, so I'm sure that 9x9 weights are not supported by goban. Some people have encountered the same problem, and someone opened an issue in lightvector's KataGo repository and got no answer.

luosonggu commented 3 months ago

https://github.com/lightvector/KataGo/issues/924

luosonggu commented 3 months ago

Every version of goban has the same problem. I guess the problem is caused by the NVIDIA driver. An old graphics card can't upgrade its driver to the 5xx series. On my 1080 Ti with driver version 5xx, everything is fine.

luosonggu commented 3 months ago

But with the NVIDIA 474 driver, Sabaki can load 18b weights smoothly. So the problem may be that the new 9x9 weights need new GUI code to meet the demands of the old NVIDIA driver.

popojan commented 3 months ago

We need to address the errors reported by katago, if any. To me it seems it's lightvector who did not get answers in the linked issue.

You may run goban.exe -v debug to get more information in the last_run.log. If katago is configured to log into stderr, the stderr output should be included as well.
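A debug log gets long quickly, so a small filter can help before pasting it here. The following is a rough sketch, not part of goban. It assumes the entry layout seen in the excerpts in this thread, `[timestamp] [multi_sink] [level] message`, and the function names are made up for illustration.

```python
import re

# Entry layout assumed from the excerpts in this thread:
# [YYYY-MM-DD HH:MM:SS.mmm] [multi_sink] [level] message
ENTRY = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s*"
    r"\[multi_sink\]\s*\[(\w+)\]\s*"
)


def parse_log(text):
    """Return (timestamp, level, message) tuples, even if entries share a line."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # A message runs until the next entry header (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = text[match.end():end].strip()
        entries.append((match.group(1), match.group(2), message))
    return entries


def interesting(entries):
    """Keep errors plus anything the engine wrote to stderr ("gtp err = ...")."""
    return [e for e in entries if e[1] == "error" or e[2].startswith("gtp err")]


# Usage, assuming last_run.log sits in the current directory:
#
#     with open("last_run.log", encoding="utf-8", errors="replace") as f:
#         for timestamp, level, message in interesting(parse_log(f.read())):
#             print(timestamp, level, message)
```

Splitting on the entry header rather than on newlines means the filter also works on a log that lost its line breaks when it was pasted.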

I am sorry I cannot test with the hardware mentioned in the near future.

lj739 commented 3 months ago

[2024-08-28 08:41:27.559] [multi_sink] [debug] Loaded font face Lacuna Regular Regular (from byte stream).
[2024-08-28 08:41:27.587] [multi_sink] [debug] Loaded font face Lacuna Italic Regular (from byte stream).
[2024-08-28 08:41:27.597] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-28 08:41:27.605] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-28 08:41:27.612] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-28 08:41:27.617] [multi_sink] [info] Loading font file [./data/fonts/Delicious-Roman.otf]
[2024-08-28 08:41:27.622] [multi_sink] [debug] Loaded font face Delicious Roman (from ./data/fonts/Delicious-Roman.otf).
[2024-08-28 08:41:27.625] [multi_sink] [info] Loading font file [./data/fonts/Delicious-Bold.otf]
[2024-08-28 08:41:27.630] [multi_sink] [debug] Loaded font face Delicious Bold (from ./data/fonts/Delicious-Bold.otf).
[2024-08-28 08:41:27.653] [multi_sink] [info] Preloading sounds...
[2024-08-28 08:41:27.708] [multi_sink] [info] Loading font file [./data/fonts/default-font.ttf]
[2024-08-28 08:41:27.712] [multi_sink] [debug] Creating overlay buffer[0]
[2024-08-28 08:41:27.713] [multi_sink] [debug] Adding text glyphs[0]
[2024-08-28 08:41:27.718] [multi_sink] [debug] gid 19: endpoints 38; err 50; tex fetch 3.2; mem 2.9kb
[2024-08-28 08:41:27.719] [multi_sink] [debug] gid 20: endpoints 7; err 0; tex fetch 1.6; mem 1.2kb
[2024-08-28 08:41:27.721] [multi_sink] [debug] gid 21: endpoints 30; err 95; tex fetch 2.8; mem 2.6kb
[2024-08-28 08:41:27.725] [multi_sink] [debug] gid 22: endpoints 50; err 76; tex fetch 3.7; mem 3.4kb
[2024-08-28 08:41:27.727] [multi_sink] [debug] gid 23: endpoints 18; err 0; tex fetch 1.9; mem 2.2kb
[2024-08-28 08:41:27.733] [multi_sink] [debug] gid 24: endpoints 38; err 85; tex fetch 3.3; mem 2.9kb
[2024-08-28 08:41:27.740] [multi_sink] [debug] gid 25: endpoints 46; err 69; tex fetch 3.6; mem 3.3kb
[2024-08-28 08:41:27.745] [multi_sink] [debug] gid 26: endpoints 16; err 61; tex fetch 2.3; mem 2.3kb
[2024-08-28 08:41:27.753] [multi_sink] [debug] gid 27: endpoints 52; err 99; tex fetch 3.9; mem 3.4kb
[2024-08-28 08:41:27.760] [multi_sink] [debug] gid 28: endpoints 47; err 51; tex fetch 3.6; mem 3.2kb
[2024-08-28 08:41:27.778] [multi_sink] [debug] gid 59: endpoints 13; err 0; tex fetch 1.8; mem 2.6kb
[2024-08-28 08:41:27.778] [multi_sink] [debug] Creating overlay buffer[1]
[2024-08-28 08:41:27.779] [multi_sink] [debug] Adding text glyphs[1]
[2024-08-28 08:41:27.780] [multi_sink] [debug] Creating overlay buffer[2]
[2024-08-28 08:41:27.780] [multi_sink] [debug] Adding text glyphs[2]
[2024-08-28 08:41:27.780] [multi_sink] [debug] 11 glyphs; avg num endpoints 32.27; avg error 53.1;avg tex fetch 2.89; avg 2.73kb per glyph
[2024-08-28 08:41:27.781] [multi_sink] [debug] sound ./data/sound/collision.wav frame count 22050
[2024-08-28 08:41:27.782] [multi_sink] [debug] sound ./data/sound/stone.wav frame count 5762
[2024-08-28 08:41:28.191] [multi_sink] [debug] setting gamma = 1.0
[2024-08-28 08:41:28.192] [multi_sink] [debug] setting contrast = 0.0
[2024-08-28 08:41:28.193] [multi_sink] [info] Starting GTP client [./engine/gnugo/gnugo]
[2024-08-28 08:41:28.252] [multi_sink] [info] About to run GTP engine [./engine/gnugo/gnugo.exe]
[2024-08-28 08:41:28.253] [multi_sink] [info] running child [./engine/gnugo/gnugo.exe --mode gtp --japanese-rules]
[2024-08-28 08:41:28.264] [multi_sink] [info] Setting [GNU Go 3.8] engine as coach and referee.
[2024-08-28 08:41:28.264] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:28.265] [multi_sink] [info] Starting GTP client [./engine/katago/katago]
[2024-08-28 08:41:28.325] [multi_sink] [info] About to run GTP engine [./engine/katago/katago.exe]
[2024-08-28 08:41:28.326] [multi_sink] [info] running child [./engine/katago/katago.exe gtp -model ./engine/katago/b18c384nbt-optimisticv13-s5971M.bin.gz -config ./engine/katago/default_gtp.cfg]
[2024-08-28 08:41:28.336] [multi_sink] [info] Setting [Katago #kata9x9 b18] engine as trusted kibitz.
[2024-08-28 08:41:28.337] [multi_sink] [debug] Player[1] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:28.338] [multi_sink] [info] gnugo << boardsize 19
[2024-08-28 08:41:28.339] [multi_sink] [info] getting response...
[2024-08-28 08:41:28.343] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:28.343] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:28.344] [multi_sink] [info] gnugo << clear_board
[2024-08-28 08:41:28.345] [multi_sink] [info] getting response...
[2024-08-28 08:41:28.346] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:28.347] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:28.348] [multi_sink] [info] katago << boardsize 19
[2024-08-28 08:41:28.349] [multi_sink] [info] getting response...
[2024-08-28 08:41:28.665] [multi_sink] [debug] gtp err = KataGo v1.13.0
[2024-08-28 08:41:28.666] [multi_sink] [debug] gtp err = Using TrompTaylor rules initially, unless GTP/GUI overrides this
[2024-08-28 08:41:35.005] [multi_sink] [info] katago << clear_board
[2024-08-28 08:41:35.006] [multi_sink] [info] getting response...
[2024-08-28 08:41:35.007] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = false, white = false]
[2024-08-28 08:41:35.008] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = true, white = false]
[2024-08-28 08:41:35.009] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:35.010] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = true]
[2024-08-28 08:41:35.064] [multi_sink] [debug] gtp err =
[2024-08-28 08:41:35.326] [multi_sink] [debug] Load
[2024-08-28 08:41:35.330] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = false, white = false]
[2024-08-28 08:41:35.330] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = true, white = false]
[2024-08-28 08:41:35.332] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:35.332] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = true]
[2024-08-28 08:41:35.340] [multi_sink] [info] switching shader to #0
[2024-08-28 08:41:35.387] [multi_sink] [debug] FPS: 0.0
[2024-08-28 08:41:35.390] [multi_sink] [info] gnugo << boardsize 19
[2024-08-28 08:41:35.390] [multi_sink] [info] getting response...
[2024-08-28 08:41:35.391] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:35.392] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:35.393] [multi_sink] [info] gnugo << clear_board
[2024-08-28 08:41:35.394] [multi_sink] [info] getting response...
[2024-08-28 08:41:35.395] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:35.396] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:35.397] [multi_sink] [info] katago << boardsize 19
[2024-08-28 08:41:35.398] [multi_sink] [info] getting response...

popojan commented 3 months ago

@lj739 Thank you. And that's all? Then goban quits or seemingly hangs?

lj739 commented 3 months ago

goban quits. These messages appear in last_run.log:

[2024-08-26 19:24:07.457] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.463] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.466] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:15.135] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.137] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.139] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.141] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.143] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.146] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.148] [multi_sink] [error] setting boardsize failed

popojan commented 3 months ago

The first log, from goban.exe -v debug, looks incomplete or truncated compared to the second one with default verbosity.

I could investigate further if you'd be so kind as to run katago standalone, i.e.

katago.exe gtp -model b18c384nbt-optimisticv13-s5971M.bin.gz -config default_gtp.cfg

let it genmove, and attach the logged output. Thank you in advance.

luosonggu commented 3 months ago

2024-08-28 22:36:16+0800: Running with following config:
allowResignation = true
defaultBoardSize = 19
lagBuffer = 1.0
logAllGTPCommunication = true
logDir = gtp_logs
logSearchInfo = true
logSearchInfoForChosenMove = true
logToStderr = true
maxTimePondering = 60.0
maxVisits = 100
numSearchThreads = 6
ponderingEnabled = false
resignConsecTurns = 3
resignThreshold = -0.999
rules = tromp-taylor
searchFactorAfterOnePass = 0.50
searchFactorAfterTwoPass = 0.25
searchFactorWhenWinning = 0.40
searchFactorWhenWinningThreshold = 0.95

2024-08-28 22:36:16+0800: GTP Engine starting...
2024-08-28 22:36:16+0800: KataGo v1.15.2
2024-08-28 22:36:16+0800: Using TrompTaylor rules initially, unless GTP/GUI overrides this
2024-08-28 22:36:16+0800: Using 6 CPU thread(s) for search
2024-08-28 22:36:16+0800: nnRandSeed0 = 8178244719051708548
2024-08-28 22:36:16+0800: After dedups: nnModelFile0 = kata1-b18c384nbt-s9996604416-d4316597426.bin.gz useFP16 auto useNHWC auto
2024-08-28 22:36:16+0800: Initializing neural net buffer to be size 19 * 19 exactly
2024-08-28 22:36:20+0800: Found OpenCL Platform 0: NVIDIA CUDA (NVIDIA Corporation) (OpenCL 3.0 CUDA 11.4.309)
2024-08-28 22:36:20+0800: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2024-08-28 22:36:20+0800: Found OpenCL Device 0: NVIDIA GeForce GT 730 (NVIDIA Corporation) (score 11000300)
2024-08-28 22:36:20+0800: Creating context for OpenCL Platform: NVIDIA CUDA (NVIDIA Corporation) (OpenCL 3.0 CUDA 11.4.309)
2024-08-28 22:36:20+0800: Using OpenCL Device 0: NVIDIA GeForce GT 730 (NVIDIA Corporation) OpenCL 3.0 CUDA (Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info)
2024-08-28 22:36:20+0800: Loaded tuning parameters from: D:\Goban2024a\engine\katagopencl/KataGoData/opencltuning/tune11_gpuNVIDIAGeForceGT730_x19_y19_c384_mv14.txt
2024-08-28 22:36:20+0800: OpenCL backend thread 0: Model version 14
2024-08-28 22:36:20+0800: OpenCL backend thread 0: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:36:27+0800: OpenCL backend thread 0: FP16Storage false FP16Compute false FP16TensorCores false FP16TensorCoresFor1x1 false
2024-08-28 22:36:27+0800: Loaded neural net with nnXLen 19 nnYLen 19
2024-08-28 22:36:27+0800: Initializing board with boardXSize 19 boardYSize 19
2024-08-28 22:36:27+0800: Loaded config default_gtp.cfg
2024-08-28 22:36:27+0800: Loaded model kata1-b18c384nbt-s9996604416-d4316597426.bin.gz
2024-08-28 22:36:27+0800: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:36:27+0800: GTP ready, beginning main protocol loop

luosonggu commented 3 months ago

PS D:\Goban2024\engine\katagocuda> .\katago.exe gtp -model kata1-b18c384nbt-s9996604416-d4316597426.bin.gz -config default_gtp.cfg
2024-08-28 22:46:29+0800: Running with following config:
allowResignation = true
cudaUseFP16 = false
cudaUseNHWC = false
lagBuffer = 1.0
logAllGTPCommunication = true
logDir = gtp_logs
logSearchInfo = true
logToStderr = true
maxTimePondering = 60.0
maxVisits = 50
numSearchThreads = 6
ponderingEnabled = false
resignConsecTurns = 3
resignThreshold = -0.999
rules = tromp-taylor
searchFactorAfterOnePass = 0.50
searchFactorAfterTwoPass = 0.25
searchFactorWhenWinning = 0.40
searchFactorWhenWinningThreshold = 0.95

2024-08-28 22:46:29+0800: GTP Engine starting...
2024-08-28 22:46:29+0800: KataGo v1.13.0
2024-08-28 22:46:29+0800: Using TrompTaylor rules initially, unless GTP/GUI overrides this
2024-08-28 22:46:29+0800: Using 6 CPU thread(s) for search
2024-08-28 22:46:30+0800: nnRandSeed0 = 16223772968998217411
2024-08-28 22:46:30+0800: After dedups: nnModelFile0 = kata1-b18c384nbt-s9996604416-d4316597426.bin.gz useFP16 false useNHWC false
2024-08-28 22:46:30+0800: Initializing neural net buffer to be size 19 * 19 exactly
2024-08-28 22:46:33+0800: Cuda backend thread 0: Found GPU NVIDIA GeForce GT 730 memory 1073741824 compute capability major 3 minor 5
2024-08-28 22:46:33+0800: Cuda backend thread 0: Model version 14 useFP16 = false useNHWC = false
2024-08-28 22:46:33+0800: Cuda backend thread 0: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:46:46+0800: Loaded neural net with nnXLen 19 nnYLen 19
2024-08-28 22:46:46+0800: Initializing board with boardXSize 19 boardYSize 19
2024-08-28 22:46:46+0800: Loaded config default_gtp.cfg
2024-08-28 22:46:46+0800: Loaded model kata1-b18c384nbt-s9996604416-d4316597426.bin.gz
2024-08-28 22:46:46+0800: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:46:46+0800: GTP ready, beginning main protocol loop

luosonggu commented 3 months ago

I don't know how to issue genmove in PowerShell.

popojan commented 3 months ago

@luosonggu Once GTP is ready, type the GTP command genmove B and press Enter. It is obvious by now that it would work, though, so these logs are enough for now. Thank you.
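For anyone who would rather script that check than type into the console, here is a minimal Python sketch of a GTP exchange over stdin/stdout. It relies only on the plain GTP framing (one command per line, a reply starting with `=` or `?` and ending with an empty line). The helper names `gtp_command` and `run_session` are made up for this example, and this is not how goban talks to the engine.

```python
import subprocess


def gtp_command(stdin, stdout, command):
    """Send one GTP command and return (ok, payload) from the reply."""
    stdin.write(command + "\n")
    stdin.flush()
    lines = []
    while True:
        raw = stdout.readline()
        if raw == "":              # engine closed its output
            break
        line = raw.rstrip("\r\n")
        if line == "" and lines:   # an empty line terminates the reply
            break
        if line:
            lines.append(line)
    text = "\n".join(lines)
    # "=" marks success, "?" marks failure; the rest is the payload.
    return text.startswith("="), text[1:].strip()


def run_session(argv, commands):
    """Start an engine, run the commands in order, then ask it to quit."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    # stderr is left attached to the console, so katago's log stays visible.
    replies = [gtp_command(proc.stdin, proc.stdout, c) for c in commands]
    gtp_command(proc.stdin, proc.stdout, "quit")
    proc.wait()
    return replies


# Example call, run from the katago directory with the command quoted earlier:
#
#     run_session(
#         ["katago.exe", "gtp",
#          "-model", "b18c384nbt-optimisticv13-s5971M.bin.gz",
#          "-config", "default_gtp.cfg"],
#         ["boardsize 19", "clear_board", "genmove B"],
#     )
```

Each reply comes back as a pair such as `(True, "Q16")`, so a failing `boardsize` would show up as `False` together with the engine's own error text.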