online-go / gtp2ogs

GTP Wrapper to allow bots to interface with the Online-Go.com Server
MIT License
88 stars 28 forks source link

weird phoenixgo exit and timeout at move 3 (reproductible) #190

Closed wonderingabout closed 5 years ago

wonderingabout commented 5 years ago

no idea why it happened, but i am doing other things, however here is the log and game link

https://online-go.com/game/17708541

using this branch : https://github.com/wonderingabout/gtp2ogs/tree/windo-crashloop-import

May 03 20:50:50   connection.js:1391             #  Connecting to https://online-go.com:443
May 03 20:50:50   connection.js:1391             #  Connected
May 03 20:50:50   connection.js:1391             #  Bot is username: meta-金毛测试-20b
May 03 20:50:50   connection.js:1391             #  Bot is user id: 592558
May 03 20:51:24   connection.js:1391             #  active_game: {"id":17708541,"phase":"play","name":"10000 vs 3200 test","player_to_move":592558,"width":19,"height":19,"move_number":0,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 20:51:24   game.js:490                    [Game 17708541]            Connecting to game.
May 03 20:51:24   game.js:490                    [Game 17708541]            gamedata       B RoyalZero  [19x19]    
May 03 20:51:24   game.js:490                    [Game 17708541]            makeMove 0 is 0 !== 0 ?
May 03 20:51:24   bot.js:214                     [23183] Starting  /home/amd2019/PhoenixGo/bazel-bin/mcts/mcts_main --gtp --config_path=/home/amd2019/PhoenixGo/etc/mcts_1gpu.conf --v=0
May 03 20:51:24   game.js:490                    [Game 17708541]            Starting new bot process [23183]
May 03 20:51:24   game.js:490                    [Game 17708541]            State loading for new bot
May 03 20:51:24   bot.js:214                     [23183] >>> boardsize 19
May 03 20:51:24   bot.js:214                     [23183] >>> clear_board
May 03 20:51:24   bot.js:214                     [23183] >>> komi 7.5
May 03 20:51:24   bot.js:214                     [23183] Generating move for game 17708541
May 03 20:51:24   game.js:490                    [Game 17708541]            genmove black
May 03 20:51:24   bot.js:214                     [23183] >>> genmove black
May 03 20:51:24   game.js:490                    [Game 17708541]            clock: {"game_id":17708541,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909484753,"expiration":1556909784753,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"start_mode":true,"now":1556909484919}
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.291253: I model/zero_model.cc:72] Read checkpoint state succ
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.311571: I model/zero_model.cc:80] Read meta graph succ
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.483924: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.484465: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7335
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: pciBusID: 0000:1c:00.0
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: totalMemory: 5.93GiB freeMemory: 3.76GiB
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.484483: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
May 03 20:51:25   game.js:490                    [Game 17708541]            gamedata       B RoyalZero  [19x19]    
May 03 20:51:25   game.js:490                    [Game 17708541]            makeMove 0 is 0 !== 0 ?
May 03 20:51:25   bot.js:214                     [23183] Generating move for game 17708541
May 03 20:51:25   game.js:490                    [Game 17708541]            genmove black
May 03 20:51:25   bot.js:214                     [23183] >>> genmove black
May 03 20:51:25   game.js:490                    [Game 17708541]            clock: {"game_id":17708541,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909484753,"expiration":1556909784753,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"start_mode":true,"now":1556909485564}
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.730459: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.730493: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.730500: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.731904: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3035 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:1c:00.0, compute capability: 6.1)
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.733404: I model/zero_model.cc:100] Create session succ
May 03 20:51:25 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:25.790189: I model/zero_model.cc:107] Create graph succ
May 03 20:51:27 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:27.128160: I model/zero_model.cc:119] Load checkpoint succ
May 03 20:51:27 !!!!! bot.js:222                     [23183] stderr: 2019-05-03 20:51:27.457016: I external/org_tensorflow/tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
May 03 20:51:29   bot.js:214                     [23183] <<< =
May 03 20:51:29   bot.js:214                     [23183] <<< =
May 03 20:51:29   bot.js:214                     [23183] <<< =
May 03 20:51:50   connection.js:1391             #  active_game: {"id":17638626,"phase":"finished","name":"3200 vs 3200 gtp2ogs-v.507","player_to_move":558144,"width":19,"height":19,"move_number":242,"paused":0,"private":false,"black":{"username":"RoyalMinigo","id":558144,"rank":24,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":90.38563018383769,"rating":2687.544228896897,"volatility":0.05933087600868436,"games_played":200}}},"white":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"time_per_move":31}
May 03 20:51:51   connection.js:1391             #  active_game: {"id":17626084,"phase":"finished","name":"6400 vs 6400","player_to_move":625990,"width":19,"height":19,"move_number":176,"paused":0,"private":false,"black":{"username":"20bTurboLz-Elf-v1","id":625990,"rank":21,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":89.68976739844692,"rating":2816.946725891752,"volatility":0.060383919861478806,"games_played":50}}},"white":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"time_per_move":31}
May 03 20:51:53   connection.js:1391             #  active_game: {"id":17625553,"phase":"finished","name":"6400 vs 6400","player_to_move":592558,"width":19,"height":19,"move_number":292,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"20bTurboLz-Elf-v1","id":625990,"rank":21,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":89.68976739844692,"rating":2816.946725891752,"volatility":0.060383919861478806,"games_played":50}}},"time_per_move":31}
May 03 20:51:57   connection.js:1391             #  active_game: {"id":17611078,"phase":"finished","name":"3200 vs 3200","player_to_move":592558,"width":19,"height":19,"move_number":177,"paused":0,"private":false,"black":{"username":"RoyalLZ-ELF","id":533729,"rank":24,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":76.18708735382506,"rating":2668.192588089323,"volatility":0.06838573425204215,"games_played":1427}}},"white":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"time_per_move":31}
May 03 20:51:58   connection.js:1391             #  active_game: {"id":17610773,"phase":"finished","name":"3200 vs 3200","player_to_move":558144,"width":19,"height":19,"move_number":176,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalMinigo","id":558144,"rank":24,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":90.38563018383769,"rating":2687.544228896897,"volatility":0.05933087600868436,"games_played":200}}},"time_per_move":31}
May 03 20:52:13 !!!!! bot.js:222                     [23183] stderr: 1th move(b): pp, winrate=44.076649%, N=933, Q=-0.118467, p=0.080226, v=-0.116534, cost 44396.992188ms, sims=10032, height=12, avg_height=6.043810, global_step=639200
May 03 20:52:13   bot.js:214                     [23183] <<< = Q16
May 03 20:52:13   game.js:490                    [Game 17708541]            Playing q16 { x: 15, y: 3, text: 'q16', resign: false, pass: false }
May 03 20:52:13   game.js:490                    [Game 17708541]            clock: {"game_id":17708541,"current_player":484523,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909533800,"expiration":1556909833800,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556909533801}
May 03 20:52:13   game.js:490                    [Game 17708541]            game/17708541/move: { game_id: 17708541, move_number: 1, move: [ 15, 3, 49800 ] }
May 03 20:52:13   game.js:490                    [Game 17708541]  Move 1    Ignoring our own move 1
May 03 20:52:13   connection.js:1391             #  active_game: {"id":17708541,"phase":"play","name":"10000 vs 3200 test","player_to_move":484523,"width":19,"height":19,"move_number":1,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 20:52:18   game.js:490                    [Game 17708541]  Move 1    clock: {"game_id":17708541,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909538695,"expiration":1556909838695,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556909538696}
May 03 20:52:18   game.js:490                    [Game 17708541]  Move 1    game/17708541/move: { game_id: 17708541, move_number: 2, move: [ 3, 15, 4895 ] }
May 03 20:52:18   game.js:490                    [Game 17708541]  Move 2    Got     d4
May 03 20:52:18   bot.js:214                     [23183] Calling sendMove with d4
May 03 20:52:18   bot.js:214                     [23183] >>> play white d4
May 03 20:52:18   game.js:490                    [Game 17708541]  Move 2    makeMove 2 is 2 !== 2 ?
May 03 20:52:18   bot.js:214                     [23183] Generating move for game 17708541
May 03 20:52:18   game.js:490                    [Game 17708541]  Move 2    genmove black
May 03 20:52:18   bot.js:214                     [23183] >>> genmove black
May 03 20:52:18   connection.js:1391             #  active_game: {"id":17708541,"phase":"play","name":"10000 vs 3200 test","player_to_move":592558,"width":19,"height":19,"move_number":2,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 20:52:18   connection.js:1391             #  Connected to game 17708541 already
May 03 20:52:57 !!!!! bot.js:222                     [23183] stderr: 3th move(b): dd, winrate=84.019028%, N=1305, Q=0.680381, p=0.093162, v=0.601550, cost 43920.070312ms, sims=10032, height=11, avg_height=5.028920, global_step=639200
May 03 20:52:57   bot.js:214                     [23183] <<< = D4
May 03 20:52:57   game.js:490                    [Game 17708541]  Move 2    Playing d4 { x: 3, y: 15, text: 'd4', resign: false, pass: false }
May 03 20:52:57 !!!!! bot.js:222                     [23183] stderr: F0503 20:52:57.935942 23183 mcts_engine.cc:154] Check failed: ret == 0 (-1 vs. 0) Move: failed, dd, ret-1
May 03 20:52:57 !!!!! bot.js:222                     [23183] stderr: *** Check failure stack trace: ***
May 03 20:52:57 !!!!! bot.js:222                     [23183] stderr: *** Aborted at 1556909577 (unix time) try "date -d @1556909577" if you are using GNU date ***
May 03 20:52:57 !!!!! bot.js:222                     [23183] stderr: PC: @                0x0 (unknown)
May 03 20:52:58   bot.js:214                     [23183] Bot exited
May 03 20:57:18   game.js:490                    [Game 17708541]  Move 2    gamedata       B RoyalZero  [19x19]    
May 03 20:57:18   game.js:490                    [Game 17708541]  Move 2    Game over.   Result: W+Time      L
May 03 20:57:18   bot.js:214                     [23183] Stopping bot
May 03 20:57:18   game.js:463                    Starting disconnect Timeout in Game 17708541 gameOver()
May 03 20:57:18   game.js:490                    [Game 17708541]  Move 2    clock: {"game_id":17708541,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909538695,"expiration":1556909838695,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556909838837}
May 03 20:57:18   connection.js:1391             #  active_game: {"id":17708541,"phase":"finished","name":"10000 vs 3200 test","player_to_move":592558,"width":19,"height":19,"move_number":2,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 20:57:18   connection.js:1391             #  17708541 active_game phase === finished
May 03 20:57:19   connection.js:1391             #  disconnectFromGame 17708541
May 03 20:57:19   game.js:490                    [Game 17708541]  Move 2    Disconnecting from game.
May 03 20:57:23   bot.js:214                     [23183] Killing process directly with a signal
wonderingabout commented 5 years ago

wow its actually reproductible, happened again in this game

https://online-go.com/game/17708647

had to manually restart to fix

log of game 2 crash

May 03 20:59:36   game.js:490                    [Game 17708647]            Connecting to game.
May 03 20:59:37   game.js:490                    [Game 17708647]            gamedata       B RoyalZero  [19x19]    
May 03 20:59:37   game.js:490                    [Game 17708647]            makeMove 0 is 0 !== 0 ?
May 03 20:59:37   bot.js:214                     [25704] Starting  /home/amd2019/PhoenixGo/bazel-bin/mcts/mcts_main --gtp --config_path=/home/amd2019/PhoenixGo/etc/mcts_1gpu.conf --v=0
May 03 20:59:37   game.js:490                    [Game 17708647]            Starting new bot process [25704]
May 03 20:59:37   game.js:490                    [Game 17708647]            State loading for new bot
May 03 20:59:37   bot.js:214                     [25704] >>> boardsize 19
May 03 20:59:37   bot.js:214                     [25704] >>> clear_board
May 03 20:59:37   bot.js:214                     [25704] >>> komi 7.5
May 03 20:59:37   bot.js:214                     [25704] Generating move for game 17708647
May 03 20:59:37   game.js:490                    [Game 17708647]            genmove black
May 03 20:59:37   bot.js:214                     [25704] >>> genmove black
May 03 20:59:37   game.js:490                    [Game 17708647]            clock: {"game_id":17708647,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909976759,"expiration":1556910276759,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"start_mode":true,"now":1556909976959}
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.075422: I model/zero_model.cc:72] Read checkpoint state succ
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.093378: I model/zero_model.cc:80] Read meta graph succ
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.212965: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.213570: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7335
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: pciBusID: 0000:1c:00.0
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: totalMemory: 5.93GiB freeMemory: 3.68GiB
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.213585: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.306520: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.306557: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.306563: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.306770: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3035 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:1c:00.0, compute capability: 6.1)
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.307806: I model/zero_model.cc:100] Create session succ
May 03 20:59:37 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:37.356809: I model/zero_model.cc:107] Create graph succ
May 03 20:59:37   game.js:490                    [Game 17708647]            gamedata       B RoyalZero  [19x19]    
May 03 20:59:37   game.js:490                    [Game 17708647]            makeMove 0 is 0 !== 0 ?
May 03 20:59:37   bot.js:214                     [25704] Generating move for game 17708647
May 03 20:59:37   game.js:490                    [Game 17708647]            genmove black
May 03 20:59:37   bot.js:214                     [25704] >>> genmove black
May 03 20:59:37   game.js:490                    [Game 17708647]            clock: {"game_id":17708647,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556909976759,"expiration":1556910276759,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"start_mode":true,"now":1556909977613}
May 03 20:59:38 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:38.145466: I model/zero_model.cc:119] Load checkpoint succ
May 03 20:59:38 !!!!! bot.js:222                     [25704] stderr: 2019-05-03 20:59:38.459308: I external/org_tensorflow/tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
May 03 20:59:39   bot.js:214                     [25704] <<< =
May 03 20:59:39   bot.js:214                     [25704] <<< =
May 03 20:59:39   bot.js:214                     [25704] <<< =
May 03 21:00:21 !!!!! bot.js:222                     [25704] stderr: 1th move(b): pd, winrate=44.063831%, N=924, Q=-0.118723, p=0.080226, v=-0.113951, cost 42105.917969ms, sims=10032, height=12, avg_height=6.066105, global_step=639200
May 03 21:00:21   bot.js:214                     [25704] <<< = Q4
May 03 21:00:21   game.js:490                    [Game 17708647]            Playing q4 { x: 15, y: 15, text: 'q4', resign: false, pass: false }
May 03 21:00:21   connection.js:1391             #  active_game: {"id":17708647,"phase":"play","name":"10000 vs 3200 test","player_to_move":484523,"width":19,"height":19,"move_number":1,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 21:00:21   game.js:490                    [Game 17708647]            clock: {"game_id":17708647,"current_player":484523,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556910021794,"expiration":1556910321794,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556910021795}
May 03 21:00:21   game.js:490                    [Game 17708647]            game/17708647/move: { game_id: 17708647, move_number: 1, move: [ 15, 15, 45794 ] }
May 03 21:00:21   game.js:490                    [Game 17708647]  Move 1    Ignoring our own move 1
May 03 21:00:26   connection.js:1391             #  active_game: {"id":17708647,"phase":"play","name":"10000 vs 3200 test","player_to_move":592558,"width":19,"height":19,"move_number":2,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 21:00:26   connection.js:1391             #  Connected to game 17708647 already
May 03 21:00:26   game.js:490                    [Game 17708647]  Move 1    clock: {"game_id":17708647,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556910026765,"expiration":1556910326765,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556910026766}
May 03 21:00:26   game.js:490                    [Game 17708647]  Move 1    game/17708647/move: { game_id: 17708647, move_number: 2, move: [ 3, 3, 4970 ] }
May 03 21:00:26   game.js:490                    [Game 17708647]  Move 2    Got     d16
May 03 21:00:26   bot.js:214                     [25704] Calling sendMove with d16
May 03 21:00:26   bot.js:214                     [25704] >>> play white d16
May 03 21:00:26   game.js:490                    [Game 17708647]  Move 2    makeMove 2 is 2 !== 2 ?
May 03 21:00:26   bot.js:214                     [25704] Generating move for game 17708647
May 03 21:00:26   game.js:490                    [Game 17708647]  Move 2    genmove black
May 03 21:00:26   bot.js:214                     [25704] >>> genmove black
May 03 21:01:02 !!!!! bot.js:222                     [25704] stderr: 3th move(b): dp, winrate=83.961624%, N=1261, Q=0.679232, p=0.093162, v=0.627675, cost 40225.292969ms, sims=10032, height=11, avg_height=5.029628, global_step=639200
May 03 21:01:02   bot.js:214                     [25704] <<< = D16
May 03 21:01:02   game.js:490                    [Game 17708647]  Move 2    Playing d16 { x: 3, y: 3, text: 'd16', resign: false, pass: false }
May 03 21:01:02 !!!!! bot.js:222                     [25704] stderr: F0503 21:01:02.230954 25704 mcts_engine.cc:154] Check failed: ret == 0 (-1 vs. 0) Move: failed, dp, ret-1
May 03 21:01:02 !!!!! bot.js:222                     [25704] stderr: *** Check failure stack trace: ***
May 03 21:01:02 !!!!! bot.js:222                     [25704] stderr: *** Aborted at 1556910062 (unix time) try "date -d @1556910062" if you are using GNU date ***
May 03 21:01:02 !!!!! bot.js:222                     [25704] stderr: PC: @                0x0 (unknown)
May 03 21:01:02   bot.js:214                     [25704] Bot exited
^C
amd2019@amd2019-MS-7B86:~/PhoenixGo/etc$ sudo nvidia-smi -pm 1 && sudo nvidia-smi -pl 75 && nvidia-smi -i 0 -q -d POWER,CLOCK && cd ~/PhoenixGo/etc && sudo nano mcts_1gpu.conf && sudo node /usr/lib/node_modules/gtp2ogs/gtp2ogs.js  --username meta-金毛测试-20b --debug --maxconnectedgamesperuser 1 --maxconnectedgames 1 --boardsizes 19 --noclock --persist --nopause --fakerank 7d --maxhandicap 0 --komis 7.5,automatic --minrank 4k --unrankedonly --timecontrols fischer,byoyomi --speeds live --minmaintimelive 60 --maxmaintimelive 600 --minperiodtimelive 30 --maxperiodtimelive 180 --maxperiodslive 5 --greeting "hi, now using enable background search (pondering) for faster play, and the new tensorflow 1.13.1 batch size 16 for faster play, this bot uses the original phoenixgo engine provided by tencent https://github.com/Tencent/PhoenixGo 20 blocks version , not leela zero, have fun :)" --rejectnew --rejectnewmsg "doing some testing for today, see you next time :) " --farewell "thank you for the game, hope you enjoyed and/or learned :) there is victory in defeat :) if you want my bot to play a custom opening position against any bot, send it to me here (forked games) https://online-go.com/review/363869" -- /home/amd2019/PhoenixGo/bazel-bin/mcts/mcts_main --gtp --config_path=/home/amd2019/PhoenixGo/etc/mcts_1gpu.conf --v=0
Persistence mode is already Enabled for GPU 00000000:1C:00.0.
All done.
Power limit for GPU 00000000:1C:00.0 was set to 75.00 W from 75.00 W.
All done.

==============NVSMI LOG==============

Timestamp                           : Fri May  3 21:03:40 2019
Driver Version                      : 410.104
CUDA Version                        : 10.0

Attached GPUs                       : 1
GPU 00000000:1C:00.0
    Power Readings
        Power Management            : Supported
        Power Draw                  : 28.54 W
        Power Limit                 : 75.00 W
        Default Power Limit         : 120.00 W
        Enforced Power Limit        : 75.00 W
        Min Power Limit             : 60.00 W
        Max Power Limit             : 140.00 W
    Power Samples
        Duration                    : 2.37 sec
        Number of Samples           : 119
        Max                         : 28.93 W
        Min                         : 28.25 W
        Avg                         : 28.45 W
    Clocks
        Graphics                    : 1518 MHz
        SM                          : 1518 MHz
        Memory                      : 4006 MHz
        Video                       : 1366 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1923 MHz
        SM                          : 1923 MHz
        Memory                      : 4004 MHz
        Video                       : 1708 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    SM Clock Samples
        Duration                    : 127.73 sec
        Number of Samples           : 100
        Max                         : 1835 MHz
        Min                         : 139 MHz
        Avg                         : 1264 MHz
    Memory Clock Samples
        Duration                    : 127.73 sec
        Number of Samples           : 100
        Max                         : 4006 MHz
        Min                         : 405 MHz
        Avg                         : 3045 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A

You are using gtp2ogs version 6.0
- For changelog or latest devel updates, please visit https://github.com/online-go/gtp2ogs/tree/devel

May 03 21:03:42   connection.js:1391             #  Connecting to https://online-go.com:443
May 03 21:03:42   connection.js:1391             #  Connected
May 03 21:03:42   connection.js:1391             #  Bot is username: meta-金毛测试-20b
May 03 21:03:42   connection.js:1391             #  Bot is user id: 592558
May 03 21:03:43   connection.js:1391             #  active_game: {"id":17708647,"phase":"play","name":"10000 vs 3200 test","player_to_move":592558,"width":19,"height":19,"move_number":2,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 21:03:43   game.js:490                    [Game 17708647]            Connecting to game.
May 03 21:03:43   game.js:490                    [Game 17708647]  Move 2    gamedata       B RoyalZero  [19x19]    
May 03 21:03:43   game.js:490                    [Game 17708647]  Move 2    makeMove 2 is 2 !== 2 ?
May 03 21:03:43   bot.js:214                     [27738] Starting  /home/amd2019/PhoenixGo/bazel-bin/mcts/mcts_main --gtp --config_path=/home/amd2019/PhoenixGo/etc/mcts_1gpu.conf --v=0
May 03 21:03:43   game.js:490                    [Game 17708647]  Move 2    Starting new bot process [27738]
May 03 21:03:43   game.js:490                    [Game 17708647]  Move 2    State loading for new bot
May 03 21:03:43   bot.js:214                     [27738] >>> boardsize 19
May 03 21:03:43   bot.js:214                     [27738] >>> clear_board
May 03 21:03:43   bot.js:214                     [27738] >>> komi 7.5
May 03 21:03:43   bot.js:214                     [27738] >>> play black q4
May 03 21:03:43   bot.js:214                     [27738] >>> play white d16
May 03 21:03:43   bot.js:214                     [27738] Generating move for game 17708647
May 03 21:03:43   game.js:490                    [Game 17708647]  Move 2    genmove black
May 03 21:03:43   bot.js:214                     [27738] >>> genmove black
May 03 21:03:43   game.js:490                    [Game 17708647]  Move 2    clock: {"game_id":17708647,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556910026765,"expiration":1556910326765,"black_time":{"thinking_time":300,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556910223155}
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.277033: I model/zero_model.cc:72] Read checkpoint state succ
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.297692: I model/zero_model.cc:80] Read meta graph succ
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.413360: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.413878: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7335
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: pciBusID: 0000:1c:00.0
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: totalMemory: 5.93GiB freeMemory: 3.67GiB
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.413893: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.515049: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.515081: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.515087: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.515286: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3035 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:1c:00.0, compute capability: 6.1)
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.516445: I model/zero_model.cc:100] Create session succ
May 03 21:03:43 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:43.575703: I model/zero_model.cc:107] Create graph succ
May 03 21:03:44 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:44.374243: I model/zero_model.cc:119] Load checkpoint succ
May 03 21:03:44 !!!!! bot.js:222                     [27738] stderr: 2019-05-03 21:03:44.677459: I external/org_tensorflow/tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
May 03 21:03:45   bot.js:214                     [27738] <<< =
May 03 21:03:45   bot.js:214                     [27738] <<< =
May 03 21:03:45   bot.js:214                     [27738] <<< =
May 03 21:03:45 !!!!! bot.js:222                     [27738] stderr: 1th move(b): pd, winrate=43.994141%, N=1, Q=-0.120117, p=0.077095, v=-0.120119, cost 145.624573ms, sims=32, height=2, avg_height=2.000000, global_step=639200
May 03 21:03:45   bot.js:214                     [27738] <<< =
May 03 21:03:46 !!!!! bot.js:222                     [27738] stderr: 2th move(w): dp, winrate=55.165863%, N=1, Q=0.103317, p=0.228309, v=0.103328, cost 152.800751ms, sims=29, height=2, avg_height=2.000000, global_step=639200
May 03 21:03:46   bot.js:214                     [27738] <<< =
May 03 21:04:26 !!!!! bot.js:222                     [27738] stderr: 3th move(b): cd, winrate=44.162975%, N=1939, Q=-0.116740, p=0.162921, v=-0.112885, cost 40846.269531ms, sims=10032, height=31, avg_height=5.446719, global_step=639200
May 03 21:04:26   bot.js:214                     [27738] <<< = C4
May 03 21:04:26   game.js:490                    [Game 17708647]  Move 2    Playing c4 { x: 2, y: 15, text: 'c4', resign: false, pass: false }
May 03 21:04:27   connection.js:1391             #  active_game: {"id":17708647,"phase":"play","name":"10000 vs 3200 test","player_to_move":484523,"width":19,"height":19,"move_number":3,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 21:04:27   game.js:490                    [Game 17708647]  Move 2    clock: {"game_id":17708647,"current_player":484523,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556910267062,"expiration":1556910567062,"black_time":{"thinking_time":119.703,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556910267063}
May 03 21:04:27   game.js:490                    [Game 17708647]  Move 2    game/17708647/move: { game_id: 17708647, move_number: 3, move: [ 2, 15, 240297 ] }
May 03 21:04:27   game.js:490                    [Game 17708647]  Move 3    Ignoring our own move 3
May 03 21:04:30   game.js:490                    [Game 17708647]  Move 3    clock: {"game_id":17708647,"current_player":592558,"black_player_id":592558,"white_player_id":484523,"title":"10000 vs 3200 test","last_move":1556910270097,"expiration":1556910389800,"black_time":{"thinking_time":119.703,"skip_bonus":false},"white_time":{"thinking_time":300,"skip_bonus":false},"now":1556910270097}
May 03 21:04:30   game.js:490                    [Game 17708647]  Move 3    game/17708647/move: { game_id: 17708647, move_number: 4, move: [ 15, 3, 3035 ] }
May 03 21:04:30   game.js:490                    [Game 17708647]  Move 4    Got     q16
May 03 21:04:30   bot.js:214                     [27738] Calling sendMove with q16
May 03 21:04:30   bot.js:214                     [27738] >>> play white q16
May 03 21:04:30   game.js:490                    [Game 17708647]  Move 4    makeMove 4 is 4 !== 4 ?
May 03 21:04:30   bot.js:214                     [27738] Generating move for game 17708647
May 03 21:04:30   game.js:490                    [Game 17708647]  Move 4    genmove black
May 03 21:04:30   bot.js:214                     [27738] >>> genmove black
May 03 21:04:30   connection.js:1391             #  active_game: {"id":17708647,"phase":"play","name":"10000 vs 3200 test","player_to_move":592558,"width":19,"height":19,"move_number":4,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":19,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"volatility":0.060090145513783985,"games_played":23}}},"white":{"username":"RoyalZero","id":484523,"rank":25,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":85.7766431653554,"rating":2818.0212415578553,"volatility":0.06260142409950183,"games_played":6201}}},"time_per_move":63}
May 03 21:04:30   connection.js:1391             #  Connected to game 17708647 already
May 03 21:04:30 !!!!! bot.js:222                     [27738] stderr: 4th move(w): pp, winrate=56.105717%, N=2127, Q=0.122114, p=0.679589, v=0.115394, cost 3286.168213ms, sims=816, height=31, avg_height=10.517672, global_step=639200
May 03 21:04:30   bot.js:214                     [27738] <<< =
roy7 commented 5 years ago

Those are PhoenixGo internal errors that I can see. Is there an issue with gtp2ogs you also saw?

wonderingabout commented 5 years ago

this is what i'm wondering as well

if its only a PhoenixGo specific issue, there's no reason to post it here, but since i dont know if its not gtp2ogs somehow triggering this, i'm posting it as a starter :+1:

wonderingabout commented 5 years ago

the issue does not happen anymore now though :thinking:

roy7 commented 5 years ago

The problem shown in this log https://github.com/Tencent/PhoenixGo/issues/93 is that the 'play' move was ignored and genmove happened first and tried to move where the other player already moved. I don't know if this is PhoenixGo internal issue (it did crash after moving) or if somehow gtp2ogs can be in a state where genmove is requested first? I'd need to double check code but I didn't think it'd be possible to issue these commands out of order.

roy7 commented 5 years ago

Ok so when we receive a move from the server this is the logic we process, when not in handicap placement time. This code is non-async. https://github.com/online-go/gtp2ogs/blob/2a567321a6f28f64f5f7cf296d9e628a0a8be35d/game.js#L207-L222

sendMove() sends the move to the engine. It's a very simple function, non-async. https://github.com/online-go/gtp2ogs/blob/2a567321a6f28f64f5f7cf296d9e628a0a8be35d/bot.js#L485-L488

They use the command() function which I also think is non-async unless I misunderstand. https://github.com/online-go/gtp2ogs/blob/2a567321a6f28f64f5f7cf296d9e628a0a8be35d/bot.js#L388-L426

Which means we can't be calling makeMove() before sendMove() and thus if sendMove() is in the log first, it had to have been sent to the engine first?

wonderingabout commented 5 years ago

just had the issue yesterday again as JBXLeelaBot made me notice :

https://online-go.com/game/17733237

too bad i didnt save command line log i'll be running with --v=1 from now on to debug this

since it is now the 3rd time, it will definitely happen again :+1:

erroragain$

wonderingabout commented 5 years ago

since i didnt notice this anymore, closing it for now

wonderingabout commented 5 years ago

got an error again (not sure if it's the same) :

game in : https://online-go.com/game/18783873

Jul 21 15:39:45   game.js:490                    [Game 18783873]            Connecting to game.
Jul 21 15:39:45   game.js:490                    [Game 18783873]            gamedata       B 00bSlowLeela  [19x19]    
Jul 21 15:39:45   game.js:490                    [Game 18783873]            makeMove 0 is 0 !== 0 ?
Jul 21 15:39:45   bot.js:214                     [4378] Starting  /home/amd2019/PhoenixGo/bazel-bin/mcts/mcts_main --gtp --config_path=/home/amd2019/PhoenixGo/etc/mcts_1gpu.conf
Jul 21 15:39:45   game.js:490                    [Game 18783873]            Starting new bot process [4378]
Jul 21 15:39:45   game.js:490                    [Game 18783873]            State loading for new bot
Jul 21 15:39:45   bot.js:214                     [4378] >>> boardsize 19
Jul 21 15:39:45   bot.js:214                     [4378] >>> clear_board
Jul 21 15:39:45   bot.js:214                     [4378] >>> komi 7.5
Jul 21 15:39:45   bot.js:214                     [4378] Generating move for game 18783873
Jul 21 15:39:45   game.js:490                    [Game 18783873]            genmove black
Jul 21 15:39:45   bot.js:214                     [4378] >>> genmove black
Jul 21 15:39:45   game.js:490                    [Game 18783873]            clock: {"game_id":18783873,"current_player":592558,"black_player_id":592558,"white_player_id":640435,"title":"lz elf-v1 -b 1 -t 2 -v 6400","last_move":1563716385067,"expiration":1563718185067,"black_time":{"thinking_time":1800,"skip_bonus":false},"white_time":{"thinking_time":1800,"skip_bonus":false},"start_mode":true,"now":1563716385177}
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.289472: I model/zero_model.cc:72] Read checkpoint state succ
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.308510: I model/zero_model.cc:80] Read meta graph succ
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.432449: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.433002: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7335
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: pciBusID: 0000:1c:00.0
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: totalMemory: 5.93GiB freeMemory: 3.09GiB
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.433015: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.529587: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.529619: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.529624: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.529809: I external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3035 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:1c:00.0, compute capability: 6.1)
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.531047: I model/zero_model.cc:100] Create session succ
Jul 21 15:39:45 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:45.582119: I model/zero_model.cc:107] Create graph succ
Jul 21 15:39:46   game.js:490                    [Game 18783873]            gamedata       B 00bSlowLeela  [19x19]    
Jul 21 15:39:46   game.js:490                    [Game 18783873]            makeMove 0 is 0 !== 0 ?
Jul 21 15:39:46   bot.js:214                     [4378] Generating move for game 18783873
Jul 21 15:39:46   game.js:490                    [Game 18783873]            genmove black
Jul 21 15:39:46   bot.js:214                     [4378] >>> genmove black
Jul 21 15:39:46   game.js:490                    [Game 18783873]            clock: {"game_id":18783873,"current_player":592558,"black_player_id":592558,"white_player_id":640435,"title":"lz elf-v1 -b 1 -t 2 -v 6400","last_move":1563716385067,"expiration":1563718185067,"black_time":{"thinking_time":1800,"skip_bonus":false},"white_time":{"thinking_time":1800,"skip_bonus":false},"start_mode":true,"now":1563716385994}
Jul 21 15:39:46 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:46.425765: I model/zero_model.cc:119] Load checkpoint succ
Jul 21 15:39:46 !!!!! bot.js:222                     [4378] stderr: 2019-07-21 15:39:46.756999: I external/org_tensorflow/tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
Jul 21 15:39:47   bot.js:214                     [4378] <<< =
Jul 21 15:39:47   bot.js:214                     [4378] <<< =
Jul 21 15:39:47   bot.js:214                     [4378] <<< =
Jul 21 15:40:41 !!!!! bot.js:222                     [4378] stderr: 1th move(b): pp, winrate=44.085880%, N=582, Q=-0.118282, p=0.080226, v=-0.116534, cost 53929.507812ms, sims=6408, height=10, avg_height=5.671315, global_step=639200
Jul 21 15:40:41   bot.js:214                     [4378] <<< = Q16
Jul 21 15:40:41   game.js:490                    [Game 18783873]            Playing q16 { x: 15, y: 3, text: 'q16', resign: false, pass: false }
Jul 21 15:40:41   connection.js:1391             #  active_game: {"id":18783873,"phase":"play","name":"lz elf-v1 -b 1 -t 2 -v 6400","player_to_move":640435,"width":19,"height":19,"move_number":1,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":36.80948311930709,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"games_played":23,"volatility":0.060090145513783985}}},"white":{"username":"00bSlowLeela","id":640435,"rank":34.950459271425665,"professional":false,"accepted":false,"ratings":{"overall":{"rating":2600.9994296562463,"deviation":145.1053194921326,"volatility":0.060348615827214624,"games_played":11}}},"time_per_move":80}
Jul 21 15:40:41   game.js:490                    [Game 18783873]            game/18783873/move: { game_id: 18783873, move_number: 1, move: [ 15, 3, 56804 ] }
Jul 21 15:40:41   game.js:490                    [Game 18783873]  Move 1    Ignoring our own move 1
Jul 21 15:40:41   game.js:490                    [Game 18783873]  Move 1    clock: {"game_id":18783873,"current_player":640435,"black_player_id":592558,"white_player_id":640435,"title":"lz elf-v1 -b 1 -t 2 -v 6400","last_move":1563716441804,"expiration":1563718241804,"black_time":{"thinking_time":1800,"skip_bonus":false},"white_time":{"thinking_time":1800,"skip_bonus":false},"now":1563716441805}
Jul 21 15:40:57   connection.js:1391             #  active_game: {"id":18783873,"phase":"play","name":"lz elf-v1 -b 1 -t 2 -v 6400","player_to_move":592558,"width":19,"height":19,"move_number":2,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":36.80948311930709,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"games_played":23,"volatility":0.060090145513783985}}},"white":{"username":"00bSlowLeela","id":640435,"rank":34.950459271425665,"professional":false,"accepted":false,"ratings":{"overall":{"rating":2600.9994296562463,"deviation":145.1053194921326,"volatility":0.060348615827214624,"games_played":11}}},"time_per_move":80}
Jul 21 15:40:57   connection.js:1391             #  Connected to game 18783873 already
Jul 21 15:40:57   game.js:490                    [Game 18783873]  Move 1    game/18783873/move: { game_id: 18783873, move_number: 2, move: [ 3, 15, 15253 ] }
Jul 21 15:40:57   game.js:490                    [Game 18783873]  Move 2    Got     d4
Jul 21 15:40:57   bot.js:214                     [4378] Calling sendMove with d4
Jul 21 15:40:57   bot.js:214                     [4378] >>> play white d4
Jul 21 15:40:57   game.js:490                    [Game 18783873]  Move 2    makeMove 2 is 2 !== 2 ?
Jul 21 15:40:57   bot.js:214                     [4378] Generating move for game 18783873
Jul 21 15:40:57   game.js:490                    [Game 18783873]  Move 2    genmove black
Jul 21 15:40:57   bot.js:214                     [4378] >>> genmove black
Jul 21 15:40:57   game.js:490                    [Game 18783873]  Move 2    clock: {"game_id":18783873,"current_player":592558,"black_player_id":592558,"white_player_id":640435,"title":"lz elf-v1 -b 1 -t 2 -v 6400","last_move":1563716457057,"expiration":1563718257057,"black_time":{"thinking_time":1800,"skip_bonus":false},"white_time":{"thinking_time":1800,"skip_bonus":false},"now":1563716457060}
Jul 21 15:41:33 !!!!! bot.js:222                     [4378] stderr: 3th move(b): dd, winrate=83.996918%, N=837, Q=0.679938, p=0.093162, v=0.627675, cost 52072.574219ms, sims=6408, height=9, avg_height=4.853092, global_step=639200
Jul 21 15:41:33   bot.js:214                     [4378] <<< = D4
Jul 21 15:41:33   game.js:490                    [Game 18783873]  Move 2    Playing d4 { x: 3, y: 15, text: 'd4', resign: false, pass: false }
Jul 21 15:41:33 !!!!! bot.js:222                     [4378] stderr: F0721 15:41:33.943889  4378 mcts_engine.cc:154] Check failed: ret == 0 (-1 vs. 0) Move: failed, dd, ret-1
Jul 21 15:41:33 !!!!! bot.js:222                     [4378] stderr: *** Check failure stack trace: ***
Jul 21 15:41:33 !!!!! bot.js:222                     [4378] stderr: *** Aborted at 1563716493 (unix time) try "date -d @1563716493" if you are using GNU date ***
Jul 21 15:41:33 !!!!! bot.js:222                     [4378] stderr: PC: @                0x0 (unknown)
Jul 21 15:41:34   bot.js:214                     [4378] Bot exited
Jul 21 15:47:15   connection.js:1391             #  Dumping status of all connected games
Jul 21 15:47:15   connection.js:1391             #  game_id=18783873: black=meta-金毛测试-20b white=00bSlowLeela bot_turn idle_time=378.871s bot.proc.pid=4378 bot.dead=true bot.failed=false
Jul 21 15:47:15   connection.js:1391             #  Dump complete
Jul 21 16:00:30   connection.js:1391             #  Not accepting new games (rejectnew).
Jul 21 16:00:30   connection.js:1391             #  Rejecting challenge from hhht (9d)  [19x19]  id = 18784119
Jul 21 16:00:30   connection.js:1401             POST online-go.com 443 /api/v1/me/challenges/5311122 { delete: true,
  message:
   'Currently, this bot is not accepting games, try again later ',
  apikey: ',
  bot_id: 592558,
  player_id: 592558,
  jwt:
   'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJwbGF5ZXJfaWQiOjU5MjU1OCwiYm90Ijp0cnVlLCJpYXQiOjE1NjM2OTA3MzZ9.YwQQOYSYG-e5roNvtWKmuPOzDby2jcJf1JbBFCYIwdM' }
Jul 21 16:02:15   connection.js:1391             #  Dumping status of all connected games
Jul 21 16:02:15   connection.js:1391             #  game_id=18783873: black=meta-金毛测试-20b white=00bSlowLeela bot_turn idle_time=1278.872s bot.proc.pid=4378 bot.dead=true bot.failed=false
Jul 21 16:02:15   connection.js:1391             #  Dump complete
Jul 21 16:10:57   connection.js:1391             #  active_game: {"id":18783873,"phase":"finished","name":"lz elf-v1 -b 1 -t 2 -v 6400","player_to_move":592558,"width":19,"height":19,"move_number":2,"paused":0,"private":false,"black":{"username":"meta-金毛测试-20b","id":592558,"rank":36.80948311930709,"professional":false,"accepted":false,"ratings":{"overall":{"deviation":95.45099520611241,"rating":2760.424660297423,"games_played":23,"volatility":0.060090145513783985}}},"white":{"username":"00bSlowLeela","id":640435,"rank":34.950459271425665,"professional":false,"accepted":false,"ratings":{"overall":{"rating":2600.9994296562463,"deviation":145.1053194921326,"volatility":0.060348615827214624,"games_played":11}}},"time_per_move":80}
Jul 21 16:10:57   connection.js:1391             #  18783873 active_game phase === finished
Jul 21 16:10:57   connection.js:174              Starting disconnect Timeout in Connection active_game for 18783873
Jul 21 16:10:57   game.js:490                    [Game 18783873]  Move 2    gamedata       B 00bSlowLeela  [19x19]    
Jul 21 16:10:57   game.js:490                    [Game 18783873]  Move 2    Game over.   Result: W+Time      L
Jul 21 16:10:57   bot.js:214                     [4378] Stopping bot
Jul 21 16:10:57   game.js:490                    [Game 18783873]  Move 2    clock: {"game_id":18783873,"current_player":592558,"black_player_id":592558,"white_player_id":640435,"title":"lz elf-v1 -b 1 -t 2 -v 6400","last_move":1563716457057,"expiration":1563718257057,"black_time":{"thinking_time":1800,"skip_bonus":false},"white_time":{"thinking_time":1800,"skip_bonus":false},"now":1563718257188}
Jul 21 16:10:58   connection.js:1391             #  disconnectFromGame 18783873
Jul 21 16:10:58   game.js:490                    [Game 18783873]  Move 2    Disconnecting from game.
Jul 21 16:11:02   bot.js:214                     [4378] Killing process directly with a signal
Jul 21 16:17:15   connection.js:1391             #  Dumping status of all connected games
Jul 21 16:17:15   connection.js:1391             #  Dump complete
roy7 commented 5 years ago

It looks like a the gamedata packet arrived a second time. And Phoenix took two moves in a row (genmove black was called twice). The 2nd genmove was before white had moved, white sent to D4, then Phoenix also wanted to go to D4. The crash maybe happened because 'play white d4' is an illegal move since it thinks black already put a second stone down on D4.

I guess the question is if getting two gamedata packets in a row would cause a second incorrect genmove to trigger? It might. We used to restart the bot any time we got an unexpected genmove which would have prevented this. But that led to restart loops because of other prior changes, and it was never fully debugged so instead we stopped restarting on every gamedata packet.

Perhaps we need to record if a bot is currently thinking about a more or not, and if so, be sure we don't ask for another makeMove().

wonderingabout commented 5 years ago

i understand the general idea :+1: so it would be a gtp2ogs configure issue then

i'll leave the rest to capable people lol thanks :1st_place_medal:

roy7 commented 5 years ago

Ok I found it.

https://github.com/online-go/gtp2ogs/blob/devel/game.js#L108-L114

Change the } else { to } else if (!this.processing) {.

That should fix it.