Closed Mardak closed 5 years ago
Since TCEC adjudicates games with tablebases, the game would be stopped as soon as Lc0 picks a tablebase move, so at TCEC that would surely help. Whether Lc0 would actually be able to convert a won position (e.g. in the upcoming CCCC), I'm not sure; tests are not very optimistic there.
I updated the original post with 10,000 visits and a PGN to check against the lichess TB. Looking at the positions where 10520 + TB would not have played the game-ending move:
4-26.4 Tucano 7.05 vs Senpai 2.0
noTB a7a3 (1384) N: 2332 (+ 0) (P: 17.38%) (Q: 0.04444) (U: 0.02435) (Q+U: 0.06879) (V: 0.0602) (played)
with g6g7 (1349) N: 1104 (+ 0) (P: 5.98%) (Q: 0.03073) (U: 0.01742) (Q+U: 0.04815) (V: 0.0322)
with a7a3 (1384) N: 1179 (+ 0) (P: 17.38%) (Q: 0.00000) (U: 0.04742) (Q+U: 0.04742) (V: 0.0000) (T) (played)
with e4f4 (791 ) N: 2218 (+ 1) (P: 18.27%) (Q: 0.02090) (U: 0.02650) (Q+U: 0.04740) (V: 0.0533)
Q for e4f4 is close to draw, and TB says it’s a draw.
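For anyone who wants to redo these comparisons from the raw dumps, here is a minimal parsing sketch. It assumes the line layout shown in this thread; the field names (`index`, `in_flight`, and so on) are my own labels rather than anything taken from lc0, and `(T)` is simply treated as the flag that appears on the tablebase-hit moves above.

```python
import re

# Pattern for one move-stats line as pasted in this thread.
# This is inferred from the quoted lines, not from lc0's source.
LINE_RE = re.compile(
    r"^(?P<mode>noTB|with)\s+"
    r"(?P<move>[a-h][1-8][a-h][1-8][qrbn]?)\s+"
    r"\(\s*(?P<index>\d+)\s*\)\s+"
    r"N:\s*(?P<visits>\d+)\s+"
    r"\(\+\s*(?P<in_flight>\d+)\)\s+"
    r"\(P:\s*(?P<prior>[\d.]+)%\)\s+"
    r"\(Q:\s*(?P<q>-?[\d.]+)\)\s+"
    r"\(U:\s*(?P<u>-?[\d.]+)\)\s+"
    r"\(Q\+U:\s*(?P<q_plus_u>-?[\d.]+)\)\s+"
    r"\(V:\s*(?P<v>-?[\d.]+)\)"
    r"(?P<flags>.*)$"
)


def parse_stats_line(line):
    """Return a dict of fields for one stats line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    flags = m.group("flags")
    return {
        "mode": m.group("mode"),                  # "noTB" or "with" (TB enabled)
        "move": m.group("move"),
        "visits": int(m.group("visits")),
        "prior": float(m.group("prior")) / 100.0,  # percent -> fraction
        "q": float(m.group("q")),
        "u": float(m.group("u")),
        "q_plus_u": float(m.group("q_plus_u")),
        "v": float(m.group("v")),
        "terminal": "(T)" in flags,               # flag seen on TB-hit moves here
        "played": "(played)" in flags,            # move actually played in the game
    }
```

Calling `parse_stats_line` on each `noTB`/`with` line of a position gives a list of dicts that can then be sorted by `visits` to see which move search preferred.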
3-14.1 lc0 16.10520 vs Pedone 1.8
noTB c4a4 (718 ) N: 4519 (+ 1) (P: 24.90%) (Q: 0.72924) (U: 0.01428) (Q+U: 0.74352) (V: 0.6509) (played)
with c4a4 (718 ) N: 130 (+ 0) (P: 24.90%) (Q: 0.00000) (U: 0.58093) (Q+U: 0.58093) (V: 0.0000) (T) (played)
with c4c6 (732 ) N: 523 (+ 0) (P: 4.05%) (Q: 0.50413) (U: 0.02361) (Q+U: 0.52774) (V: 0.4846)
with c4c5 (727 ) N: 2138 (+ 0) (P: 7.30%) (Q: 0.51313) (U: 0.01043) (Q+U: 0.52356) (V: 0.6416)
with c4b4 (719 ) N: 4061 (+ 1) (P: 9.96%) (Q: 0.51709) (U: 0.00749) (Q+U: 0.52458) (V: 0.5727)
All of these top 3 rook moves seem to have pretty good win rates, but the 7-man TB says they’re all draws, so avoiding the capture and extending the game might lead to odd behavior given that the eval is so far off.
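The visit counts in this position also show why the TB move is starved once its Q is pinned at 0. As a rough check, and assuming the exploration term has the usual AlphaZero-style shape `U = k * P / (1 + N)` (my assumption; `k` bundles the exploration constant and the parent visit count and is not a value taken from lc0), the four "with" lines above should all imply about the same `k`:

```python
# (move, prior P, visits N, exploration term U) from the 3-14.1 "with" lines above
MOVES = [
    ("c4a4", 0.2490, 130, 0.58093),   # TB draw move, Q fixed at 0
    ("c4c6", 0.0405, 523, 0.02361),
    ("c4c5", 0.0730, 2138, 0.01043),
    ("c4b4", 0.0996, 4061, 0.00749),
]


def implied_constant(prior, visits, u):
    # Assumed form: U = k * P / (1 + N), so k = U * (1 + N) / P.
    return u * (1 + visits) / prior


def visits_to_match(k, prior, target_q_plus_u):
    # For a move whose Q is 0, Q+U equals U alone. This returns the visit
    # count at which its U has decayed down to target_q_plus_u.
    return k * prior / target_q_plus_u - 1


ks = [implied_constant(p, n, u) for _, p, n, u in MOVES]
k_mean = sum(ks) / len(ks)

for (move, _, _, _), k in zip(MOVES, ks):
    print(f"{move}: implied k = {k:.1f}")

# Visits at which the TB draw move stops competing with c4b4 (Q+U 0.52458).
print(round(visits_to_match(k_mean, 0.2490, 0.52458)))
```

If the assumed form is right, a move with Q = 0 and a 24.9% prior only gets visited while its U alone beats the other moves' Q+U of roughly 0.52, which is in line with it ending on 130 visits while c4b4 collects over 4,000.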
3-15.2 Ethereal 10.81 vs DeusX 1.0
noTB c5a6 (713 ) N: 4734 (+ 1) (P: 37.44%) (Q: 0.26382) (U: 0.02634) (Q+U: 0.29016) (V: 0.4083) (played)
with c5a6 (713 ) N: 838 (+ 0) (P: 37.44%) (Q: 0.00000) (U: 0.12890) (Q+U: 0.12890) (V: 0.0000) (T) (played)
with c4b4 (965 ) N: 1113 (+ 0) (P: 14.12%) (Q: 0.09010) (U: 0.03661) (Q+U: 0.12670) (V: 0.2476)
with c4d4 (966 ) N: 3898 (+ 1) (P: 16.96%) (Q: 0.11659) (U: 0.01256) (Q+U: 0.12915) (V: 0.3195)
Moving the king instead of capturing with the knight is still a draw, and the Q is somewhat close to draw.
2-13.3 ChessBrainVB 3.70 vs Ethereal 10.85
noTB b3b4 (453 ) N: 4932 (+ 1) (P: 59.89%) (Q: 0.61589) (U: 0.02958) (Q+U: 0.64548) (V: 0.5077) (played)
with b3b4 (453 ) N: 1617 (+ 0) (P: 59.89%) (Q: 0.00000) (U: 0.11959) (Q+U: 0.11959) (V: 0.0000) (T) (played)
with c2b1 (246 ) N: 1658 (+ 0) (P: 8.51%) (Q: 0.10241) (U: 0.01658) (Q+U: 0.11899) (V: 0.1224)
with b3b2 (442 ) N: 1744 (+ 0) (P: 5.83%) (Q: 0.10757) (U: 0.01080) (Q+U: 0.11837) (V: 0.1233)
with d5d6 (1007) N: 2717 (+ 1) (P: 9.26%) (Q: 0.10803) (U: 0.01100) (Q+U: 0.11903) (V: 0.1230)
All moves here are draws.
2-14.1 Nirvana 2.4 vs Xiphos 0.3.14
noTB g3h3 (1346) N: 4410 (+ 1) (P: 39.75%) (Q: -0.80660) (U: 0.02468) (Q+U: -0.78192) (V: -0.8495) (played)
with g3h3 (1346) N: 796 (+ 0) (P: 39.75%) (Q: -1.00000) (U: 0.15669) (Q+U: -0.84331) (V: -1.0000) (T) (played)
with g3f2 (1348) N: 2772 (+ 0) (P: 19.13%) (Q: -0.87873) (U: 0.02167) (Q+U: -0.85706) (V: -0.8455)
with g3f3 (1345) N: 4233 (+ 1) (P: 23.11%) (Q: -0.87422) (U: 0.01715) (Q+U: -0.85707) (V: -0.8279)
Instead of capturing into a TB loss, it tries a different move, but all moves lose anyway, so the outcome doesn't change here.
2-18.2 Texel 1.08a11 vs ChessBrainVB 3.70
noTB d5f3 (992 ) N: 4802 (+ 1) (P: 28.05%) (Q: 0.41888) (U: 0.01439) (Q+U: 0.43326) (V: 0.3196) (played)
with d5f3 (992 ) N: 988 (+ 0) (P: 28.05%) (Q: 0.00000) (U: 0.09577) (Q+U: 0.09577) (V: 0.0000) (T) (played)
with a5a4 (903 ) N: 994 (+ 0) (P: 5.15%) (Q: 0.07763) (U: 0.01748) (Q+U: 0.09511) (V: 0.0366)
with g4f5 (860 ) N: 1057 (+ 0) (P: 4.32%) (Q: 0.08078) (U: 0.01379) (Q+U: 0.09456) (V: 0.1099)
with a5c5 (907 ) N: 1198 (+ 1) (P: 7.57%) (Q: 0.07368) (U: 0.02130) (Q+U: 0.09498) (V: 0.0499)
There's no escaping the draw, but the other moves' Q values are pretty close to a draw anyway.
2-21.1 Xiphos 0.3.14 vs Nirvana 2.4
noTB h3g4 (641 ) N: 4905 (+ 1) (P: 41.39%) (Q: 0.74764) (U: 0.02085) (Q+U: 0.76850) (V: 0.7442) (played)
with h3g4 (641 ) N: 287 (+ 0) (P: 41.39%) (Q: 0.00000) (U: 0.36710) (Q+U: 0.36710) (V: 0.0000) (T) (played)
with f2e2 (341 ) N: 374 (+ 0) (P: 7.04%) (Q: 0.31790) (U: 0.04793) (Q+U: 0.36583) (V: 0.5542)
with h3h4 (642 ) N: 4734 (+ 1) (P: 5.67%) (Q: 0.36445) (U: 0.00306) (Q+U: 0.36751) (V: 0.4725)
It thinks it has a decent chance to win by avoiding the TB draw, but those moves turn out to be draws anyway.
2-22.3 Gull 180521 vs Arasan TCEC13
noTB d7c6 (1473) N: 1026 (+ 0) (P: 23.30%) (Q: 0.27117) (U: 0.06501) (Q+U: 0.33617) (V: 0.2438) (played)
with d7c6 (1473) N: 172 (+ 0) (P: 23.30%) (Q: 0.00000) (U: 0.35011) (Q+U: 0.35011) (V: 0.0000) (T) (played)
with d7e6 (1475) N: 261 (+ 0) (P: 14.73%) (Q: 0.20114) (U: 0.14611) (Q+U: 0.34725) (V: 0.1982)
with d7c8 (1485) N: 503 (+ 0) (P: 15.34%) (Q: 0.26470) (U: 0.07910) (Q+U: 0.34380) (V: 0.2697)
with d7e8 (1487) N: 4662 (+ 1) (P: 30.04%) (Q: 0.32807) (U: 0.01674) (Q+U: 0.34481) (V: 0.1200)
It avoids the capture into a TB draw in an attempt to win, but these moves are draws too.
Overall, it looks like when lc0 + TB plays away from the TB move in these cases, the game would have been drawn or lost anyway. However, the win rate it assigns to the move that avoids the TB draw can be quite far from a draw, so it gives up a capture to extend the game in an attempt to win. These positions aren’t winnable, so this potentially opens up the opportunity to blunder, especially as avoiding the move leaves the opponent with an extra piece.
So guessing at the behavior several moves ahead, these "avoid TB" leaf moves could mean that search gets directed towards positions that are drawn or lost instead of finding a better winning move. I guess we'll see in a bit if the 50-move training data fix will clean up these wrong evals or if something like #237 with a lower temperature will be needed.
Edit: To be clear, these are positions that TCEC adjudicated with Syzygy TB, so most likely the two engines disagreed about whether it was a draw or a win/loss. There are plenty of positions where lc0 + TB would play towards the winning move, but perhaps those are positions where both engines would agree that one side is winning.
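To put a number on how far off the eval is in each case, here is a small sketch that compares the Q of the most-visited "with TB" move against the tablebase result reported above. It assumes Q and the tablebase result share one scale (+1 win, 0 draw, -1 loss), which matches the Q of 0 and -1 shown on the `(T)` lines; the rows are copied by hand from the dumps in this comment.

```python
# (game, most-visited move with TB, its Q, tablebase value of that move)
# Scale assumed for both Q and the TB value: +1 win, 0 draw, -1 loss.
ROWS = [
    ("4-26.4 Tucano 7.05 vs Senpai 2.0",         "e4f4",  0.02090,  0.0),
    ("3-14.1 lc0 16.10520 vs Pedone 1.8",        "c4b4",  0.51709,  0.0),
    ("3-15.2 Ethereal 10.81 vs DeusX 1.0",       "c4d4",  0.11659,  0.0),
    ("2-13.3 ChessBrainVB 3.70 vs Ethereal 10.85", "d5d6", 0.10803,  0.0),
    ("2-14.1 Nirvana 2.4 vs Xiphos 0.3.14",      "g3f3", -0.87422, -1.0),
    ("2-18.2 Texel 1.08a11 vs ChessBrainVB 3.70", "a5c5",  0.07368,  0.0),
    ("2-21.1 Xiphos 0.3.14 vs Nirvana 2.4",      "h3h4",  0.36445,  0.0),
    ("2-22.3 Gull 180521 vs Arasan TCEC13",      "d7e8",  0.32807,  0.0),
]


def eval_errors(rows):
    """Return (game, move, Q - TB value), largest absolute error first."""
    errors = [(game, move, q - tb) for game, move, q, tb in rows]
    return sorted(errors, key=lambda r: abs(r[2]), reverse=True)


for game, move, err in eval_errors(ROWS):
    print(f"{err:+.3f}  {move}  {game}")
```

On these eight rows every error is positive, i.e. search is more optimistic than the tablebase each time, with the lc0 vs Pedone position the furthest off.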
@jjoshua2 Is this the expected behavior with 6-man TB? Here are the games with at most 100 moves that reached 6 pieces. I ran the position just before the 26th capture with 10,000 visits using 10520 to see what search thought of the played move with and without TB, including the top 3 most visited moves.

In 16.3 Tucano 7.05 vs Ivanhoe 999946h, 10520 thought the move heavily favored the opponent and avoids the move, but with TB, it would have played it to draw. Amusingly, in 23.2 LCZero 16.10161 vs Senpai 2.0, lc0 thinks its played move is the least losing, but with TB, it would have felt better knowing it was a draw. And in 26.4 Tucano 7.05 vs Senpai 2.0, for both with and without TB, the most visited move tries to win instead of drawing.

In 13.4 Pedone 1.8 vs Arasan TCEC13, lc0 thought it found an amazing winning move out of other losing moves, but TB would have clarified it was just a draw. In 14.1 lc0 16.10520 vs Pedone 1.8, lc0 itself normally would have played the draw TB move, but with TB, it would play a different move as it believes it's winning. Similarly, in 15.2 Ethereal 10.81 vs DeusX 1.0, the highest prior move is the TB draw move, but with TB, it finds a different move to continue the game.

Generally, in most positions 10520 would play the same move that ended the game, but even then the Q could be quite different, and in some positions lc0 + TB would play a completely different move, although it's unclear whether that would have changed the final outcome.
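For reference, the "26th capture" cutoff is just piece counting: 32 pieces minus 26 captures leaves 6, which is where 6-man TB starts to apply. A minimal sketch of that check on a FEN string (the helper names are mine, and this only counts pieces; it does not probe any tablebase):

```python
def piece_count(fen):
    """Count the pieces on the board from the placement field of a FEN string."""
    placement = fen.split()[0]
    # Letters are pieces; digits are runs of empty squares; '/' separates ranks.
    return sum(ch.isalpha() for ch in placement)


def in_tablebase_range(fen, max_pieces=6):
    """True if the position has few enough pieces for a max_pieces-man TB."""
    return piece_count(fen) <= max_pieces


def pieces_after_captures(captures, start=32):
    """Pieces left after a number of captures from the standard start position."""
    return start - captures


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(piece_count(START_FEN), pieces_after_captures(26))
```

Running each game's positions through `in_tablebase_range` and taking the last one that returns `False` gives the position just before the capture into TB range.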