LC0 selfplay sometimes ends prematurely declaring a draw in selfplay when resign is off and there is no valid reason to claim a draw.

LeelaChessZero / lc0

The rewritten engine, originally for tensorflow. Now all other backends have been ported here.

GNU General Public License v3.0


LC0 selfplay sometimes ends prematurely declaring a draw in selfplay when resign is off and there is no valid reason to claim a draw. #798

Open trophymursky531 opened 5 years ago

trophymursky531 commented 5 years ago

I ran a selfplay game with Lc0 v0.21.0-rc2 using the following command: ./lc0 selfplay --policy-softmax-temp=1.0 --visits=10 --player1.weights=weights_run1_40467.pb --player2.weights=weights_run2_50467.pb --no-share-trees --games=1 --verbose-move-stats > test40_vs_50_10nodes.txt

Because I accidentally kept temperature and noise on, the result is not easily reproducible (I'm trying to see how often it recurs), but the game ended as a draw even though one side was clearly winning and no draw rule applied (no threefold repetition or 50-move rule).

output file attached.

test40_vs_50_10nodes.txt

trophymursky531 commented 5 years ago

Reproduced 3 times in 100 games with the command above. 26 of the games ended in draws, which means around 12% of the draws in that tournament (3 of 26) were wrongly scored as draws.

After looking, I think there is a bug in /chess/position.cc: line 86, "if (Last().GetGamePly() >= 450) return GameResult::DRAW;", seems wrong to me, as there is no 225-move rule in chess.
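To make the ply/move arithmetic concrete, here is a minimal sketch of what a hard ply cap like the quoted line does. The enum and function names below are illustrative stand-ins, not the actual lc0 types; only the value 450 comes from the quoted source line.

```cpp
#include <cassert>

// Illustrative stand-in; the real lc0 GameResult has more values.
enum class GameResult { UNDECIDED, DRAW };

// A ply is one half-move, so two plies make one full move.
constexpr int FullMovesFromPly(int ply) { return ply / 2; }

// Cap value taken from the line quoted above.
constexpr int kPlyCap = 450;

// Hypothetical helper: once the game reaches the cap, it is scored as a
// draw regardless of what is on the board.
GameResult AdjudicateByPlyCap(int game_ply) {
  return game_ply >= kPlyCap ? GameResult::DRAW : GameResult::UNDECIDED;
}
```

Under these assumptions, FullMovesFromPly(450) is 225, which is where the "225 move rule" comes from, and a winning position at ply 450 is still reported as a draw.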

Tilps commented 5 years ago

This limit is hardcoded into selfplay because it's part of the training scheme, but it really should be an option, since people use selfplay for non-training purposes.
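One way such an option could look is sketched below. This is only an assumption about a possible design: the struct and the name max_game_ply are hypothetical and are not existing lc0 options. The default keeps the training behaviour, and 0 turns the cap off.

```cpp
#include <cassert>

// Illustrative stand-in; the real lc0 GameResult has more values.
enum class GameResult { UNDECIDED, DRAW };

// Hypothetical settings struct. `max_game_ply` is an invented name.
struct SelfplaySettings {
  int max_game_ply = 450;  // Default matches the hardcoded training cap; 0 disables it.
};

// Returns DRAW only when a cap is configured and has been reached.
GameResult CheckPlyLimit(const SelfplaySettings& settings, int game_ply) {
  if (settings.max_game_ply > 0 && game_ply >= settings.max_game_ply) {
    return GameResult::DRAW;
  }
  return GameResult::UNDECIDED;
}
```

With a design like this, training runs would keep the default, while someone running selfplay matches for other purposes could set the value to 0 and let games run until a real chess rule ends them.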

trophymursky531 commented 5 years ago

What's the justification for it being in the training scheme? After a quick run, it looks like I can reproduce it more often with 800-visit searches, so resign-playthrough games will see some results that should not be draws. Training on Q and Z probably helps, but a lot of the Z values on resign playthroughs that are draws should probably be wins or losses.

killerducky commented 5 years ago

For the training case, it's a practical trade-off of time vs. accuracy. We want to just end games that go on for a long time, and we assume most games that reach ply 450 are draws. Sometimes the assumption is wrong, but we accept this for the sake of speed.

Also remember that training games use TB (tablebase) rescoring, so games that reach TB positions will have their result corrected. So the game in the logfile you attached would be rescored as a win for White.
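As a rough illustration of the rescoring idea only: the sketch below assumes a tablebase probe has already been done and is passed in as an optional verdict. The function name and signature are hypothetical, and the real rescorer is more involved than this.

```cpp
#include <cassert>
#include <optional>

// Illustrative stand-in for the possible outcomes of a finished game.
enum class GameResult { WHITE_WON, DRAW, BLACK_WON };

// Hypothetical helper: `tablebase_verdict` holds the tablebase result for a
// position the game reached, or is empty if no such position was covered.
// When a verdict exists it overrides the recorded result.
GameResult Rescore(GameResult recorded,
                   std::optional<GameResult> tablebase_verdict) {
  return tablebase_verdict.value_or(recorded);
}
```

In this simplified picture, a game cut off as a draw at the ply cap but ending in a tablebase win for White would be stored as WHITE_WON, while a capped game that never reached a tablebase position would stay a draw.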

About why 3/100 of your games reached the limit, my guess is it's related to Lc0's tendency to shuffle when the game is already won. It's a known issue but not top priority to fix for now.

Naphthalin commented 4 years ago

As Nibbler now supports selfplay games (@fohristiwhirl), this still might be relevant. However, is there any use case where a selfplay game should continue beyond 225 moves, especially now that we will train MLH nets (and there really is no point in watching the endless shuffling of the nets playing themselves)?

mooskagh commented 4 years ago

So do I understand it right, that it's not possible to disable draw adjudication in selfplay? Then it's a bug worth fixing.

mooskagh commented 4 years ago

Ah, it's not about adjudication but rather the 450-ply limit. It may still be good to fix, although it's not very high priority.

Naphthalin commented 1 year ago

We deactivated the 450 ply limit in self-play, didn't we?

KarlKfoury commented 4 weeks ago

It looks like the 450-ply limit was removed. Are selfplay games still ending prematurely?

mooskagh commented 4 weeks ago

It's only for the selfplay, and it's still there: https://github.com/LeelaChessZero/lc0/blob/08b41c4e15c49838c41832d77ea1228fab82a6f0/src/selfplay/game.cc#L156