Open AntonioDiCecco opened 6 years ago
With a 20% chance of winning, that side will still win 1 game out of 5 (by definition). Being 5 pawns down corresponds to much less than 20%, and there are discussions about setting the limit at 5%. There is a worry, though, that the engine would then forget how to win won positions.
On Tue, Apr 24, 2018 at 10:35 AM Sabaain notifications@github.com wrote:
To train faster, I would propose a pseudo winning condition instead of checkmate.
An old idea of Lasker's was that a game is lost at a five-pawn deficit. Of course, a similar idea would need a new form in lc0 training.
My proposal is to "resign" a training game when one ID thinks it has less than a 20% chance of winning and the other more than 80%.
My full idea is to use 80% vs. 20% (or 95% vs. 5%) as a starting point and gradually tighten it toward 100% vs. 0%, as in simulated annealing.
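A minimal sketch of the annealed resign rule, in Python for brevity. Everything here is an assumption made for illustration: the function names, the linear schedule, and the idea that each side exposes a win-probability estimate in [0, 1]. It is not taken from the lc0 code.

```python
def resign_threshold(step, total_steps, start=0.20, end=0.0):
    """Linearly anneal the resign threshold from `start` down to `end`.

    `step` is a hypothetical training-progress counter (e.g. network ID).
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return start + (end - start) * frac


def should_resign(win_prob_a, win_prob_b, threshold):
    """Return 'a' or 'b' for the side that resigns, or None to play on.

    A side resigns only when both estimates agree: its own win probability
    is below `threshold` and the opponent's is above `1 - threshold`.
    """
    if win_prob_a < threshold and win_prob_b > 1.0 - threshold:
        return "a"
    if win_prob_b < threshold and win_prob_a > 1.0 - threshold:
        return "b"
    return None
```

At a threshold of 0.20 this is the 80% vs. 20% rule. Once the threshold reaches 0.0, no estimate can fall below it, so every game is played out to its natural end.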
Quite an interesting idea. It would be worth testing the proxy (i.e. the pseudo winning condition, whatever that may be) against normal play to completion and measuring the differences. That way you can put a cost on the computation, and if you can price it fairly accurately, you can vary the probabilities and compare the results. Perhaps you can improve speed by an order of magnitude.
See also #418.
No need for a human-handcrafted (boo...) "pseudo winning condition". LCZ already has the capability to resign self-play games, via the resign threshold feature inherited from the LZ code (and from AlphaGo Zero): a side resigns when its win rate drops below some threshold, and the threshold is dynamically adjusted so that only 5% of games are unduly resigned. What is not clear to me is whether that feature is currently enabled in LCZ.
@Ishinoshita It is not. There has been some discussion of implementing the resign feature in training games in the next version.
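For reference, a rough sketch of how a resign threshold could be calibrated so that at most 5% of games are unduly resigned. This is an illustration under assumptions, not the actual LZ or LCZ implementation: it assumes a sample of games played to the end with resignation disabled, and that for each game we recorded the lowest win-rate estimate the eventual winner ever had. All names are hypothetical.

```python
def false_positive_rate(winner_min_winrates, threshold):
    """Fraction of games whose eventual winner would have resigned.

    A winner would have resigned if its lowest win-rate estimate fell
    strictly below `threshold` at some point in the game.
    """
    wrong = sum(1 for w in winner_min_winrates if w < threshold)
    return wrong / len(winner_min_winrates)


def calibrate_resign_threshold(winner_min_winrates, max_false_positive=0.05):
    """Pick the largest sample value usable as a threshold while keeping
    the false-positive rate at or below `max_false_positive`."""
    if not winner_min_winrates:
        raise ValueError("need at least one game played without resignation")
    values = sorted(winner_min_winrates)
    # Number of games we are allowed to resign wrongly.
    allowed = int(max_false_positive * len(values))
    allowed = min(allowed, len(values) - 1)  # keep the index in range
    # Only games sorted before index `allowed` lie strictly below this value.
    return values[allowed]
```

Re-running the calibration on fresh no-resign games as the network improves would give the dynamic adjustment described above.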