Problem with evaluation after a long search?

LeelaChessZero / lc0

The rewritten engine, originally for tensorflow. Now all other backends have been ported here.

GNU General Public License v3.0


Problem with evaluation after a long search? #557

Closed pdsmike closed 4 years ago

pdsmike commented 5 years ago

I tested the "blunder" position (move 35) from the game Arasan - Lc0 ( http://legacy-tcec.chessdom.com/archive.php?se=14&di=3&ga=12 ). I ran the test with v0.18.1 and nets id11248 and id31777.

[Screenshot: arasan-lc0_id11248_id31777_v0181_default]

You can see that the losing line is found by id11248 (upper window) after 3.2 million nodes, but its evaluation is -0.12 (evaluations are from the point of view of the side to move). Id31777 doesn't see any threats. After I make the move, the evaluation changes instantly to +1.72, and the node count for that is 2,825k (taken from the cache, I imagine). Now the position is losing.

[Screenshot: arasan-lc0_id11248_id31777_v0181_default_after_move]

I tried another position that was possible in this game and leads to mate, and it took far too many nodes to find it: 223k nodes for id11248 (upper window) and over 1.9 million nodes for id31777!

[Screenshot: lc0_cant_find_winning_quick]

But the most disturbing thing is the incorrect evaluation, which leads to picking a bad move.

Videodr0me commented 5 years ago

This is a typical case of a wrong eval preventing search from exploring the critical line deeply enough. The NN evaluates the positions after Rc8d8 Rf7+ as good for Black, so search stops exploring Rf7+ for a long time (it is stuck at about 12000 visits). There are 4 moves which Leela (all with NN 11248) thinks are better for White (while the eval still thinks White is worse). To illustrate how bad this effect is: at 215k nodes Rf7+ has 10880 visits, at 775k nodes it has 11194 visits, and at 1000k nodes it is still at 11194 visits. Later, once it realizes that Rf7+ is not as bad, the move picks up visits again. Unfortunately lc0 only sees that it is just below a draw (-0.12) and spends visits on other moves deep in many of the c5 lines.

If you play Rf7+ and let lc0 think, there are no other moves to absorb visits, and it finds out pretty quickly that it's winning. On the bright side, the new c scaling taken from DeepMind's latest AlphaZero paper might improve this.
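To make the visit-starvation effect concrete, here is a rough toy sketch of PUCT move selection at a single node. It is not lc0's actual search code. The Q values, the uniform priors, and the move names `alt1`..`alt4` are made up for illustration, and Q is held fixed instead of being updated by backups. The scaled exploration constant uses the form from the published AlphaZero pseudocode, `log((1 + N + c_base) / c_base) + c_init`, with that pseudocode's values for `c_init` and `c_base`; lc0's defaults may differ.

```python
import math

# Constants as in the published AlphaZero pseudocode (lc0's values may differ).
C_INIT = 1.25
C_BASE = 19652


def exploration_constant(parent_visits, scaled):
    """Return the PUCT exploration constant, fixed or growing with visits."""
    if not scaled:
        return C_INIT
    return math.log((1 + parent_visits + C_BASE) / C_BASE) + C_INIT


def simulate(q_values, priors, total_visits, scaled):
    """Distribute visits among moves using PUCT with fixed (toy) Q values."""
    visits = {move: 0 for move in q_values}
    for done in range(total_visits):
        c = exploration_constant(done, scaled)
        root = math.sqrt(done + 1)
        # PUCT score: value estimate plus prior-weighted exploration bonus.
        best = max(
            q_values,
            key=lambda m: q_values[m] + c * priors[m] * root / (1 + visits[m]),
        )
        visits[best] += 1
    return visits


# Hypothetical numbers: Rf7+ is misjudged as clearly worse than four alternatives.
Q = {"Rf7+": -0.30, "alt1": -0.10, "alt2": -0.10, "alt3": -0.10, "alt4": -0.10}
P = {move: 0.2 for move in Q}

if __name__ == "__main__":
    for scaled in (False, True):
        result = simulate(Q, P, 50_000, scaled)
        print("scaled c" if scaled else "fixed c ", result["Rf7+"], "visits to Rf7+")
```

In this toy setup the misjudged move receives only a small fraction of the visits, because the exploration bonus has to overcome the gap in Q. A constant that grows with the parent's visit count makes the bonus larger late in the search, so the move is revisited more often, which is the direction of improvement hoped for above.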

pdsmike commented 5 years ago

With v0.19.1rc2 (default settings) and --fpu-strategy=absolute, Rc8d8 is abandoned after only 340k nodes in favor of Kg5. The blunder is not played, but the evaluation for this move is still not correct (-0.1). FEN: 2r1r3/2pR3R/1p3kpp/p3p3/1nP1PbP1/BP3B2/5P2/5K2 b - - 3 35

Naphthalin commented 4 years ago

This is known (and expected) behavior: different nets with different parameters have different blind spots. However, the blunder rate has gone down significantly, so this issue can be closed.

mooskagh commented 4 years ago

It's an issue worth investigating, but we need something fresh. :-P