LeelaChessZero / lc0

The rewritten engine, originally for tensorflow. Now all other backends have been ported here.
GNU General Public License v3.0

Analyze blunders, part 2 #164

Closed mooskagh closed 4 years ago

mooskagh commented 6 years ago

(part 1 was here)

This is to gather fresh examples of blunders (./lc0, on nets trained in July 2018 or later)

Important!

When reporting positions to analyze, please use the following form. It makes it easier to see what's problematic with the position:

(old text below)

There are many reports on forums asking about blunders, and the answers so far have been something along the lines of "it's fine, it will learn eventually, we don't know exactly why it happens".

I think at this point it makes sense to actually look into them to confirm that there are no blind spots in training. For that we need to:

Eventually all of this would be nice to have as a single command, but we can start manually.

For lc0, that can be done this way: --verbose-move-stats -t 1 --minibatch-size=1 --no-smart-pruning (unless you want to debug specifically with other settings).

Then run UCI interface, do command:

position startpos moves e2e4 ....

(PGN move to UCI notation can be converted using pgn-extract -Wuci)

Then do:

go nodes 10

see results, add some more nodes by running:

go nodes 20
go nodes 100
go nodes 800
go nodes 5000
go nodes 10000
and so on

And look how counters change.
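The manual steps above can be sketched as a small helper that builds the UCI command sequence for one position. This is only an illustration: the function name `uci_probe_commands` and the default node budgets are assumptions taken from the example list above, and the helper only produces the text to type or pipe into the engine; it does not start lc0.

```python
def uci_probe_commands(moves, node_counts=(10, 20, 100, 800, 5000, 10000)):
    """Build UCI commands: set the position once, then search with growing node budgets.

    moves: list of moves in UCI notation (e.g. ["e2e4", "e7e5"]), played from startpos.
    node_counts: node budgets for the successive "go nodes" commands (hypothetical defaults).
    """
    position = "position startpos"
    if moves:
        position += " moves " + " ".join(moves)
    commands = [position]
    for nodes in node_counts:
        commands.append(f"go nodes {nodes}")
    return commands


# Example: print the commands for a two-move game prefix.
for command in uci_probe_commands(["e2e4", "e7e5"], node_counts=(10, 20)):
    print(command)
```

Each `go nodes` command should be sent only after the previous search has printed its `bestmove` line, so that the counters of each run can be compared.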

Counters:

e2e4 N: 329 (+ 4) (P:38.12%) (Q: -0.2325) (U: 0.2394) (Q+U: 0.0069) (V: 0.0160) 
 ^      ^    ^      ^           ^          ^            ^            ^
 |      |    |      |           |          |            |            Expected outcome for this position, 
 |      |    |      |           |          |            |            directly from NN, -1..1
 |      |    |      |           |          |          Q+U, see below   
 |      |    |      |           |          U from PUCT formula,
 |      |    |      |           |          see below.
 |      |    |      |           Average value of V in a subtree
 |      |    |      Policy prior for this move, from NN, but if Dirichlet
 |      |    |      noise is on, it's also added here, 0%..100%
 |      |   How many visits are processed by other threads when this is printed.
 |     Number of visits. The move with maximum visits is chosen for play.
Move

* U = P * Cpuct * sqrt(sum of N of all moves) / (N + 1)
  CPuct is a search parameter, can be changed with a command line flag.
* The move with largest Q+U will be visited next
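As a rough illustration of the two bullet points above, the following sketch computes U from the stated formula and picks the move with the largest Q+U. The function names and the sample numbers are made up for the example, and the real search in lc0 may differ in details (for instance in how unvisited moves get their Q), so treat it as a reading aid rather than the engine's code.

```python
import math


def puct_u(prior, n_visits, total_visits, cpuct):
    """U = P * Cpuct * sqrt(sum of N of all moves) / (N + 1), as written above."""
    return prior * cpuct * math.sqrt(total_visits) / (n_visits + 1)


def next_move_to_visit(children, cpuct):
    """Return the move with the largest Q+U.

    children: dict mapping move -> (prior, n_visits, q), where prior is in 0..1.
    """
    total_visits = sum(n for _, n, _ in children.values())

    def q_plus_u(move):
        prior, n_visits, q = children[move]
        return q + puct_u(prior, n_visits, total_visits, cpuct)

    return max(children, key=q_plus_u)


# Hypothetical numbers: equal priors and Q, but "b" has no visits yet,
# so its U term is larger and it gets visited next.
example = {"a": (0.5, 3, 0.0), "b": (0.5, 0, 0.0)}
print(next_move_to_visit(example, cpuct=1.2))
```

This also shows why a low-policy move stays buried: with a small P its U shrinks quickly as N grows, so it needs a clearly better Q to keep attracting visits.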

Help wanted:

haleysa commented 6 years ago

Here's a game from the original thread that still applies today, as far as I understand it.

ID: QueenTrap
Game: https://lichess.org/efi0R82j#40
Bad move: 21. Qxg7 (g3g7) - SF9 eval on Lichess goes from +0.3 to -5.7 - Leela just lost a queen for a rook and 2 pawns.
Refutation: 21. .. Rg6 (e6g6) - SF9 eval -5.7; second best move is e4 (e5e4) eval +0.9.
Correct move: Re3 and h4 are both good moves leaving the position at +0.3.
Configuration: Original game: lczero v0.9 ID 271. Currently tested against lc0.exe lc0-win-20180711-cuda92-cudnn714 with test ID 10067 and ID 485

Comments: Leela has a few tactics that are harder than others, and a few technical aspects of positions can make UCT need more nodes to find the right move in certain cases (lots of potential moves, "refutation" is an "only move" where the eval changes greatly but only if you find the right refutation, etc.). This position combines a few of these and makes it very nasty for Leela to avoid the blunder.

Specifically, there are two potential ways to try to get the queen out after Rg6 that Leela at first thinks can work. There's 22. Qxh7, but that falls to 22. .. Rxg2+ and a discovered attack on the queen - a Leela weakness. There's also 22. Rxe5, which fails to 22. .. Rxg7 and white cannot take the black queen on f5 because that undefends black's mate threat (Rxe1#) -- setting up a recapture that doesn't work because of undefending a mate threat is another Leela tactical weakness.

ID485: Given the position after Qxg7, it has a very hard time finding the refutation. At go nodes 100000, it stops here:

info depth 4 seldepth 22 time 25573 nodes 54758 score cp -138 hashfull 248 nps 2141 pv e5e4 g7d7 h7h5 h2h4 b7b6 d3e4 d5e4 d7d4 f5g4 g2g3 b8b7 e2e3 g4f5 e1e2
info string e8h8  (103 ) N:       8 (+ 0) (P:  0.63%) (Q: -0.90013) (U: 0.55864) (Q+U: -0.34149) (V: -0.8641)
info string f5d3  (833 ) N:       8 (+ 0) (P:  0.71%) (Q: -0.91401) (U: 0.62326) (Q+U: -0.29075) (V: -0.9066)
info string f5f2  (839 ) N:       8 (+ 0) (P:  0.72%) (Q: -0.92058) (U: 0.63945) (Q+U: -0.28113) (V: -0.9237)
info string f5e4  (829 ) N:      10 (+ 0) (P:  0.83%) (Q: -0.91912) (U: 0.60155) (Q+U: -0.31757) (V: -0.9314)
info string e8g8  (102 ) N:      10 (+ 0) (P:  0.77%) (Q: -0.85824) (U: 0.55375) (Q+U: -0.30449) (V: -0.8424)
info string f5f7  (813 ) N:      10 (+ 0) (P:  0.87%) (Q: -0.93293) (U: 0.63007) (Q+U: -0.30286) (V: -0.9505)
info string e6h6  (548 ) N:      10 (+ 0) (P:  0.74%) (Q: -0.83100) (U: 0.53577) (Q+U: -0.29523) (V: -0.8802)
info string f5h3  (837 ) N:      10 (+ 0) (P:  0.84%) (Q: -0.88443) (U: 0.60417) (Q+U: -0.28025) (V: -0.9197)
info string f5f3  (835 ) N:      10 (+ 0) (P:  0.84%) (Q: -0.88704) (U: 0.60731) (Q+U: -0.27973) (V: -0.9309)
info string f5g4  (831 ) N:      11 (+ 0) (P:  0.93%) (Q: -0.93908) (U: 0.61512) (Q+U: -0.32396) (V: -0.9242)
info string f5g5  (826 ) N:      12 (+ 0) (P:  1.06%) (Q: -0.93925) (U: 0.64834) (Q+U: -0.29091) (V: -0.9350)
info string f5f8  (810 ) N:      18 (+ 0) (P:  0.83%) (Q: -0.64462) (U: 0.34707) (Q+U: -0.29754) (V: -0.3538)
info string e6d6  (545 ) N:      32 (+ 0) (P:  1.16%) (Q: -0.56372) (U: 0.28015) (Q+U: -0.28357) (V: -0.5490)
info string e8d8  (100 ) N:      34 (+ 0) (P:  1.34%) (Q: -0.59195) (U: 0.30425) (Q+U: -0.28770) (V: -0.5328)
info string e8c8  (99  ) N:      34 (+ 0) (P:  1.26%) (Q: -0.57447) (U: 0.28691) (Q+U: -0.28756) (V: -0.4978)
info string b8a8  (23  ) N:      37 (+ 0) (P:  0.97%) (Q: -0.48835) (U: 0.20258) (Q+U: -0.28577) (V: -0.3970)
info string f5f6  (818 ) N:      41 (+ 0) (P:  1.69%) (Q: -0.60927) (U: 0.32103) (Q+U: -0.28824) (V: -0.3617)
info string f5g6  (819 ) N:      44 (+ 0) (P:  1.61%) (Q: -0.56770) (U: 0.28439) (Q+U: -0.28331) (V: -0.2992)
info string b8a7  (30  ) N:      52 (+ 0) (P:  1.49%) (Q: -0.51040) (U: 0.22410) (Q+U: -0.28629) (V: -0.3719)
info string e6b6  (543 ) N:      66 (+ 0) (P:  1.80%) (Q: -0.50015) (U: 0.21384) (Q+U: -0.28631) (V: -0.5593)
info string e6c6  (544 ) N:      70 (+ 0) (P:  1.16%) (Q: -0.41705) (U: 0.13041) (Q+U: -0.28664) (V: -0.5058)
info string e6f6  (546 ) N:      76 (+ 0) (P:  1.11%) (Q: -0.39958) (U: 0.11489) (Q+U: -0.28468) (V: -0.3731)
info string e8f8  (101 ) N:     102 (+ 0) (P:  1.01%) (Q: -0.35908) (U: 0.07777) (Q+U: -0.28131) (V: -0.3605)
info string b7b5  (234 ) N:     111 (+ 0) (P:  1.97%) (Q: -0.42411) (U: 0.13982) (Q+U: -0.28428) (V: -0.3466)
info string d5d4  (761 ) N:     124 (+ 0) (P:  2.04%) (Q: -0.41484) (U: 0.12968) (Q+U: -0.28517) (V: -0.3185)
info string c7c5  (264 ) N:     136 (+ 0) (P:  2.57%) (Q: -0.43478) (U: 0.14947) (Q+U: -0.28531) (V: -0.2763)
info string e6g6  (547 ) N:     192 (+ 0) (P:  6.15%) (Q: -0.53725) (U: 0.25345) (Q+U: -0.28380) (V: -0.3435)
info string c7c6  (259 ) N:     193 (+ 0) (P:  2.87%) (Q: -0.40256) (U: 0.11765) (Q+U: -0.28490) (V: -0.3068)
info string a6a5  (425 ) N:     212 (+ 0) (P:  2.78%) (Q: -0.38876) (U: 0.10374) (Q+U: -0.28503) (V: -0.3065)
info string f5f4  (830 ) N:     214 (+ 0) (P:  1.69%) (Q: -0.34816) (U: 0.06250) (Q+U: -0.28567) (V: -0.3993)
info string b7b6  (230 ) N:     223 (+ 0) (P:  2.96%) (Q: -0.39055) (U: 0.10523) (Q+U: -0.28532) (V: -0.3035)
info string b8c8  (24  ) N:     367 (+ 0) (P:  4.36%) (Q: -0.37933) (U: 0.09426) (Q+U: -0.28507) (V: -0.2628)
info string h7h6  (400 ) N:     519 (+ 0) (P:  6.08%) (Q: -0.37837) (U: 0.09306) (Q+U: -0.28530) (V: -0.3446)
info string h7h5  (403 ) N:     557 (+ 0) (P: 10.51%) (Q: -0.43458) (U: 0.14987) (Q+U: -0.28472) (V: -0.2830)
info string f5h5  (827 ) N:     772 (+ 0) (P:  4.90%) (Q: -0.33454) (U: 0.05041) (Q+U: -0.28412) (V: -0.2358)
info string e8e7  (106 ) N:    1129 (+ 0) (P:  5.14%) (Q: -0.32378) (U: 0.03618) (Q+U: -0.28760) (V: -0.2906)
info string e6e7  (539 ) N:    1698 (+ 0) (P:  9.42%) (Q: -0.33019) (U: 0.04412) (Q+U: -0.28608) (V: -0.3310)
info string e5e4  (796 ) N:   47597 (+238) (P: 12.19%) (Q: -0.28812) (U: 0.00203) (Q+U: -0.28609) (V: -0.2730)
bestmove e5e4

It still has e6g6 buried pretty deep, and thus it's never avoiding this trap.
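To see how deep a move such as e6g6 is buried without reading the whole dump by eye, the verbose lines can be parsed and sorted by visits. The sketch below assumes the line layout shown in the log above; the regular expression, the function names and the field name `index` (the parenthesised number after the move, which the counter legend does not describe) are illustrative guesses, not part of lc0.

```python
import re

# One --verbose-move-stats line, following the layout of the log above.
STATS_RE = re.compile(
    r"info string (?P<move>\S+)\s+\(\s*(?P<index>\d+)\s*\)\s+"
    r"N:\s*(?P<n>\d+)\s+\(\+\s*(?P<in_flight>\d+)\)\s+"
    r"\(P:\s*(?P<p>[-\d.]+)%\)\s+"
    r"\(Q:\s*(?P<q>[-\d.]+)\)\s+"
    r"\(U:\s*(?P<u>[-\d.]+)\)\s+"
    r"\(Q\+U:\s*(?P<qu>[-\d.]+)\)\s+"
    r"\(V:\s*(?P<v>[-\d.]+)\)"
)


def parse_move_stats(log_text):
    """Return one dict per move line, sorted by visit count (most visited first)."""
    rows = []
    for line in log_text.splitlines():
        m = STATS_RE.search(line)
        if m is None:
            continue  # skip "info depth ..." and "bestmove ..." lines
        rows.append({
            "move": m["move"],
            "n": int(m["n"]),
            "p": float(m["p"]) / 100.0,  # policy as a fraction, not a percentage
            "q": float(m["q"]),
            "u": float(m["u"]),
            "v": float(m["v"]),
        })
    rows.sort(key=lambda row: row["n"], reverse=True)
    return rows


def visit_rank(rows, move):
    """1-based position of a move in the visit ordering, or None if absent."""
    for rank, row in enumerate(rows, start=1):
        if row["move"] == move:
            return rank
    return None


# Three lines copied from the log above.
SAMPLE = """\
info string e6g6  (547 ) N:     192 (+ 0) (P:  6.15%) (Q: -0.53725) (U: 0.25345) (Q+U: -0.28380) (V: -0.3435)
info string h7h5  (403 ) N:     557 (+ 0) (P: 10.51%) (Q: -0.43458) (U: 0.14987) (Q+U: -0.28472) (V: -0.2830)
info string e5e4  (796 ) N:   47597 (+238) (P: 12.19%) (Q: -0.28812) (U: 0.00203) (Q+U: -0.28609) (V: -0.2730)
bestmove e5e4
"""

rows = parse_move_stats(SAMPLE)
print(visit_rank(rows, "e6g6"), rows[0]["move"])
```

Running the same parser on the dumps from successive `go nodes` runs makes it easy to track when the refutation starts climbing in the visit ordering.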

ID 10067: Slightly better. At ~96000 nodes, it sees that e6g6 is winning:

It still takes 192K nodes from the blunder position to avoid g3g7

info depth 4 seldepth 24 time 133477 nodes 182710 score cp 33 hashfull 726 nps 1368 pv g3g7 e5e4 d3e4 d5e4 e2e3 e6g6 g7d4 g6d6 d4b4 d6e6 b4c4 h7h5 c4e2 h5h4 h2h3 f5f4
info depth 4 seldepth 24 time 138569 nodes 191202 score cp 33 hashfull 752 nps 1379 pv g3g7 e5e4 d3e4 d5e4 e2e3 e6g6 g7d4 g6d6 d4b4 d6e6 b4c4 h7h5 c4e2 h5h4 h2h3 f5f4
info depth 4 seldepth 24 time 139703 nodes 192860 score cp 34 hashfull 758 nps 1380 pv h2h3 g7g5 f2f3 e6e7 e2e3 f5f6 c2c3 h7h5 d3d4 e5e4 f3e4 d5e4 e1f1 f6g6

I find it interesting that the test nets have similar tactical weaknesses to the original net - which suggests they are just "hard" for the NN to learn? Perhaps starting at a larger net will help with that, perhaps certain positions just require a lot of nodes to overcome tactical weaknesses. I don't think there's a "bug" here to fix, but it's an instructive position.

dubslow commented 6 years ago

I'm hoping that upping the training cpuct from 1.2 will help prevent this. These searches used the default lc0 search parameters, right?

mooskagh commented 6 years ago

I read that it is reproducible on all networks, including main nets which are trained with "high" cpuct, so there is no evidence that a different cpuct would help. I don't object to changing cpuct for training runs, but I don't expect it to change things much.

dubslow commented 6 years ago

including main nets which are trained with "high" cpuct

there are no such nets which are as strong as recent main or test nets

haleysa commented 6 years ago

This was all done with default search parameters. And yes, it's been a problem for hundreds of networks, including several of the test nets (I haven't tested them all). There are a few simpler positions from the original thread that showcase individual tactical weaknesses. They have shown improvement since about ID 450? Instead of the right move, or the refutation move, being policy 0.2% or worse, they're up over 1% and thus UCT can find them in 1-2k nodes, which is reasonable. But with a complex situation like this one, it has to overcome a bad policy down several lines, and the difference in eval is high so it takes a lot of nodes to move the eval enough even after hitting on the right moves. I'm happy to rerun some results using a non-default cpuct if it's thought to be useful information, but I think overall the position is exposing some positions that leela's current structure is tactically weak at and UCT is weaker at finding. Maybe different training parameters would result in a net that better handles these tactics, maybe a larger network size could encapsulate the tactics required better, or maybe it's just tactics that don't get learned until further in the learning process.

DidzisC commented 6 years ago

Got a blunder in a test game today. Time control was 30 moves in 60 minutes: lc0_0712_ispc.exe vs Stockfish 8. (on rather slow and old pc). Main network 483. --cpuct=1.2 --fpu-reduction=.2 --policy-softmax-temp=1 for the search to have the same parameters with lczero and --max-prefetch=0 --minibatch-size=4 for the blas backend

https://lichess.org/MoD5vHSJ

  1. Qb4 was really horrible. A real ?? move. For some reason, lc0 was expecting c4 as a response. Got c3 and took it, and the game was over.

GeorgeMJ23 commented 6 years ago

Test10 ID 10104: in the following game it played 48...h5??, missing the tactic Rf8+ Kh7 Qc2+ Be4 Qxe4 Rxe4 (removing the pin; this is one of the two classic themes of Leela's tactical blunders) gxh3.

[Event "Tour12 10104"]
[Site "Terminator PC"]
[Date "2018.07.21"]
[Round "20"]
[White "Wasp 3.0"]
[Black "Lc0 Test10 10104"]
[Result "1-0"]
[ECO "C05"]
[WhiteElo "2200"]
[BlackElo "2200"]
[WhiteType "program"]
[BlackType "program"]
[Opening "French"]
[Variation "Tarrasch, Closed, Nunn-Korchnoi Gambit, 4.e5 Nfd7 5.Bd3 c5 6.c3 Nc6 7.Ngf3 Qb6 8.O-O"]
[Time "11:03:20"]
[TimeControl "40/120:40/120:40/120"]
[Termination "normal"]
[PlyCount "108"]

1. e4 e6 2. d4 d5 3. Nd2 Nf6 4. e5 Nfd7 5. c3 c5 6. Bd3 Nc6 7. Ngf3 {+0.07/16 3}
7... Qb6 {+0.31/2 5} 8. O-O {-0.10/18 3} 8... cxd4 {+0.22/2 3} 9. cxd4 {0.00/20
3} 9... Nxd4 {+0.16/2 4} 10. Nxd4 {0.00/20 3} 10... Qxd4 {+0.15/2 1} 11. Nf3
{0.00/19 3} 11... Qb6 {+0.13/2 1} 12. Qa4 {0.00/18 3} 12... Qb4 {-0.02/2 5} 13.
Qc2 {-0.03/18 3} 13... Qc5 {-0.06/2 7} 14. Qb1 {+0.02/18 4} 14... Qc7 {+0.10/2
4} 15. Bf4 {+0.12/17 3} 15... h6 {+0.15/2 2} 16. Rc1 {-0.14/16 3} 16... Qd8
{+0.16/2 2} 17. Be3 {-0.02/15 3} 17... Be7 {+0.29/2 3} 18. Qc2 {-0.22/15 3}
18... O-O {+0.30/2 2} 19. Qa4 {0.00/17 3} 19... f6 {+0.47/2 3} 20. exf6
{-0.54/16 3} 20... Nxf6 {+0.40/2 3} 21. Ne5 {0.00/16 3} 21... Bd6 {+0.39/2 2}
22. f4 {0.00/16 3} 22... Bd7 {+0.93/2 3} 23. Qd1 {-0.33/17 3} 23... Be8 {+1.12/2
3} 24. Qe1 {-0.44/17 3} 24... Bxe5 {+1.44/2 3} 25. fxe5 {-0.43/17 1} 25... Ne4
{+1.52/2 2} 26. Qb4 {-0.34/18 3} 26... Bc6 {+1.93/2 3} 27. Rf1 {-0.42/18 3}
27... Qh4 {+1.89/2 5} 28. Rxf8+ {-0.24/19 4} 28... Rxf8 {+1.63/2 3} 29. Bxa7
{-0.25/18 3} 29... Ra8 {+1.45/2 6} 30. Be3 {0.00/17 3} 30... Ra4 {+1.23/2 3} 31.
Qb3 {+0.14/18 3} 31... Qh5 {+1.34/2 3} 32. Rc1 {0.00/18 3} 32... Ra5 {+1.16/2 1}
33. Bxe4 {0.00/18 3} 33... Rb5 {+1.01/2 2} 34. Bh7+ {+0.11/19 3} 34... Kxh7
{+0.89/2 1} 35. Qc3 {+0.14/19 3} 35... d4 {+0.83/2 3} 36. Bxd4 {+0.52/18 3}
36... Kg8 {+0.72/2 5} 37. Qc2 {+0.58/17 3} 37... Ra5 {+1.22/2 3} 38. a3
{+0.59/19 3} 38... Ra4 {+1.00/2 3} 39. Rd1 {+0.69/18 3} 39... Rc4 {+0.84/2 2}
40. Bc3 {+0.37/17 3} 40... Rg4 {+0.79/2 0} 41. Rd8+ {+0.53/17 2} 41... Kf7
{+0.73/2 2} 42. Rd2 {+0.53/17 2} 42... Kg8 {+0.76/2 7} 43. Rf2 {+0.60/17 2}
43... Be4 {+0.63/2 7} 44. Qe2 {+0.72/18 2} 44... Bc6 {+0.58/2 7} 45. Qd3
{+0.76/18 2} 45... Be4 {+0.61/2 4} 46. Qd2 {+0.68/18 3} 46... Qh3 {+0.68/2 4}
47. Qe2 {+0.84/17 2} 47... Bd5 {+0.67/2 5} 48. Bb4 {+0.81/19 2} 48... h5
{+0.76/2 5} 49. Rf8+ {+4.27/22 2} 49... Kh7 {+1.11/2 0} 50. Qc2+ {+4.40/23 2}
50... Be4 {-12.35/2 5} 51. Qxe4+ {+4.40/21 1} 51... Rxe4 {-16.21/2 3} 52. gxh3
{+4.51/21 1} 52... Rxe5 {-18.24/3 4} 53. Bc3 {+4.60/21 2} 53... Rg5+ {-19.99/2
5} 54. Kh1 {+4.69/22 2} 54... b5 {-18.33/2 5 Black resigns} 1-0

Hardware for Leela was GTX 1070 Ti, Lc0(1 July) was used, test10 ID 10104 and time control 40/2 repeating.

Leela was just unlucky, as the following analysis of the position shows (analysis of the FEN, not the PGN, but with PGN analysis it also avoids h5 in exactly 8 seconds).
In the game it played after 5 seconds, while it needs 8 seconds on this hardware to avoid h5. Yet such an EXTREMELY simple tactic for 2018 engines should be solvable in less than 0.1 second.

Lc0 Test10 10104:

 1/10   00:00    2,156  2,949   -1,01   h6-h5 Bb4-c3 Rg4-g6 Qe2-c2 Rg6-g4 Qc2-e2 Rg4-g6
 1/11   00:01    5,138  3,450   -0,90   h6-h5 Bb4-d2 Bd5-c6 Bd2-c3 Bc6-d5 Qe2-d2 b7-b5 Qd2-e2 Rg4-g6
 1/12   00:01    6,288  3,554   -0,84   h6-h5 Bb4-d2 b7-b5 Bd2-b4 Rg4-g6 Qe2-c2 Kg8-h7 Rf2-e2 Qh3-g4
 1/13   00:03    11,873 3,731   -0,83   h6-h5 Bb4-d2 Bd5-c6 Bd2-c3 Bc6-d5 Qe2-d2 b7-b5 Qd2-e2 Rg4-g6 Qe2xb5
 1/17   00:05    22,468 3,820   -0,65   h6-h5 Bb4-d2 Bd5-c6 Bd2-c3 Bc6-d5 Qe2-d2 b7-b5 Qd2-e2 Rg4-g6 Qe2xb5
 2/17   00:08    35,046 3,981   -0,65   h6-h5 Bb4-d2 Bd5-c6 Bd2-c3 Bc6-d5 Qe2-d2 b7-b5 Qd2-e2 Rg4-g6 Qe2xb5
 2/17   00:09    38,145 4,008   -0,72   Kg8-h8 Bb4-d2 Kh8-g8 Bd2-b4 Rg4-g6 Qe2-c2 Rg6-g4 Rf2-e2 Bd5-e4
 2/18   00:10    43,686 4,125   -0,69   Kg8-h8 Bb4-c3 Kh8-h7 Qe2-c2+ Kh7-g8 Qc2-e2 Rg4-g5 a3-a4 b7-b6 Bc3-d2
 2/19   00:13    55,553 4,131   -0,49   Kg8-h8 Bb4-c3 Kh8-h7 Qe2-c2+ Kh7-g8 Qc2-e2 Rg4-g5 a3-a4 b7-b6 Bc3-d2

MartinRe63 commented 6 years ago

Net-520-20180727
https://lichess.org/ol6H1KUy#52
Leela didn't find Bf2 (Stockfish at depth 33).
go nodes 10 -> d1d2 (Stockfish ~ -5)
go nodes 10000 -> g1h2 (Stockfish ~ -1)
g1f2 is already in the list of possible moves and is calculated with a Q value similar to that of g1h2.

Blunder? Assumption: Once more learning is done lc0 will learn this better.

frpays commented 6 years ago

ID: TCEC-13.23.2.1
Network ID 10161
Lc0 16
Nodes: 539702
Move time: 41s

During TCEC. Leela versus Senpai.

It's not a blunder but a spike of evaluation from 0.81 to 4.71 then back to 0.33.

  1. d4 Nf6 2. c4 e6 3. Nf3 Be7 4. Qc2 d5 5. cxd5 exd5 6. Bg5 Bg4 7. Nc3 Nbd7 8. e3 O-O 9. Bd3 h6 10. Bh4 c5 11. dxc5 Bxc5 12. Rd1 Qa5 13. O-O d4 14. exd4 Bxf3 15. gxf3 Bxd4 16. Bf5 Bxc3 17. bxc3 Qb5 18. Kg2 Rae8 19. Rd6 Qc4 20. Bg3 Ne5 21. Rfd1 Qc5 22. Bf4 Nh5 23. Bg3 Nf6 24. R1d4 b5 25. Bd3 a5 26. Be2 b4 27. Qf5 bxc3 28. Bd1 Ned7 29. Qxc5 Nxc5 30. Rc4 Ne6 31. Rxc3 Nh5 32. Kf1 Nhf4 33. Bb3 Rc8 34. Rxc8 Rxc8 35. Ke1 Rc5 36. Kd2 g6 37. a4 Rf5 38. Rb6 Nd4 39. Ke3 Nxb3 40. Rxb3 Ng2+ 41. Ke4 Rh5 42. Kd4 Rf5 43. Kc4 *

The Eval spiked at move 27. Qf5.

https://pasteboard.co/HydTAQf.png

You can try to reproduce with:

./lc0 --nncache=2000000 --verbose-move-stats

position fen 4rrk1/pp3pp1/3R1n1p/2q1nB2/5B2/2P2P2/P1Q2PKP/3R4 b - - 10 22 moves f6h5 f4g3 h5f6 d1d4 b7b5 f5d3 a7a5 d3e2 b5b4
go nodes 540000

frpays commented 6 years ago

ID: TCEC-13.23.2.2
Network ID 10161
Lc0 16
Rxc3 {d=6, sd=28, mt=24878, tl=144133, s=59684, n=1484841, pv=Rxc3 Kxh6 Kg3 Kg5 Rc5 Kf6 Kf4 Kg6 Rc8 Kg7 Rc6 Kf7 Rh6 Kg7 Rh5 Kg6 Rh4 Kf6 Rh8 Kg6 Rf8 h2 Rh8 Kf6 Rxh2 Ke6, tb=0, h=100.0, ph=0.0, wv=6.34, R50=50, Rd=-11, Rr=-9, mb=-3+0+1+1+0,}

Bad move: 79. Rxc3
Correct move: anything that does not exchange something, e.g. moving the bishop somewhere out of reach of both K and N.

[Event "TCEC Season 13 - Division 4"]
[Site "http://tcec.chessdom.com"]
[Date "2018.08.08"]
[Round "23.2"]
[White "LCZero 16.10161"]
[Black "Senpai 2.0"]
[Result "1/2-1/2"]
[BlackElo "3062"]
[ECO "E10"]
[GameDuration "01:22:41"]
[GameEndTime "2018-08-08T14:23:19.344 W. Europe Standard Time"]
[GameStartTime "2018-08-08T13:00:37.643 W. Europe Standard Time"]
[Opening "Queen's pawn game"]
[PlyCount "158"]
[Termination "adjudication"]
[TerminationDetails "SyzygyTB"]
[TimeControl "1800+10"]
[WhiteElo "3219"]

  1. d4 Nf6 2. c4 e6 3. Nf3 Be7 4. Qc2 d5 5. cxd5 exd5 6. Bg5 Bg4 7. Nc3 Nbd7 8. e3 O-O 9. Bd3 h6 10. Bh4 c5 11. dxc5 Bxc5 12. Rd1 Qa5 13. O-O d4 14. exd4 Bxf3 15. gxf3 Bxd4 16. Bf5 Bxc3 17. bxc3 Qb5 18. Kg2 Rae8 19. Rd6 Qc4 20. Bg3 Ne5 21. Rfd1 Qc5 22. Bf4 Nh5 23. Bg3 Nf6 24. R1d4 b5 25. Bd3 a5 26. Be2 b4 27. Qf5 bxc3 28. Bd1 Ned7 29. Qxc5 Nxc5 30. Rc4 Ne6 31. Rxc3 Nh5 32. Kf1 Nhf4 33. Bb3 Rc8 34. Rxc8 Rxc8 35. Ke1 Rc5 36. Kd2 g6 37. a4 Rf5 38. Rb6 Nd4 39. Ke3 Nxb3 40. Rxb3 Ng2+ 41. Ke4 Rh5 42. Kd4 Rf5 43. Kc4 h5 44. Bc7 Kh7 45. f4 Nxf4 46. Rb5 Rf6 47. Bxa5 Ne2 48. Bb6 Rf4+ 49. Kb3 Nc1+ 50. Ka3 Rf3+ 51. Kb2 Nd3+ 52. Kc2 Nxf2 53. a5 Ne4 54. Bd4 Rh3 55. Rb8 g5 56. a6 Rxh2+ 57. Kd3 Ra2 58. a7 f5 59. a8=Q Rxa8 60. Rxa8 h4 61. Ke3 Kg6 62. Rf8 Ng3 63. Kf3 Ne4 64. Be5 Nd2+ 65. Kg2 Ne4 66. Bc7 g4 67. Bf4 Nc5 68. Rh8 h3+ 69. Kh2 Kg7 70. Rb8 Ne4 71. Rb6 Kh7 72. Bc1 Nf2 73. Ra6 Ne4 74. Bh6 Nc5 75. Rf6 Ne4 76. Ra6 Nc5 77. Rf6 Ne4 78. Rc6 Nc3 79. Rxc3 Kxh6 1/2-1/2

(Tablebase draw)

MartinRe63 commented 6 years ago

ID: TCEC - Season 13 - Division 3 - Game 19.1

Black: LC0 0.16.1 (TCEC version) NET 10520

Move 29...Bg4 is wrong.

NET 10776 with release 16.0 finds Bc8 after 3.000.000 moves. Why did version 16.1 with NET 10520 not find this solution?

  1. e4 e5 2. Nf3 Nf6 3. d4 Nxe4 4. Bd3 d5 5. Nxe5 Nd7 6. Nxd7 Bxd7 7. O-O Qf6 8. Bxe4 dxe4 9. Nc3 Qg6 10. Nxe4 h5 11. Bf4 O-O-O 12. Qd3 h4 13. f3 Qf5 14. Be3 Qg6 15. d5 h3 16. g3 Re8 17. Rae1 Rh5 18. c4 Rhe5 19. Bd4 R5e7 20. Qa3 a6 21. Qa5 Bf5 22. Bc5 Re5 23. Bxf8 Rxf8 24. Nc3 Rfe8 25. Rxe5 Rxe5 26. Qc5 Qf6 27. Rd1 b6 28. Qf8+ Kb7 29. c5 Bg4 30. c6+ Ka7 31. Rd3 Rxd5 32. Nxd5 Qxb2 33. Ne3 Qc1+ 34. Nd1 Qc2 35. Nf2 Be6 36. Qxg7 Qc1+ 37. Rd1 Qxc6 38. Qf6 Qc4 39. Qe5 a5 40. a3 Ka6 41. g4 Qb3 42. Qe4 Qxa3 43. Qa8+ Kb5 44. Rb1+ Bb3 45. Qd5+ 1-0

GeorgeMJ23 commented 6 years ago

In the final position of the following PGN Leela as black has just lost a Knight. And the position is dead lost for black. So she should give a big positive score(test10 nets do that, for example 11089 gives +5.00 scores). Yet, she doesn't seem to care and gives d5 with only a tiny plus score(+0.60) for white like it's ok.

Leela = Lc0v17 cuda default settings, 20230 net, with GTX 1070 Ti in infinite analysis mode.

Lc0v17 20230:
 5/13   00:01    5,730  3,804   +0,29   d6-d5 Qe4-e5 0-0 Nb1-c3 c7-c6 Qe5-g3 Be7-d6 Qg3-h4 Qd8xh4
 6/13   00:02    9,023  4,028   +0,30   d6-d5 Qe4-f4 0-0 Nb1-c3 Nb8-c6 d2-d4 Nc6-b4 Bf1-d3 Nb4xd3+ 
 6/14   00:02    11,712 4,132   +0,32   d6-d5 Qe4-e5 0-0 Nb1-c3 Nb8-c6 Qe5xd5 Qd8-e8 Bf1-c4 Nc6-b4 
 ..............
 ..............
 12/28  01:10    333,978    4,750   +0,58   d6-d5 Qe4-e5 Nb8-c6 Qe5xg7 Be7-f6 Qg7-h6 Qd8-e7+ 
 12/29  01:30    425,028    4,721   +0,61   d6-d5 Qe4-e5 Nb8-c6 Qe5xg7 Be7-f6 Qg7-h6 Qd8-e7+ 

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "C42"]
[WhiteElo "2400"]
[BlackElo "2000"]
[PlyCount "11"]
[TimeControl "900+5"]

  1. e4 e5 2. Nf3 Nf6 3. Nxe5 d6 4. Nf3 Nxe4 5. Qe2 Be7 6. Qxe4 *

Yet perhaps this is not a real issue, and test20 just has a very different mapping of the evaluation scores, so that this tiny-looking +0.60 corresponds to +5.0 for test10 nets. Before the capture she evaluated the position as +0.13, so after forcing 5...Be7??, which allows the Knight to be captured for free, she does feel her position got a lot worse. Perhaps this will improve with time and these tiny scores in a LOST position will get a lot larger. It will be an issue if that doesn't happen.

Furthermore, the 20230 net doesn't want to play the 5...Be7?? move that gives away the Knight even for a moment. Three nets earlier, the 20227 net (fischerandom got this game by playing against it at 2000 nodes per move) wanted to play the abysmal Be7 at 2000 nodes, but after 20000 nodes it stabilizes on the correct 5...Qe7, not giving up the Knight! But the 20227 net was just recovering from the "big spike", so I guess that's normal. So I guess it's not a real issue; only the very tiny eval in a dead lost position is....

oscardssmith commented 6 years ago

Some default settings are not good for high node counts. cPUCT in particular should be higher than default. Have you tried using the CCCC settings (but with table) and seeing if that fixes it?

GeorgeMJ23 commented 6 years ago

About the game Leela-Fizbo at CCCC:

[Event "CCCC 1: Rapid Rumble (15|5) Stage 1"]
[Site "Chess.com"]
[Date "2018.09.09"]
[Round "?"]
[White "Lc0 17.11089"]
[Black "Fizbo 1.9"]
[Result "1-0"]
[ECO "D31"]
[WhiteElo "2400"]
[BlackElo "2400"]
[PlyCount "254"]

1. d4 d5 2. c4 e6 3. Nc3 c6 4. e4 dxe4 5. Nxe4 Bb4+ 6. Nc3 Nf6 7. a3 Bxc3+ 8.
bxc3 Nbd7 9. Nf3 O-O 10. a4 c5 11. a5 b6 12. Bd3 bxa5 13. O-O Bb7 14. Re1 Rc8
15. Ne5 cxd4 16. cxd4 Nxe5 17. Rxe5 Ba6 18. Ra4 Nd7 19. Bg5 Qb6 20. Re1 Qb7 21.
Be4 Qc7 22. c5 Rfe8 23. Ra3 Nf8 24. Qa4 Bb7 25. Bf4 Qxf4 26. Bxb7 Rcd8 27. Rd1
Re7 28. Bf3 Ng6 29. g3 Qc7 30. Rb3 Rc8 31. h4 Rd7 32. Rb7 Qxb7 33. Bxb7 Rxb7
34. Qxa5 Ne7 35. Qa6 Rcc7 36. Qa2 Nd5 37. Rb1 h5 38. Rxb7 Rxb7 39. Qa6 Rc7 40.
f3 Kh7 41. g4 hxg4 42. fxg4 g6 43. Kf2 Kg7 44. Qb5 Rc8 45. Qb7 Rc7 46. Qb8 Rd7
47. Qb5 Re7 48. Kg3 Rc7 49. Kf3 Kg8 50. Kf2 Kg7 51. Kg3 Rc8 52. Qb7 Rc7 53. Qb5
Rc8 54. Qb7 Rc7 55. Qb8 Rd7 56. Qc8 Nf6 57. g5 Ne4+ 58. Kh3 Rxd4 59. c6 Rd3+
60. Kg2 Nc3 61. c7 Rd2+ 62. Kh1 Rd1+ 63. Kh2 Rd2+ 64. Kg1 Rd1+ 65. Kh2 Rd2+ 66.
Kg3 Rd3+ 67. Kg2 Rd2+ 68. Kf3 Rd3+ 69. Kf2 Rd2+ 70. Ke1 Rd1+ 71. Kf2 Rd2+ 72.
Ke3 Rh2 73. Qf8+ Kxf8 74. c8=Q+ Kg7 75. Qxc3+ Kg8 76. Qd4 a5 77. Kf3 a4 78.
Qxa4 Rh1 79. Kg2 Rb1 80. Qd7 Rb2+ 81. Kf3 Rb3+ 82. Kf2 Rb2+ 83. Ke3 Rb3+ 84.
Ke2 Rb2+ 85. Kf1 Rb1+ 86. Ke2 Rh1 87. Qd4 e5 88. Qa4 Rh3 89. Kf2 Kg7 90. Qe4
Kg8 91. Kg2 Ra3 92. Qc6 Ra2+ 93. Kg3 e4 94. Qe8+ Kg7 95. Qe5+ Kg8 96. Qxe4 Ra3+
97. Kg4 Ra6 98. Qe8+ Kg7 99. Kf3 Re6 100. Qc8 Re7 101. Qc3+ Kg8 102. Qc6 Re6
103. Qa8+ Kg7 104. Kf2 Rd6 105. Qa1+ Kg8 106. Ke3 Re6+ 107. Kf4 Kh7 108. Qd4
Kg8 109. Kg4 Kh7 110. Qd8 Kg7 111. Qa8 Rb6 112. Qa1+ Kg8 113. Qa8+ Kg7 114.
Qa1+ Kg8 115. Qa5 Rd6 116. Qa8+ Kg7 117. Qa1+ Kg8 118. Qa8+ Kg7 119. Qa1+ Kg8
120. Kf4 Re6 121.Qb2 Kh7 122.Qb8 Kg7 123.Qc8 Rd6 124.Qc4+ K8 125.Qc8+ Kg7 
126.Qc3+ Kg8 127.Qc7 Re6 128.Qc8+ Kg7 129.Qb8 Ra6 130.Qe5+ Kg8 131.Qb8+ Kg7 
132.Qe5+ Kg8 133. Qc7 Re6 134. Qc8+ Kg7 135. Qa8 Re1 136. Qf3 Re6 137. Qb7
Kg8 138. Qc7 Kg7 139. Qd8 Ra6 140. Qd4+ Kg8

Leela = Lc0v17 cuda default settings, 11089 net, with GTX 1070 Ti, WITH 3,4,5,6 syzygy TBs, in infinite analysis mode.

Leela with TBs would play 96.h5 and would win easily. But while Leela with TBs avoids 120.h5??, the drawing move that Leela played at CCCC, and prefers 120.Kf4, the problem returns after following Leela's recommendations for both players: 120...Re6 121.Qb2 Kh7 122.Qb8 Kg7 123.Qc8 Rd6 124.Qc4+ K8 125.Qc8+ Kg7 126.Qc3+ Kg8 127.Qc7 Re6 128.Qc8+ Kg7 129.Qb8 Ra6 130.Qe5+ Kg8 131.Qb8+ Kg7 132.Qe5+ Kg8. Now Leela wants to play 133.h5?? again, which just draws. This is with 3,4,5,6 TBs!!

It keeps 133.h5 up to 1.300.000 nodes with around 60.000 TB hits and then goes to 133.Qb5 and then to 133.Qc7 with around 230.000 TB hits.

Lc0v17 11089:
 8/17   00:05    54,696 9,500   +7,49   h4-h5 g6xh5 Qe5-b8+ Kg8-g7 Qb8-b2+ Kg7-g8 Qb2-b8+ Kg8-g7 Qb8-e5+ Kg7-g8 
 8/17   00:10    109,596    10,175  +7,32   h4-h5 g6xh5 Qe5-b8+ Kg8-g7 Qb8-e5+ Kg7-g8 Qe5-b8+ Kg8-g7 Qb8-b2+ 
 8/18   00:12    124,278    10,273  +7,38   h4-h5 g6xh5 Qe5-b8+ Kg8-g7 Qb8-b2+ Kg7-g8 Qb2-b5 Ra6-e6 Qb5-b8+ 
.............................................................
 10/23  01:55    1,312,493  11,375  +6,86   h4-h5 g6xh5 Qe5-b8+ Kg8-g7 Qb8-e5+ Kg7-g8 Qe5-b8+ Kg8-g7 Qb8-b2+ 
 10/23  01:59    1,349,621  11,322  +7,12   Qe5-c7 Ra6-e6 Qc7-c8+ Kg8-g7 Kf4-f3 Re6-d6 Qc8-c3+ Kg7-g8 h4-h5 
 10/23  02:04    1,406,903  11,327  +7,07   Qe5-c7 Ra6-e6 Qc7-c8+ Kg8-g7 Kf4-f3 Re6-d6 Qc8-c3+ Kg7-g8 h4-h5 
 10/24  02:05    1,416,822  11,323  +7,07   Qe5-c7 Ra6-e6 Qc7-c8+ Kg8-g7 Kf4-f3 Re6-d6 Qc8-c3+ Kg7-g8 h4-h5 
.............................................................
 12/28  04:14    2,842,807  11,149  +6,93   Qe5-c7 Ra6-e6 Qc7-c8+ Kg8-g7 Qc8-a8 Re6-e1 h4-h5 g6xh5 Qa8-a6 
 12/28  06:43    4,385,894  10,870  +6,93   Qe5-c7 Ra6-e6 Qc7-c8+ Kg8-g7 Qc8-a8 Re6-e1 h4-h5 g6xh5 Qa8-a6 

After 133.Qc7, again following Leela's recommendations for both players: 133...Re6 134.Qc8+ Kg7 135.Qa8 Re1. Now it wants to play 136.h5 again, even after 3.500.000 nodes and 190.000 TB hits, but avoids it after 3.600.000 nodes and wants to play 136.Qf3. Then after 136...Re6 137.Qb7 Kg8 138.Qc7 Kg7 139.Qd8 Ra6 140.Qd4+ Kg8 only two moves still win for White, 141.Ke4 or 141.Ke5, since the 146th move is approaching with the 50-move-rule draw.
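The move-146 deadline mentioned above can be checked with a little arithmetic. The sketch assumes that 96.Qxe4 from the PGN above is the last capture or pawn move in this line (no later one appears in the moves quoted); the function name is made up for the illustration.

```python
def fifty_move_deadline(last_reset_move, reset_by_white):
    """Return (move_number, side) of the move that completes 100 plies
    without a capture or pawn move, counted from the last resetting move.

    last_reset_move: full-move number of the last capture or pawn move.
    reset_by_white: True if White made that move, False if Black did.
    """
    # 0-based ply index of the resetting move (White's move 1 is ply 0).
    reset_ply = 2 * (last_reset_move - 1) + (0 if reset_by_white else 1)
    deadline_ply = reset_ply + 100  # the halfmove clock reaches 100 on this ply
    move_number = deadline_ply // 2 + 1
    side = "white" if deadline_ply % 2 == 0 else "black"
    return move_number, side


# Assumption: the last capture was 96.Qxe4, played by White.
print(fifty_move_deadline(96, reset_by_white=True))  # (146, 'white')
```

Under that assumption White's 146th move is the hundredth ply without progress, which is why only a handful of moves remain at move 141 to force a capture or pawn move.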

Yet Leela after 4.200.000 nodes and 250.000 TB hits does not find either and wants to play Qd8+ that just draws! Is there any chance the TB implementation is broken? Maybe not since Stockfish has issues in recognizing this draw quickly and initially thought 141.Qd8+ wins also.

Lc0v17 11089:


 12/27  05:18    4,121,467  12,931  +6,64   Qd4-d8+ Kg8-g7 Qd8-e8 Ra6-e6 Qe8-a8 Re6-e1 h4-h5 g6xh5 Qa8-a5 Re1-e6 Qa5-c3+ Kg7-g8 Qc3-c7 Re6-g6 Kf4-f5 Kg8-g7 Qc7-e5+ Kg7-h7 Qe5-e8 Kh7-g7 Qe8-e2

mooskagh commented 4 years ago

No blunders anymore.