official-stockfish / Stockfish

A free and strong UCI chess engine
https://stockfishchess.org/
GNU General Public License v3.0

Specialized NNUE trained on endgames? #3449

Closed Mr-Twave closed 3 years ago

Mr-Twave commented 3 years ago

I wanted to know whether anyone has looked deeply into training specialized lightweight NNUE nets for 9-10 man endgames and/or endgames with certain pieces. Such a net, or an ensemble of them, could solve many of Stockfish's evaluation problems. It puzzles me why people still spend massive amounts of resources testing net improvements that seem to barely gain any Elo when one could simply create other nets that do some jobs better than the main net.

Obviously NNUE doesn't perform so well in positions with unusual/adversarial piece counts (e.g. 3+ knights, 2+ queens, tripled or quadrupled pawns locked against an opponent's pawn), so it seems to me the logical way to solve these problems is to let go of the idea that we need a "generalized net" and instead create specialized nets for unusual circumstances.

Some endgames qualify as unusual circumstances.

https://www.chess.com/computer-chess-championship#event=eco-megamatch-2-part-2&game=990

5k2/8/p7/r1p2p1B/P1p5/K1P4P/2P3P1/8 w - - 0 53
It takes a decent amount of Stockfish search, plus tablebases, to evaluate this position as winning.

7k/5rr1/2P4p/7P/6P1/3Q3K/8/8 b - - 0 116
Stockfish seems to find good defensive lines for Black, but it needs 6- or 7-man tablebases to produce a decent evaluation and decent play.

3Q1n1k/6p1/8/5r1p/6PP/5P2/6K1/8 b - - 0 61
Lc0 knows there are ways to create fortresses here, while Stockfish has trouble evaluating this position well enough to guarantee a win. Tablebases help, but not significantly.

You may also note that many rook and pawn endgames still elude Stockfish, while Lc0 seems to play for more solid wins in those positions, as seen in several TCEC superfinals, regardless of whether NNUE was used or not.

OCB (opposite-coloured bishop) endgames have also proven to be a notable issue and could potentially be addressed with an OCB net.
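
A specialized OCB net would first need a test for when it applies. The following is a minimal sketch of such a test, assuming the position is available as a FEN string; the function names are hypothetical and this is not Stockfish code.

```python
def bishop_square_colours(fen: str) -> dict:
    """Return, per side, the square colours of its bishops (0 = light, 1 = dark)."""
    colours = {"w": [], "b": []}
    ranks = fen.split()[0].split("/")  # board field only, rank 8 first
    for row, rank in enumerate(ranks):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)  # a digit encodes a run of empty squares
                continue
            if ch in "Bb":
                side = "w" if ch == "B" else "b"
                # a8 is (row 0, col 0) and is a light square, so even sums are light
                colours[side].append((row + col) % 2)
            col += 1
    return colours


def is_opposite_coloured_bishops(fen: str) -> bool:
    """True when each side has exactly one bishop and they stand on different colours."""
    c = bishop_square_colours(fen)
    return len(c["w"]) == 1 and len(c["b"]) == 1 and c["w"][0] != c["b"][0]
```

Other pieces are ignored here; a real selector would probably also bound the remaining material, which is left open in this sketch.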

vondele commented 3 years ago

Yes, nets with different parameters for different material on the board have been tried, so far with no success.

Sopel97 commented 3 years ago

Having whole separate nets, one for piece_count >= 24, one for [16, 23], and one for < 16, yields +13 Elo at 20k nodes per move with pure NNUE eval, but the 5% slowdown completely offsets the gains. With a single split at 20 pieces the Elo gain is around 9, with a 3% slowdown, which is better but still too much.
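
The three-way split above can be sketched as follows. This is an illustration only: the thresholds (24 and 16) come from the comment, while the function names, the net names, and the choice to read the piece count from a FEN string (counting every piece, kings and pawns included) are assumptions made for the example.

```python
def piece_count(fen: str) -> int:
    """Count all pieces in the board field of a FEN (kings and pawns included)."""
    board = fen.split()[0]
    # Letters are pieces; digits are empty-square runs and '/' separates ranks.
    return sum(ch.isalpha() for ch in board)


def select_net(count: int) -> str:
    """Pick a (hypothetical) net name using the thresholds quoted in the comment."""
    if count >= 24:
        return "net_24_plus"
    if count >= 16:
        return "net_16_to_23"
    return "net_under_16"
```

For the first position quoted in this thread (`5k2/8/p7/r1p2p1B/P1p5/K1P4P/2P3P1/8 w - - 0 53`), `piece_count` returns 13, so this sketch would pick the under-16 net.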

vdbergh commented 3 years ago

What is the reason for the slowdown?

vondele commented 3 years ago

Probably (L1) cache related, as more network data needs to be loaded.

vdbergh commented 3 years ago

But if I read @Sopel97's write-up on NNUE (extremely nice BTW) correctly, the idea would be to keep the "big matrix" the same but change the small matrices depending on material. That doesn't look to me like it needs a lot of extra data. Is this what was tested?

Sopel97 commented 3 years ago

> What is the reason for the slowdown?

Having to refresh the accumulator when the piece count crosses the threshold for changing nets.

> But if I read @Sopel97's write up on NNUE (extremely nice BTW) the idea would be to keep the "big matrix" the same but change the small matrices depending on material. That doesn't look to me as needing lot of extra data. Is it this that was tested?

That's precisely what I'm doing now, see https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.l43b498nlpzf. Having 8 sets of small matrices doesn't slow the engine down, but increasing this to 16 causes a 2% slowdown for no further gain.
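
The difference from whole separate nets can be sketched like this: the accumulator produced by the shared "big matrix" is computed once and reused, and only the small layers are picked by material, so crossing a piece-count boundary needs no accumulator refresh. The code is a toy illustration, not the tested implementation: the even bucket mapping, the single dot-product "small layer", and all names are assumptions.

```python
NUM_BUCKETS = 8  # number of small-layer sets mentioned in the comment


def bucket_index(piece_count: int) -> int:
    """Map piece counts 1..32 evenly onto NUM_BUCKETS buckets (assumed mapping)."""
    return (piece_count - 1) * NUM_BUCKETS // 32


def evaluate(accumulator, small_layers, piece_count):
    """Apply the selected bucket's small layer to the shared accumulator.

    accumulator  -- output of the shared big matrix, independent of the bucket
    small_layers -- one (weights, bias) pair per bucket; a toy stand-in for the
                    real stack of small dense layers
    """
    weights, bias = small_layers[bucket_index(piece_count)]
    return sum(w * a for w, a in zip(weights, accumulator)) + bias


# Toy data: bucket b uses weights [b, b] and a zero bias.
toy_layers = [([float(b), float(b)], 0.0) for b in range(NUM_BUCKETS)]
toy_accumulator = [1.0, 2.0]
score = evaluate(toy_accumulator, toy_layers, piece_count=13)  # uses bucket 3
```

With fully separate nets, `accumulator` itself would depend on the chosen net and would have to be recomputed whenever the selection changes, which is the refresh cost described above.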

vondele commented 3 years ago

I think the idea has merit and, as said, a few tests have already been conducted. I'll close this as an issue.

Some discussion on training networks is going on in the Discord server: https://discord.gg/bzB4Keey8D