official-stockfish / Stockfish

A free and strong UCI chess engine
https://stockfishchess.org/
GNU General Public License v3.0

Mate finding issue #5504

Open mstembera opened 1 month ago

mstembera commented 1 month ago

Describe the issue

In this CCC game https://www.chess.com/computer-chess-championship#event=ccc23-blitz-finals&game=385 Stockfish doesn't find the mate until move 168, whereas Torch finds it a full 15 ply earlier, on move 160.

Expected behavior

Consistent mate finding.

Steps to reproduce

Play through https://www.chess.com/computer-chess-championship#event=ccc23-blitz-finals&game=385

Anything else?

No response

Operating system

All

Stockfish version

dev-20240706-4d6e1225

peregrineshahin commented 1 month ago

I think unactionable issues like this are not ideal. The expected behavior of "Consistent mate finding" is also off and unrepresentative. I don't see the problem here: SF just needed more time to solve the mate. Needing a little more time is not an issue, and I guess the matetrack results are still consistent.

mstembera commented 1 month ago

You may well be right. I just wanted to bring it to the attention of people who understand mate finding, such as yourself. I'm happy to close it if others agree.

vondele commented 1 month ago

Things can always be improved, but let's summarize what we have as data and what we have done so far.

First, we have been tracking the mate-finding performance of SF for quite a while now. In a dedicated repo, we have the tools to see how well SF finds mates: https://github.com/vondele/matetrack?tab=readme-ov-file#track-the-evolution-of-stockfish-mate-finding-effectiveness While we are currently a bit below the best, this seems to be the result of various commits, not one in particular. The impact of each commit can be extracted from https://github.com/vondele/matetrack/blob/master/matetrack1000000.csv

In the development version, each mate announcement is now, as far as we know, correct. This hasn't always been the case in SF; one can consider that a fixed bug. As a new feature, each mate announcement (in completed iterations) now comes with a PV that ends in mate, so there are no cutoffs or the like in mating lines. I think that's pretty unique among top engines. I also think it is valuable: what's the value of a mate score if the user doesn't know it is correct, or if the way to mate is unknown? Similar guarantees are now in place for TB scores as well. Decisive TB scores guarantee a TB win, and the plies to entering the TB position (from the score) are correct. Given sufficient time, TB continuation lines to mate will be constructed.
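To make the "PV that ends in mate" guarantee concrete, here is a minimal sketch of the length check it implies. It assumes the UCI convention that `score mate n` counts moves rather than plies, with negative `n` when the side to move is being mated. The helper names are hypothetical and not part of Stockfish or matetrack, and the check covers only the PV length, not move legality or whether the final position is actually checkmate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: number of plies a complete mating PV must contain
// for a UCI "score mate <n>" value.
//   n > 0: side to move mates in n moves      -> 2n - 1 plies
//   n < 0: side to move is mated in |n| moves -> 2|n| plies
std::size_t expected_pv_plies(int mateInMoves) {
    assert(mateInMoves != 0);  // "mate 0" is not handled by this sketch
    return mateInMoves > 0 ? 2 * static_cast<std::size_t>(mateInMoves) - 1
                           : 2 * static_cast<std::size_t>(-mateInMoves);
}

// True if the PV is exactly long enough to reach the announced mate.
bool pv_length_matches_mate(int mateInMoves, const std::vector<std::string>& pv) {
    return pv.size() == expected_pv_plies(mateInMoves);
}
```

A PV that was cut short by a search cutoff would fail this check even if the mate score itself were correct.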

Of course, better matetrack scores are definitely something to keep an eye on, but they shouldn't get in the way of Elo gains. Ideally, we can learn which patches improve mate finding and turn them into patches that improve both mate finding and Elo.

What I'm not 100% certain of yet, is how search in TB regime is influenced by the presence of TB. This is nothing new, but I suspect that scoring TB wins might sometimes make mate finding less efficient (this could be what is happening in the game you posted). Needs some careful look at how TB scoring in search actually happens once we have entered TB already.

vdbergh commented 1 month ago

> What I'm not 100% certain of yet, is how search in TB regime is influenced by the presence of TB. This is nothing new, but I suspect that scoring TB wins might sometimes make mate finding less efficient (this could be what is happening in the game you posted). Needs some careful look at how TB scoring in search actually happens once we have entered TB already.

IIRC the TB scores are treated as bounds (e.g. a TBWin is a lower bound) instead of as exact scores. They may give cut-offs but they don't forcibly stop the search.
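As a rough illustration of "bounds, not exact scores", the sketch below shows how a TB result could interact with an alpha-beta window. All names are hypothetical, and the PV-node handling is an assumption based on my reading of the search code rather than a copy of it. A TB win only cuts off when it already reaches beta; otherwise the search continues and can still find a mate score above the TB value.

```cpp
#include <algorithm>
#include <optional>

// All names here are hypothetical and only illustrate the idea.
enum class Bound { Upper, Lower, Exact };

struct TbProbe {
    int   value;  // score derived from the TB result
    Bound bound;  // win -> Lower, loss -> Upper, draw -> Exact
};

struct Window {
    int alpha;
    int beta;
    int maxValue;  // cap applied to the node's final score
};

// Returns a score if the TB result alone decides the node (a cutoff).
// Otherwise it may tighten the window (assumed: PV nodes only) and
// returns nullopt so that the normal search continues.
std::optional<int> apply_tb_probe(const TbProbe& tb, bool pvNode, Window& w) {
    const bool cutoff = tb.bound == Bound::Exact
                     || (tb.bound == Bound::Lower ? tb.value >= w.beta
                                                  : tb.value <= w.alpha);
    if (cutoff)
        return tb.value;

    if (pvNode) {
        if (tb.bound == Bound::Lower)
            w.alpha = std::max(w.alpha, tb.value);  // at least a TB win
        else
            w.maxValue = tb.value;                  // at best a TB loss
    }
    return std::nullopt;
}
```

With a window such as `{-100, 100, 32000}`, a TB win cuts off immediately. With a wide window whose beta lies above the TB win value, the probe only raises alpha and the moves are still searched.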

peregrineshahin commented 1 month ago

> IIRC the TB scores are treated as bounds (e.g. a TBWin is a lower bound) instead of as exact scores. They may give cut-offs but they don't forcibly stop the search.

Well, in the game in question, the plies that didn't report mate reported +200, so we see TB at the root with a score of +200 for a couple of plies. This means the search wasn't influenced by TB hits, since we had stopped looking for them: when the root is in TB, we only sort the root moves, and the score is masked when printing the PV, not in search. So I can say with good confidence that, theoretically, the search was most likely similar to a search without TB. Practically, though, there is indeed a place that hurts mate finding in the presence of TB: TB entries are prioritized in the TT above mates through the depth + 6 boost, so a TT full of TB scores that don't get overwritten easily is a problem. https://github.com/official-stockfish/Stockfish/blob/a8401e803d37ec7dbf0650f4d79475214655477e/src/search.cpp#L691-L698
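The effect of the depth + 6 boost can be illustrated with a toy depth-preferred table slot. This is a deliberately simplified sketch, not Stockfish's actual replacement scheme (which also considers entry age, among other things). The names, the replacement rule, and the `kMaxPly` cap are assumptions made for illustration.

```cpp
#include <algorithm>

constexpr int kMaxPly       = 246;  // assumed cap, standing in for MAX_PLY
constexpr int kTbDepthBoost = 6;    // the "depth + 6" boost discussed above

struct Entry {
    int  depth  = -1;  // -1 marks an empty slot
    int  value  = 0;
    bool fromTb = false;
};

// Depth recorded in the table: TB-derived entries get the boost.
int stored_depth(int searchDepth, bool fromTb) {
    return fromTb ? std::min(kMaxPly - 1, searchDepth + kTbDepthBoost)
                  : searchDepth;
}

// Toy rule: overwrite the slot only if the new entry is at least as deep.
bool try_store(Entry& slot, int searchDepth, int value, bool fromTb) {
    const int d = stored_depth(searchDepth, fromTb);
    if (d < slot.depth)
        return false;  // existing entry is deeper, keep it
    slot = Entry{d, value, fromTb};
    return true;
}
```

Under this toy rule, a TB entry stored from a depth-10 search is recorded at depth 16, so a mate score found at depth 12 in the same slot is rejected, and only a search of depth 16 or more replaces it.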