ddugovic / Stockfish

Retired multi-variant fork of popular UCI chess engine; please use Fairy-Stockfish instead
https://github.com/ianfab/Fairy-Stockfish
GNU General Public License v3.0

Experiment with ideas from jhellis3/mate_finder #95

Closed Vinvin20 closed 7 years ago

Vinvin20 commented 7 years ago

https://fr.lichess.org/VZHy5KHR#15, Nf7? is mate in 9 after Qxf7+.

That's a case of too much selectivity from SF (or lack of extension when in danger).

All of White's moves are played with check, so it's strange that SF doesn't see the mate. Maybe raise the check extension depth?

ianfab commented 7 years ago

Checks are only extended if the SEE score is greater than or equal to zero. Removing this condition completely might be too much, but I think it could make sense to extend all checks that cannot be blocked (knight checks and contact checks) and hence are more forcing. I'll do some tests.
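
As a rough illustration of the distinction drawn above, here is a minimal Python sketch (not Stockfish code) that classifies a direct check as unblockable. The function name `is_unblockable_check`, the (file, rank) square representation and the piece letters are assumptions made for this example. It also assumes that the move is already known to give check and that the checking piece is the one named (discovered checks are ignored).

```python
from typing import Tuple

Square = Tuple[int, int]  # (file, rank), each in 0..7; an assumed representation


def is_unblockable_check(piece: str, checker_sq: Square, king_sq: Square) -> bool:
    """Return True if a direct check cannot be answered by interposing a piece.

    Hypothetical helper for illustration: `piece` is a letter such as "N" or "R",
    and the caller guarantees that the piece on `checker_sq` gives check.
    """
    if piece.upper() == "N":
        # A knight jumps, so there is no square between it and the king.
        return True
    file_dist = abs(checker_sq[0] - king_sq[0])
    rank_dist = abs(checker_sq[1] - king_sq[1])
    # Contact check: the checking piece stands on a square adjacent to the king.
    return max(file_dist, rank_dist) == 1
```

A knight check or an adjacent queen check returns True, while a rook check from the far end of a rank returns False because a piece (or, in crazyhouse, a drop) can be interposed.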

Vinvin20 commented 7 years ago

Yes, good idea !

isaacl commented 7 years ago

@ianfab How do you test stockfish strength? Does stockfish have capability to import top games to tune analysis? I'm interested in helping out.

ianfab commented 7 years ago

@isaacl: I use a python script that uses (an outdated version of) python-chess to communicate with Stockfish and test patches in self play. Move legality is not tested for, so I can let it play all variants assuming that Stockfish's move generation is working properly. It might not be optimal, but it is working well so far considering the strength improvements in most variants.

For tuning I use a modified version of Stockfish's SPSA tuner. It does a great job, since all piece values, i.e. also Crazyhouse piece values, have been tuned starting from values that were far from the optimum (in most cases the piece values of standard chess).

isaacl commented 7 years ago

very cool, thanks

ianfab commented 7 years ago

@Vinvin20 Both versions (extend all checks, or only unblockable checks) did not pass.

Vinvin20 commented 7 years ago

Probably not enough depth in other lines :-/ Please publish the final results (score in %).

ianfab commented 7 years ago

Extend checks even if SEE score is negative: LLR: -2.99 (-2.94,2.94) [0.00,20.00] Total: 1490 W: 709 L: 716 D: 65

Only extend checks with negative SEE if it is a knight or contact check: LLR: -2.98 (-2.94,2.94) [0.00,20.00] Total: 1332 W: 626 L: 637 D: 69

No regressions, but no improvements either.
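
The bounds (-2.94, 2.94) printed next to each LLR are consistent with the usual Wald thresholds of a sequential probability ratio test with 5% error rates. The sketch below shows where those numbers come from; `alpha = beta = 0.05` is an assumption, since the thread does not state the error rates, and the function names are made up for this example.

```python
import math


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald SPRT thresholds for the log-likelihood ratio (LLR).

    alpha and beta are the assumed type I and type II error rates.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_decision(llr: float, alpha: float = 0.05, beta: float = 0.05) -> str:
    """Classify an LLR value as 'fail', 'pass' or 'continue'."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr <= lower:
        return "fail"      # the patch is rejected
    if llr >= upper:
        return "pass"      # the patch is accepted
    return "continue"      # not enough games yet


if __name__ == "__main__":
    print(sprt_bounds())
    print(sprt_decision(-2.99), sprt_decision(-2.98))
```

Under these assumptions both reported LLR values lie below the lower threshold, which matches the statement that neither version passed.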

Vinvin20 commented 7 years ago

[0,20] is enormous; that can lead to false conclusions. From a rating point of view: 1) 400/1490 x (709-716) = -1.9 points 2) 400/1332 x (626-637) = -3.3 points

Giving up so few points in order to not lose against opponents rated 300 Elo lower is OK for me, because it plays more solid chess.
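
For reference, the rule of thumb used in this comment can be compared with the usual logistic Elo formula. The following is a minimal sketch; the function names are made up for this example, and the logistic model is the standard Elo expectation formula, not something taken from the test script discussed in this thread.

```python
import math


def elo_linear(wins: int, losses: int, draws: int) -> float:
    """Rule of thumb from the comment above: 400 * (W - L) / N."""
    games = wins + losses + draws
    return 400.0 * (wins - losses) / games


def elo_logistic(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by the score fraction under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    for w, l, d in [(709, 716, 65), (626, 637, 69)]:
        print(w, l, d, round(elo_linear(w, l, d), 1), round(elo_logistic(w, l, d), 1))
```

Near a 50% score the logistic estimate is somewhat smaller in magnitude than the rule of thumb, so both agree that the measured difference is only a few Elo points. Neither figure includes an error margin, which for samples of this size is much larger than the difference itself.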

ianfab commented 7 years ago

I do not think it is a good idea to apply changes to the search without any evidence that they are an improvement (or at least a simplification). Furthermore, it might well be that this oversight was caused by the hash collision issue (which has already been fixed), since I cannot reproduce it with the most recent version.

Vinvin20 commented 7 years ago

OK, good news !

Vinvin20 commented 7 years ago

With the new version, another forced mate with unblockable checks (fortunately the human didn't see it through to the conclusion). Look at the analysis graph: https://fr.lichess.org/a4XGGUAO#76

40. P@a5+ Ka7 41. P@b6+ Bxb6 42. axb6+ Kxb6 43. Nxd5+ Kc6 44. Rf6+ R@e6 45. Rxe6+ B@d6 46. Rxd6+ Kb5 #2

This kind of desperation attack by a human is a weakness of SF-zh from my point of view ... ("unblockable" or "blockable by an undefended piece" should have the same definition here).

Vinvin20 commented 7 years ago

One more forced mate with unblockable checks : https://fr.lichess.org/04lW53Qd#70

36. d7+ Rxd7 37. N@f6+ gxf6 38. N@g7+ Kd8 39. Rxd7+ Bxd7 40. P@c7+ mate in 4

Vinvin20 commented 7 years ago

One more human mate with only unblockable checks : https://fr.lichess.org/9RZpLOfK#61

ianfab commented 7 years ago

I think that this simply is a problem arising with the large branching factor and it is not that surprising that it does not find a mate in 10+. However, if you encounter positions where it does not find a mate in 5 or the like, I would be very interested in these.

Vinvin20 commented 7 years ago

I still think it would be better to extend more lines with unblockable checks. It's clearly the area where humans score the greatest number of victories against SF level 8. I promise you: a better score against humans, a more stable score from move to move and nicer fireworks during games!

isaacl commented 7 years ago

In my experience, blockable checks are often better play. So not sure it makes sense to specifically look for unblockable lines.

Vinvin20 commented 7 years ago

Unblockable checks are, for humans, easy to detect, repeatable and lead to a lost game. That's called a "weak spot" and when a human finds a weak spot, he can exploit it over and over again (and better and better).

ddugovic commented 7 years ago

Stockfish already has a mechanism for detecting repeatable threats.

Vinvin20 commented 7 years ago

The killer-move system tries some moves before others but doesn't go deeper to find forced mates as humans do.

ddugovic commented 7 years ago

Stockfish already has a mechanism for going deeper to find forced mates like humans do.

Vinvin20 commented 7 years ago

One game that shows there's trouble with eval or extensions (the score should be more stable than shown here): https://fr.lichess.org/wyIj4qQX/black#97

Vinvin20 commented 7 years ago

ddugovic commented a day ago Stockfish already has a mechanism for going deeper to find forced mates like humans do.

BTW Stockfish underestimates many winning sacrifices in regular chess. That's even more annoying in Crazyhouse because there are even more winning sacrifices!

Vinvin20 commented 7 years ago

One more yesterday: https://fr.lichess.org/xwhYnIKo#49 And it gives a wrong analysis: SF suggests that 25... P@g2 is way better, but it's mate with the same threats because of unblockable checks (not managed by SF).

ddugovic commented 7 years ago

IMHO that's an excellent example of the horizon effect and possibly reason to tune search extension - possibly in response to a drop blocking a check.

ddugovic commented 7 years ago

In theory it could be cause to make a smarter transposition table in terms of lower bounds since 25... P@g5 26. Rxg5+ and 25... P@g2 26. Rxg2+ P@g5 27. Rxg5+ are the same position.

Vinvin20 commented 7 years ago

Same position on the board but not in hand. But when there are unblockable checks, the hand of the defender has no effect.

ddugovic commented 7 years ago

Even if the checks can be blocked, transpositions where the defender has worse pieces in hand must be equal or worse.

Vinvin20 commented 7 years ago

I don't know of any hash tables that work with only a "part" of a position. Extending more checks seems easier. Or reducing more when the position has fewer checks. Or maybe it's impossible to exceed the current top human level at 0.5 seconds per move ...

isaacl commented 7 years ago

transpositions where defender has worse pieces in hand must be equal or worse.

@ddugovic Do you mean fewer of the same piece types? Hard to make characterizations about different pieces in hand otherwise, it's too positionally dependent. Queen in hand is usually better than pawn in hand, for example, but not always.

ddugovic commented 7 years ago

Yes, I meant fewer pieces of the same piece types (and now I see that there are some positions where you need your opponent to block using a specific piece so you can capture it then drop it).
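
The "fewer pieces of the same piece types" relation can be written down as a simple comparison of piece counts. This is a minimal sketch, not code from this fork; the function name and the representation of a hand as a string of piece letters are assumptions for illustration. As the caveat in the comment above indicates, such a comparison would only be a heuristic.

```python
from collections import Counter


def hand_is_no_better(hand_a: str, hand_b: str) -> bool:
    """Return True if hand_a holds at most as many of every piece type as hand_b.

    Hands are strings of piece letters such as "NPP" (an assumed representation).
    Different piece types are never compared with each other, so a queen is not
    treated as "at least as good as" a pawn.
    """
    a = Counter(hand_a.upper())
    b = Counter(hand_b.upper())
    return all(count <= b[piece] for piece, count in a.items())
```

With this definition, `hand_is_no_better("NP", "NPP")` holds, while `hand_is_no_better("Q", "P")` does not, which reflects the point that different piece types in hand cannot be ranked without looking at the position.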

Vinvin20 commented 7 years ago

I just remembered something about mate threats: there's a fork of Stockfish that is way smarter at finding mates: http://www.talkchess.com/forum/viewtopic.php?t=61932 If you could apply these changes to the current SF-zh, that would probably solve a lot of forced checkmates! https://github.com/jhellis3/Stockfish/tree/mate_finder

ddugovic commented 7 years ago

That's a separate issue and I'm seriously considering it although it will weaken Stockfish.

stockfishdeveloper commented 7 years ago

Mmmmm... I'm not sure. SF Matefinder is usually about 40-50 ELO weaker than regular SF. But it has been optimized for finding tactical shots and long mates. I'm going to examine the diff from Matefinder SF to regular SF and try to extract the ideas it uses. I'm guessing that it's going to be worth maybe 100 ELO at crazyhouse.

ianfab commented 7 years ago

In my experience Stockfish heavily relies on aggressive pruning in crazyhouse, so I am not sure whether finding some mates a little bit earlier compensates for the increase of the branching factor. But I am more than happy if someone proves me wrong and finds an improvement with these ideas.

Vinvin20 commented 7 years ago

I'm guessing that it's going to be worth maybe 100 ELO at Crazyhouse.

I hope so. It's a completely different world: in Crazyhouse, top amateurs are currently 100 to 200 points (not enough results with long time control) over SF-zh, and in regular chess, SF is 300 to 400 over top professionals.

Vinvin20 commented 7 years ago

I'm going to examine the diff from Matefinder SF to regular SF and try to extract the ideas it uses.

Please keep me posted when you have some results against computers and some analyses of tactical games. Results against strong humans would be the most useful, but they're not simple to get.

stockfishdeveloper commented 7 years ago

Ok. I'm working on it.

Vinvin20 commented 7 years ago

Another simple mate in 4 overlooked : 16.P@e7?? Nh3!! :-( https://fr.lichess.org/X9YlDNv2/black#31

ddugovic commented 7 years ago

I cannot duplicate that blunder from a command line, and certainly during postmortem analysis Stockfish has no trouble finding it:

setoption name UCI_Variant value crazyhouse
position fen 3r1rk1/ppp4p/3p1n2/7b/4P3/8/PP3PPP/RNB2RK1[QBBPPnnqppp] w - - 30 16
go movetime 800 depth 21

Vinvin20 commented 7 years ago

The local engine keeps P@e7? until depth=9 and stays at depth=9 for a long time.

ddugovic commented 7 years ago

"Local Stockfish" is coded in JavaScript...

This depth=9 test instantly rejects P@e7:

setoption name UCI_Variant value crazyhouse
position fen 3r1rk1/ppp4p/3p1n2/7b/4P3/8/PP3PPP/RNB2RK1[QBBPPnnqppp] w - - 30 16
go movetime 800 depth 9
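
The FEN used in these commands carries the pieces in hand in square brackets after the board field, with uppercase letters for White's hand and lowercase letters for Black's. A minimal Python sketch for splitting that field follows; the function name `split_pocket` is made up for this example.

```python
def split_pocket(fen: str):
    """Split the bracketed crazyhouse pocket of a FEN into (white_hand, black_hand).

    Returns two sorted strings of piece letters; both are empty if the board
    field has no [..] suffix.
    """
    board_field = fen.split()[0]
    start = board_field.find("[")
    if start == -1 or not board_field.endswith("]"):
        return "", ""
    pocket = board_field[start + 1:-1]
    white = "".join(sorted(c for c in pocket if c.isupper()))
    black = "".join(sorted(c for c in pocket if c.islower()))
    return white, black


if __name__ == "__main__":
    fen = "3r1rk1/ppp4p/3p1n2/7b/4P3/8/PP3PPP/RNB2RK1[QBBPPnnqppp] w - - 30 16"
    print(split_pocket(fen))
```

For the position above this gives White a queen, two bishops and two pawns in hand, and Black two knights, a queen and three pawns.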

Vinvin20 commented 7 years ago

"Local Stockfish" is coded in JavaScript...

I suppose there's the same logic (search + eval) in the code.

Vinvin20 commented 7 years ago

In Chrome, the JS engine plays B@c4 at depth 8 and in Firefox, the JS engine plays P@e7 at depth 8

There's a bug somewhere !

ddugovic commented 7 years ago

Indeed, I suppose someone ought to enter an issue into Mozilla's tracker...

In all seriousness, "Local Stockfish" isn't Stockfish and I have no control over it.

Vinvin20 commented 7 years ago

Who "compiles" it to JavaScript?

ianfab commented 7 years ago

@Vinvin20 https://github.com/niklasf/stockfish.js

isaacl commented 7 years ago

cc @niklasf

"Local Stockfish" should have similar behavior to trunk. On Chrome it's a PNaCl build; the Emscripten version is for other browsers.

The likely discrepancies:

It's possible to set breakpoints in Chrome to monitor communication between Stockfish and the browser. That's the most accurate info for debugging a miseval.

FWIW I don't see the reported issue; on my Chrome I get N@h3.

Vinvin20 commented 7 years ago

The latest version (2016-11-30) seems OK under Firefox now.

ddugovic commented 7 years ago

Here is a difficult test position (mate in 30 plies):

setoption name UCI_Variant value crazyhouse
setoption name Threads value 4
position fen r5k1/pppqbrp1/2n3Bp/3p1n1p/4p3/1PN1P2B/P1PP2PP/R1B2RK1[NPq] b - - 33 17
go infinite
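
When running a test position like this one, the engine reports its progress in UCI `info` lines, where `score mate N` gives a mate distance in moves (negative if the engine is being mated) and `score cp N` gives a centipawn evaluation. The sketch below extracts that score from a single line; the function name and the sample line are made up for illustration and are not actual engine output for this position.

```python
def parse_uci_score(info_line: str):
    """Extract ("cp" | "mate", value) from a UCI info line, or None if absent."""
    tokens = info_line.split()
    if not tokens or tokens[0] != "info" or "score" not in tokens:
        return None
    i = tokens.index("score")
    if i + 2 >= len(tokens):
        return None  # line is truncated after "score"
    kind, value = tokens[i + 1], tokens[i + 2]
    if kind not in ("cp", "mate"):
        return None
    try:
        return kind, int(value)
    except ValueError:
        return None


if __name__ == "__main__":
    # Hypothetical sample line, not real output for the position above.
    sample = "info depth 12 seldepth 20 score mate 5 nodes 123456 pv N@h3"
    print(parse_uci_score(sample))
```

A script that watches for the first line where the result is `("mate", ...)` could be used to compare how quickly different builds find the mate.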