Closed isaacl closed 7 years ago
Reproduced (Stockfish discovers the mate around depth=17, between time=30000 and time=40000 on my PC):
info depth 17 seldepth 29 multipv 1 score mate 8 nodes 65299668 nps 1534404 hashfull 999 tbhits 0 time 42557 pv e7h4 b3f7 f8f7 N@f6 h4f6 N@e7 d8e7 h5f6 e7f6 N@e7 f6e7 B@f6 d2f4 f6g5 e7g5
Black mates in 8 starting with a quiet move in a position where its king is exposed and White has 6 pieces in hand.
Wow...it's incredible that Stockfish can see that given the nature of the position!
It's funny that #95 improperly implemented could make it more difficult to solve positions like this one involving one or more quiet moves (although technically 1... Bxh4 is a capture).
But it's not an issue. f7b3 wins also (mate in 10 or 11 - not sure now). Finding a longer but easier win first is normal.
It's funny that #95 improperly implemented could make it more difficult to solve positions like this one involving multiple quiet moves.
Well, you know that Stockfish Matefinder is weaker. Why wouldn't matefinding hurt strength for crazyhouse as well?
Good question, and we don't know the answer to that yet. Maybe somehow crazyhouse chess is special and/or some aspects of what jhellis3 developed can be made useful in crazyhouse?
To be honest I was tempted to immediately close #95 but inevitably someone will open an issue just like it.
Note that the mate is only that long because of futile checks by white. When white runs out of checks and pieces in hand, he's mated in a few moves. If SF has to play that on board, it will find the mating moves. Before that, calculating the mate is a waste of time; confidence that it's there is enough. So I'd say that this is definitely a non-issue.
Unless you want a special matefinding mode. But that's discussed in #95.
Oh, that's an interesting point: this position is not the best possible test position, since even if SF doesn't see the mate it can still manage to win (regardless of the evaluation).
Honestly a special matefinding mode could be interesting (and maybe I should maintain a mate_finder branch).
Well, you know that Stockfish Matefinder is weaker. Why wouldn't matefinding hurt strength for crazyhouse as well?
Probably because of the nature of the games :
In chess, you have to play positionally, trying to improve your position until it is better than your opponent's, then win a pawn (or more), then win the endgame with the small advantage.
Crazyhouse is way more tactical, and a forced mate in 20 can appear when both kings are under heavy attack (a pawn or a piece more can be irrelevant) and finding the mate variation is the key to win the game.
For what it's worth, I just created a mate_finder branch.
finding the mate variation is the key to win the game.
No. If you can GUESS which variations win or lose, you don't need to calculate. Example: If the only check evasions you have just lose material for nothing, the position is in all likelihood lost. If remaining depth is low, it should just be considered lost without further consideration.
Sometimes yes, sometimes no.
There's no doubt that SF-Matefinder is stronger than Regular-SF in tactical positions, based on my test: http://www.talkchess.com/forum/viewtopic.php?p=673942#673942
Over 2 runs, SF-MateFinder found 96 solutions at least once! Over 3 runs, Stockfish_160520_x64_modern_fast scored only 79!
And crazyhouse is more tactical than chess !
@Vinvin20 You should not confuse finding the best move with playing strength. If you want to measure playing strength in tactical positions, you should play out these positions with the two versions. The version that has the better score can then be considered to be better in tactical positions. An engine does not necessarily have to find the best move as long as it finds a winning move.
I understand that if you use an engine for analysis you want it to find such checkmate combinations, but this is not essential for a strong engine. However, it would of course be nice.
It's a tactical set, so, no doubt the one who finds the solution will win the game.
Sorry, I haven't looked at the positions. If the positions are either completely won or lost depending on one move, you are right. However, I would be more interested in seeing results for positions where you have a winning combination, but if you do not see it, the position still is about equal, because this is what usually is the case in real games and hence would, in my opinion, be a better indicator for playing strength.
It's a tactical set, so, no doubt the one who finds the solution will win the game.
That implies there is only one solution. OTB it is common for a player to pass up a mate in N yet play quite strongly.
In crazyhouse there are almost no "equal positions". Eval is often chaotic and you have to take your chance when you have an attack ...
Game theoretically it might well be that there are relatively few positions in crazyhouse that are a draw. However, we are so far away from perfect play in crazyhouse, that, in my opinion, it is pointless to talk about game theoretical values. Of course, there is no doubt that there is a lot of room for improvements on pruning, extensions, etc. in crazyhouse, but I am not entirely convinced that the methods of the matefinder are the right path for crazyhouse unless there are convincing test results.
I too am deeply skeptical that the matefinder ideas will improve performance in anything other than extremely difficult puzzles involving quiet moves. My test results with mate_finder show that it is slower.
I hope you are right ;-)
The reasons I posted this:
@isaacl Yes, pieces in hand have slightly different piece values. However, your comment gave me the idea of evaluating pieces in hand based on the board position, e.g. adding a bonus for pawns in hand if there are empty squares on the 7th rank, if pieces in hand can be dropped with check, etc.
@ianfab Please post your new code or a description - I don't want to test conflicting changes.
However, your comment gave me the idea of evaluating pieces in hand based on the board position, e.g. adding a bonus for pawns in hand if there are empty squares on the 7th rank, if pieces in hand can be dropped with check, etc.
Well, king safety evaluation in mainline Stockfish counts safe checks possible; but it ignores drops. Perhaps accounting for drops but not otherwise changing the logic would be a good idea?
@sf-x I currently do not write or test any new code, because I am busy with creating a version of Stockfish that generates EPD opening books for testing.
I had already tested this idea a few days ago, but it failed (I can search for the results later if you are interested), probably because it overvalues pieces in hand, since they usually have many possible checks. However, when I wrote it, I did not really think of extending this idea to other parts of the evaluation. It might be worth thinking about adding a function similar to evaluate_pieces for pieces in hand.
I can search for the results later if you are interested
Yes please.
probably because it overvalues pieces in hand, since they usually have many possible checks.
Another possible reason is time control being too short to calculate out the attacks.
I have found the results:
LLR: -2.99 (-2.94,2.94) [0.00,20.00]
Total: 656 W: 292 L: 322 D: 42
Well, king safety evaluation in mainline Stockfish counts safe checks possible; but it ignores drops. Perhaps accounting for drops but not otherwise changing the logic would be a good idea?
Maybe penalize king safety based on drops on safe squares; for example in this position at the end of a forcing variation, Black safely plays N@f3+ and White is hopelessly doomed.
Safe could mean any of:
Adding this mate in 15 puzzle here since #95 is resolved (I created a mate_finder branch):
setoption name UCI_Variant value crazyhouse
setoption name Threads value 4
position fen r5k1/pppqbrp1/2n3Bp/3p1n1p/4p3/1PN1P2B/P1PP2PP/R1B2RK1[NPq] b - - 33 17
go infinite
Hi, just wanted to comment that at least one aspect of MF probably hurts much worse than normal in zh, and that is the TT changes. You only get 2/3rds the TT entries when using the full key, and due to zh having a naturally higher branching factor, there is probably much more hash pressure for any given depth. If you have more hash than you can use in the average move time, the full keys can actually gain 1-2 Elo, but as soon as you get hash pressure, performance and Elo drop considerably. Not pruning in low material situations (in futility and null pruning) is also probably pretty useless in zh, since it is extremely unlikely for a pawn race to decide a zh game.
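To make the "2/3rds" figure concrete, here is a rough sketch of the arithmetic. The field layout and the 32-byte cluster size are assumptions chosen for illustration (the struct and function names are hypothetical, not taken from either code base): an entry that keeps only 16 bits of the key comes to 10 bytes, while one that keeps the full 64-bit key comes to 16 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed entry layout that stores only 16 bits of the position key.
struct ShortKeyEntry {
    std::uint16_t key16;
    std::uint16_t move16;
    std::int16_t  value16;
    std::int16_t  eval16;
    std::uint8_t  genBound8;
    std::int8_t   depth8;
};

// Hypothetical variant that stores the full 64-bit key instead.
struct FullKeyEntry {
    std::uint64_t key64;
    std::uint16_t move16;
    std::int16_t  value16;
    std::int16_t  eval16;
    std::uint8_t  genBound8;
    std::int8_t   depth8;
};

// Assumed cluster size in bytes (half of a typical 64-byte cache line).
constexpr std::size_t ClusterBytes = 32;

// Number of whole entries that fit into one cluster.
constexpr std::size_t entries_per_cluster(std::size_t entryBytes) {
    return ClusterBytes / entryBytes;
}

static_assert(sizeof(ShortKeyEntry) == 10, "short-key entry should be 10 bytes");
static_assert(sizeof(FullKeyEntry) == 16, "full-key entry should be 16 bytes");
```

Under these assumptions entries_per_cluster(sizeof(ShortKeyEntry)) is 3 and entries_per_cluster(sizeof(FullKeyEntry)) is 2, so the same amount of hash holds two thirds as many positions.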
Thanks @jhellis3 ! When reading the code I was thinking the same thing but merging the changes minus the TT changes seemed messy. I figured I would merge all the changes to ddugovic:Stockfish/mate_finder and regression test before reverting anything or making more branches.
3 more games I found while browsing longest games from my recent match 1core vs 6cores. That confirms my thinking that SF should have a better mate finder algorithm :
At move 77: https://fr.lichess.org/p36l0TmZ#154
Around move 65: https://fr.lichess.org/rWTzvckw#129
Around move 66: https://fr.lichess.org/v1SjNvzV#132
Having the test positions is helpful. Let's not jump to conclusions about whether the algorithm, parameters, or something else is the issue.
A condition like this one may help SF-zh in mating nets: "No LMR if in danger of getting mated (like pruning)" https://github.com/locutus2/Stockfish/compare/a47bbca...ba6bf2b
Maybe. Someone would need to test it following directions in #149.
The test is already running. So far the results are not very promising, but I will wait for the test to finish and then post the final results.
@Vinvin20 Here are the results for the patch you mentioned:
LLR: -3.06 (-2.94,2.94) [0.00,20.00]
Total: 1024 W: 483 L: 505 D: 36
Thanks for the test ! But disappointed by the results ...
Severe changes (such as "always do X" or "never do X") tend to have severe effects. The result is unfortunate but not surprising.
It's a severe change but for a small part of the tree (where there are mates).
I have found you don't really need to worry about LMR when it comes to seeing actual mates. Where LMR changes can make a difference is on the path towards the mate if that path involves sacrificing material. The most significant mate detection change in MateFinder is the null move criteria alteration (the legal, and available king moves check). I would suggest testing that alone; if any part of MateFinder may be beneficial to zh, I would expect that to be it.
Thanks, I will test that alone now that I'm aware of what to isolate and test.
Hm... I'm looking at http://github.com/jhellis3/Stockfish/blame/9914d9ccf7cb57b63c81d897334dae6a8178c6c5/src/search.cpp and a bit confused where the LMR change for "legal & available king moves check" is done. Are we talking about Step 8?
// Step 8. Null move search with verification search (is omitted in PV nodes)
There is no LMR change I would recommend. The most valuable change is the condition in Step 8, for which you will also need the changes to movegen.h.
Sadly Step 8 alone doesn't perform: make profile-build hangs. On current master I tried simplifying this to see if I can get make profile-build to complete (although make build works, search produces no output):
EDIT: Adding the missing parentheses allows make profile-build to complete, but search in crazyhouse produces no output.
// Step 8. Null move search with verification search (is omitted in PV nodes)
...
if ( !PvNode
&& eval >= beta
&& (ss->staticEval >= beta - 35 * (depth / ONE_PLY - 6) || depth >= 13 * ONE_PLY)
#ifdef CRAZYHOUSE
&& (pos.is_house() ? (abs(eval) < 2 * VALUE_KNOWN_WIN
&& !(depth > 4 * ONE_PLY && (inCheck ? MoveList<LEGAL>(pos).size() < 6 : MoveList<LEGAL, KING>(pos).size() < 1))) :
pos.non_pawn_material(pos.side_to_move())))
#else
&& pos.non_pawn_material(pos.side_to_move()))
#endif
{
@ddugovic It probably fails because piece_on(from_sq(move)) does not work for piece drops. I think it should work if you replace it by moved_piece(move).
Why is inCheck ? there? That is not in my code. I can't really comment on something which is not my code and where I cannot see a diff.
Thanks, that's much better. I am now testing with:
EDIT: Fixed typo below. Damn, writing code is difficult.
// Step 8. Null move search with verification search (is omitted in PV nodes)
...
if ( !PvNode
&& eval >= beta
&& (ss->staticEval >= beta - 35 * (depth / ONE_PLY - 6) || depth >= 13 * ONE_PLY)
#ifdef CRAZYHOUSE
&& (pos.is_house() ? (eval < 2 * VALUE_KNOWN_WIN
&& !(depth > 4 * ONE_PLY && (MoveList<LEGAL, KING>(pos).size() < 1 || MoveList<LEGAL>(pos).size() < 6))) :
pos.non_pawn_material(pos.side_to_move())))
#else
&& pos.non_pawn_material(pos.side_to_move()))
#endif
{
In desperation I added inCheck hoping that less code would be executed. I have reverted that change, although honestly in crazyhouse if MoveList<LEGAL>(pos).size() < 6 is true and inCheck is false, then very likely pos.non_pawn_material(pos.side_to_move()) is zero and the player's position is a disaster.
I removed the pos.non_pawn_material(pos.side_to_move()) check because in positions of interest, both players have at least 1 piece.
Unfortunately this change does not scale. Next I shall try replacing MoveList<LEGAL, KING>(pos).size() < 1 || MoveList<LEGAL>(pos).size() < 6 with MoveList<LEGAL, KING>(pos).size() < 1 || (inCheck && MoveList<LEGAL>(pos).size() < 6) since some positions have over 100 legal moves and legal move generation is expensive:
==> test20161221-1.out <==
Score of PATCH vs Stockfish 211216 64 BMI2: 248 - 233 - 19 [0.515] 500
Elo difference: 10.43 +/- 29.91
==> test20161221-10.out <==
Score of PATCH vs Stockfish 211216 64 BMI2: 249 - 227 - 24 [0.522] 500
Elo difference: 15.30 +/- 29.77
==> test20161221-30.out <==
Score of PATCH vs Stockfish 211216 64 BMI2: 224 - 249 - 27 [0.475] 500
Elo difference: -17.39 +/- 29.68
==> test20161221-60.out <==
Score of PATCH vs Stockfish 211216 64 BMI2: 87 - 109 - 4 [0.445] 200
Elo difference: -38.37 +/- 48.20
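For anyone who wants to check these numbers by hand, the Elo figures follow from the score fraction via the usual logistic formula. A minimal sketch (the function names are mine, and the +/- error margins are not reproduced here):

```cpp
#include <cassert>
#include <cmath>

// Score fraction of a match: a win counts 1, a draw counts 1/2.
double score_fraction(int wins, int losses, int draws) {
    const int games = wins + losses + draws;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Elo difference implied by a score fraction strictly between 0 and 1.
double elo_difference(double score) {
    assert(score > 0.0 && score < 1.0);
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

For the first run, score_fraction(248, 233, 19) is 257.5 / 500 = 0.515, and elo_difference(0.515) comes out at about 10.43, matching the reported value. The last run (87 - 109 - 4) gives 0.445 and about -38.37.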
AFAIK, inCheck should always be false (due to step 5), so you are effectively removing the criteria, which can be reduced down to just MoveList<LEGAL, KING>(pos).size() < 1.
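The reduction can be spelled out with a small stand-alone sketch. The two functions below are hypothetical stand-ins, with plain integers in place of MoveList<LEGAL, KING>(pos).size() and MoveList<LEGAL>(pos).size(); this is not engine code.

```cpp
// Proposed condition: no legal king moves, or in check with fewer than 6 legal moves.
constexpr bool proposed_condition(bool inCheck, int kingMoves, int legalMoves) {
    return kingMoves < 1 || (inCheck && legalMoves < 6);
}

// What remains once inCheck is known to be false at this point in the search.
constexpr bool reduced_condition(int kingMoves) {
    return kingMoves < 1;
}

// With inCheck == false the total legal move count no longer matters.
static_assert(proposed_condition(false, 0, 40) == reduced_condition(0), "no king moves");
static_assert(proposed_condition(false, 3, 2) == reduced_condition(3), "few legal moves, not in check");
```

If inCheck really is always false when Step 8 runs, the second operand never fires, so the full legal move list never needs to be generated for this test.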
At low depths and move times it evals the position as completely lost with f7b3; at higher depth it finds a #8 with Bxh4.