ddugovic / Stockfish

Retired multi-variant fork of popular UCI chess engine; please use Fairy-Stockfish instead
https://github.com/ianfab/Fairy-Stockfish
GNU General Public License v3.0

Compare playing strength with winboard engines #88

Closed ianfab closed 7 years ago

ianfab commented 7 years ago

Is there an easy way to let Stockfish play chess variants against winboard engines to be able to compare the playing strength? This would be especially interesting for crazyhouse and antichess.

Or are there any other UCI engines that are able to play variants?

stockfishdeveloper commented 7 years ago

Yes, there's a program that allows UCI and Winboard engines to play together. See http://wbec-ridderkerk.nl/html/details1/PolyGlot.html

ddugovic commented 7 years ago

Forever ago I had an environment set up for this sort of testing. man xboard explains how to invoke WinBoard/xboard using PolyGlot as an engine (PolyGlot is in turn configured to select an engine). You can then play it against a different engine or, with FICS permission, create a computer account and play against FICS opponents.

ianfab commented 7 years ago

Thanks for your answers. I can let stockfish play normal chess against other engines with the xboard GUI via polyglot, but it does not allow stockfish to play other variants. Is there a way to make it think stockfish is able to play variants or to disable this check?

stockfishdeveloper commented 7 years ago

Well, set up Winboard for a variant. Then set Dan's version of Stockfish's UCI parameter to the variant of your choice. But I'm assuming this "variant Stockfish" has that UCI option for variants. Now that I think about it, it's a #define that determines what variant the compile plays, right?

ddugovic commented 7 years ago

I think there are five solutions:

  1. Per this comment apparently there's a way for the WinBoard/xboard UI to specify a variant. Ultimately it seems necessary to specify the variant somehow since WinBoard/xboard also enforces legal moves. I don't recall how to specify to WinBoard what variant to play.
  2. WinBoard/xboard has numerous command line options. You're probably looking for initString.
  3. I think polyglot.ini also allows for engine options, and/or they can be specified in the engine command.
  4. Worst case, wrap the engine using a Python, Ruby, or other script which upon receiving a position command can set the variant command to the engine, followed by the position command; or somehow code/compile a custom Stockfish which plays a variant by default.
  5. Consider using a different UI such as python-chess or Arena Chess GUI, although I haven't used these in an equally long time.
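The wrapper idea in solution 4 can be sketched as a small filter that sits between the GUI and the engine. This is only a minimal sketch: the function name `inject_variant` is hypothetical, and the value `crazyhouse` for `UCI_Variant` is an assumption about what the engine accepts.

```python
from typing import Iterable, Iterator


def inject_variant(commands: Iterable[str], variant: str = "crazyhouse") -> Iterator[str]:
    """Yield GUI-to-engine UCI commands, setting UCI_Variant before each position.

    `variant` is an assumed option value; check the engine's `uci` output
    for the values it really accepts.
    """
    for line in commands:
        command = line.strip()
        if not command:
            continue  # skip blank lines from the GUI
        if command.split(" ", 1)[0] == "position":
            # Send the variant first, then the position command itself.
            yield f"setoption name UCI_Variant value {variant}"
        yield command


if __name__ == "__main__":
    gui_commands = ["uci", "isready", "position startpos moves e2e4", "go movetime 100"]
    for out in inject_variant(gui_commands):
        print(out)
```

A real wrapper would read these lines from stdin and forward them to the engine's subprocess; that plumbing is left out here.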

> Now that I think about it, it's a #define that determines what variant the compile plays, right?

@stockfishdeveloper No, see this closed issue for how to specify what variant the engine plays.

ianfab commented 7 years ago

I should have been more precise about the problem. It is no problem at all to set the UCI_Variant option. The problem is that polyglot sends the available variants to xboard (in response to protover 2), but only includes normal and chess960. Therefore, xboard does not recognize that stockfish is able to play variants and does not allow starting a variant game with stockfish.
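To make the missing piece concrete, here is a rough sketch of what an adapter would have to do: read the values of the engine's UCI_Variant combo option and advertise them in the xboard `feature variants` reply. The mapping table and the example option line are assumptions for illustration, not what polyglot or this fork actually emits.

```python
from typing import List

# Hypothetical mapping from UCI_Variant values to xboard variant names.
UCI_TO_XBOARD = {
    "chess": "normal",
    "crazyhouse": "crazyhouse",
    "atomic": "atomic",
}


def combo_values(option_line: str) -> List[str]:
    """Return the values that follow each `var` token of a UCI combo option line."""
    tokens = option_line.split()
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "var"]


def xboard_variants_feature(option_line: str) -> str:
    """Build an xboard `feature variants="..."` line from a UCI_Variant option line."""
    names = [UCI_TO_XBOARD[v] for v in combo_values(option_line) if v in UCI_TO_XBOARD]
    if "normal" not in names:
        names.insert(0, "normal")  # always advertise standard chess
    return 'feature variants="' + ",".join(names) + '"'


if __name__ == "__main__":
    # Assumed example of what the engine might print in response to `uci`.
    line = "option name UCI_Variant type combo default chess var chess var crazyhouse var atomic"
    print(xboard_variants_feature(line))
```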

I already use (an outdated and modified version of) python-chess to test stockfish in self play. However, as far as I know python-chess does not support the xboard/winboard protocol. I might try Arena even though I am not very familiar with compiling the engines on windows.

I found this thread about this issue and your attempt to fix it.

ddugovic commented 7 years ago

Ah, I had completely forgotten that my PolyGlot fork implemented the entire atomic chess ruleset, and for crazyhouse & other variants similar rule changes would need to be made. Honestly, I don't know how to compile for Windows either.

I've not used PyChess or CuteChess, but they may also be options?

ianfab commented 7 years ago

I had also tried PyChess, but the exact same problem occurs there: it does not let stockfish play variants. As far as I know, cutechess only supports atomic and losers.

So I probably have to modify one of PyChess, CuteChess, PolyGlot, and python-chess. The easiest one might be to overwrite the check in the PyChess code, so that it allows Stockfish to play variants. I will think about which one I will give a try.

Edit: CuteChess seems to also support Crazyhouse, and King of the Hill and Racing Kings seem to be added soon. I will try it.

stockfishdeveloper commented 7 years ago

I guess it's going to be a while before variants take off.

ianfab commented 7 years ago

I got it to work with PyChess and CuteChess. Both seem to use UCI_Crazyhouse to detect whether Stockfish is able to play crazyhouse [Edit: more recent versions also support UCI_Variant]. An additional problem with crazyhouse is that Stockfish (so far) uses lowercase letters to represent black piece drops. I changed it and was able to play games in both PyChess and CuteChess.
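For illustration only, the lowercase-drop problem could also be worked around outside the engine by rewriting drop moves before they reach the GUI. This sketch is not the change made in Stockfish itself, and `normalize_drop` is a hypothetical helper name.

```python
import re

# A drop move: piece letter (either case), '@', then a target square.
_DROP = re.compile(r"^([pnbrqPNBRQ])@([a-h][1-8])$")


def normalize_drop(move: str) -> str:
    """Uppercase the piece letter of a drop move; return other moves unchanged."""
    match = _DROP.match(move)
    if match is None:
        return move
    return match.group(1).upper() + "@" + match.group(2)


if __name__ == "__main__":
    for move in ["n@e4", "B@d2", "e2e4"]:
        print(move, "->", normalize_drop(move))
```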

Now it is getting really interesting. Stockfish played one game with black against each of Sjeng and Sunsetter, and it won both games convincingly in 24 and 30 moves, respectively. It even announced a mate in 10 in the game against Sunsetter. I will do some more testing to see whether it is really that strong.

Here is the quite interesting game against Sunsetter:

[Event "Sunsetter - Stockfish"]
[Date "2016.10.31"]
[Round "1"]
[White "sunsetter"]
[Black "Stockfish 301016 64"]
[Result "0-1"]
[TimeControl "60+0"]
[Variant "Crazyhouse"]
[PlyCount "60"]

1. d4 Nf6 2. Nf3 d5 3. Nc3 e6 4. Bg5 Bb4 5. e3 Nbd7 6. Rb1 Bxc3+ 7. bxc3 N@e4
8. B@a5 b6 9. Bxf6 Qxf6 10. Bb4 a5 11. Ba3 Nxc3 12. N@b5 Nxb5 13. Bxb5 N@c3 14.
N@g4 Qf5 15. Nh6 gxh6 16. Bxd7+ Kxd7 17. Ne5+ Qxe5 18. dxe5 Nxd1 19. Q@b5+ Kd8
20. N@c6+ Ke8 21. Rxd1 Q@c3+ 22. Rd2 Qxd2+ 23. Kxd2 N@e4+ 24. Kc1 R@a1+ 25.
N@b1 B@d2+ 26. Kb2 B@c3+ 27. Kb3 a4+ 28. Qxa4 N@c5+ 29. Bxc5 Nxc5+ 30. Ka3 Rxa4# 0-1

Edit: After 50 games with a time control of 10 seconds/40 moves, Stockfish is leading 47 - 3 against Sunsetter. I am going to redo the test with the most recent Sunsetter version.

ddugovic commented 7 years ago

Excellent! The wins over Sjeng are especially surprising given how active I presume its development has been...

I'll ask around for crazyhouse test positions or puzzles.

ianfab commented 7 years ago

According to this rating list, Sunsetter is much better than Sjeng.

Result of Stockfish vs. Sunsetter 9: 44 - 5 - 1 (W - L - D)

It is exciting to beat one of the top crazyhouse engines by such a huge margin. Now I will try TJchess, which supposedly is stronger than Sunsetter.

ianfab commented 7 years ago

Testing with CuteChess resulted in: Score of Stockfish 301016 64 vs TJchess 1.3-x64: 44 - 6 - 0
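As a rough way to read these match scores (44 - 5 - 1 against Sunsetter and 44 - 6 - 0 against TJchess), a score fraction can be converted into an approximate rating gap with the standard logistic Elo formula. This is a back-of-the-envelope sketch, not the BayesElo model, and with only 50 games the uncertainty is large.

```python
import math


def score_fraction(wins: int, losses: int, draws: int) -> float:
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_difference(score: float) -> float:
    """Rating gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    for name, result in [("Sunsetter 9", (44, 5, 1)), ("TJchess 1.3-x64", (44, 6, 0))]:
        s = score_fraction(*result)
        print(f"vs {name}: score {s:.2f}, about {elo_difference(s):.0f} Elo")
```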

I am stunned. I expected Stockfish to be good in Crazyhouse, but to crush the top engines with only very few changes regarding the search and evaluation compared to standard chess is unexpected.

@niklasf,@ornicar: I think it might now be interesting to use Stockfish instead of Sunsetter for Crazyhouse.

ddugovic commented 7 years ago

This result is plausible because Stockfish move ordering is good; review movegen.cpp for how it generates drop moves first when generating subsets of moves, and movepicker.cpp for the killer move heuristic.

stockfishdeveloper commented 7 years ago

This is indeed astonishing. You're doing great work guys!

niklasf commented 7 years ago

Since simply enabling one more variant for Stockfish should be rather easy, we might as well enable it right away.

/cc @georgvonzimmermann

ddugovic commented 7 years ago

> An additional problem with crazyhouse is that Stockfish (so far) uses lowercase letters to represent black piece drops. I changed it and was able to play games in both PyChess and CuteChess.

@ianfab @niklasf Would it be better or worse to use uppercase? (I assume uppercase is a standard somehow in the same sense that uppercase is used for piece moves.)

niklasf commented 7 years ago

ultimately it's no big deal, but i believe uppercase (e.g. B@e4) is the way to go.

ianfab commented 7 years ago

I agree that we should use uppercase. I just opened a pull request.

ddugovic commented 7 years ago

Merged #91.

georgvonzimmermann commented 7 years ago

Congratulations guys, this is excellent!

Stockfish nicely shows the power of collaboration :-). Just think about how much work of how many really excellent programmers went into it. Compare that to, for example, Sunsetter. Sunsetter was written by 2 people, for the most part 10 years ago. While borrowing many concepts of chess programming from different sources, we seldom looked at other people's code. My guess is that each and every part (be it the transposition table, the move ordering, the move generation, etc.) of Stockfish is multiple levels above Sunsetter in terms of speed and quality of programming.

Crazyhouse chess doesn't need all that much evaluation knowledge, therefore Stockfish is not lacking much (if any) knowledge.

I am very much looking forward to seeing how strong Stockfish will become!! If you are interested and haven't done that yet, here is what I would try to make a former chess engine play crazyhouse chess even better:

Thanks to @niklasf for pointing me to this cool development.

Sorry about the very lengthy message, I just got very excited :-). Keep up the good work!

ianfab commented 7 years ago

@georgvonzimmermann: Thanks for your suggestions. The crazyhouse piece values have been tuned with Stockfish's SPSA tuner and I think they are already quite good. The search is subject to tuning sessions and tests (currently razoring, futility and move count pruning) and probably has a lot of room for improvement, even though Stockfish's values also work quite well in crazyhouse.

ianfab commented 7 years ago

I have done new tests with an opening book and more games.

Conditions

Games: 100 for each pairing
Time control: 40 moves in 10 seconds
Hardware: 1 thread on Intel Core i5-4210M
Opening book: ccva-140start.pgn from CCVA

Results

BayesElo with Elo offset of 2600:

Rank Name                  Elo    +    - games score oppo. draws 
   1 Stockfish 011116 64  2774   38   35   200   78%  2513    1% 
   2 TJchess 1.3-x64      2587   32   33   200   48%  2607    2% 
   3 Sunsetter9a00        2439   34   37   200   24%  2680    1% 

Games

PGN file of games: games.pgn.txt

ornicar commented 7 years ago

stockfish now plays & analyses crazyhouse on lichess.org.

Here's a list of games recently played against level 8:

https://en.lichess.org/games/search?hasAi=1&aiLevelMin=8&aiLevelMax=8&perf=18&dateMin=1m&sort.field=a&sort.order=desc

ddugovic commented 7 years ago

Many thanks Georg! We appreciate the kind words.

Lichess Master Atrophied suggests that "if you can move, move; else drop". I'm unsure of the context but it might be interesting to see results if:

a) piece drops generate last, not first
b) pawn drops generate last, not first
c) both a and b

Maybe in some circumstances drops are favorable, based on how much material is in hand?

ianfab commented 7 years ago

Aren't quiet moves and drops sorted by their movepick score anyway?
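A toy illustration of this point: if moves are tried in order of their score, the order in which the generator produced them does not change the result. The scores below are made up, and Stockfish's actual move picker is considerably more involved than a plain sort.

```python
from typing import List, Tuple

Move = Tuple[str, int]  # (move text, hypothetical movepick score)


def pick_order(generated: List[Move]) -> List[str]:
    """Return moves in the order they would be tried: highest score first."""
    ranked = sorted(generated, key=lambda move_score: move_score[1], reverse=True)
    return [move for move, _ in ranked]


if __name__ == "__main__":
    drops_first = [("N@e4", 50), ("B@d2", 10), ("Nf3", 30), ("d4", 20)]
    drops_last = [("Nf3", 30), ("d4", 20), ("N@e4", 50), ("B@d2", 10)]
    # Both generation orders lead to the same try order when scores are distinct.
    print(pick_order(drops_first))
    print(pick_order(drops_last))
```

With tied scores the generation order could still act as a tie-breaker, so the experiment suggested above would only matter in that case.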

ddugovic commented 7 years ago

Yes, you are correct @ianfab !

Closing as this is superseded by #131!