Disservin / fastchess

fastchess is a chess cli tool to run engine vs engine matches
MIT License

Remaining differences between cutechess and fast chess game termination #587

Closed: vondele closed this issue 3 months ago

vondele commented 3 months ago

I ran the CI script for 20000 rounds (after increasing concurrency). The good news is fast-chess managed the games at 2x the speed of cutechess (16 vs 32min). However, the following differences remain, mostly in the priority given to the various draw types.

Lines starting with < are from cutechess, lines starting with > are from fast-chess. (It should be possible to get the corresponding FEN from the book for faster testing.)

+ diff ./cutechess-cli-out.finished ./fast-chess-out.finished
282c282
< Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by stalemate}
---
> Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by insufficient mating material}
12544c12544
< Finished game 12544 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 12544 (sf_2 vs sf_1): 1/2-1/2 {Draw by adjudication}
12555c12555
< Finished game 12555 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 12555 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
14214c14214
< Finished game 14214 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 14214 (sf_2 vs sf_1): 1/2-1/2 {Draw by adjudication}
17393c17393
< Finished game 17393 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 17393 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
21811c21811
< Finished game 21811 (sf_1 vs sf_2): 1/2-1/2 {Draw by stalemate}
---
> Finished game 21811 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
25158c25158
< Finished game 25158 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 25158 (sf_2 vs sf_1): 1/2-1/2 {Draw by adjudication}
29756c29756
< Finished game 29756 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 29756 (sf_2 vs sf_1): 1/2-1/2 {Draw by adjudication}
32721c32721
< Finished game 32721 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 32721 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
32971c32971
< Finished game 32971 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 32971 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
37829c37829
< Finished game 37829 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 37829 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
37913c37913
< Finished game 37913 (sf_1 vs sf_2): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 37913 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
37978c37978
< Finished game 37978 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 37978 (sf_2 vs sf_1): 1/2-1/2 {Draw by adjudication}
39185c39185
< Finished game 39185 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 39185 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
39499c39499
< Finished game 39499 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 39499 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
39539c39539
< Finished game 39539 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
---
> Finished game 39539 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}
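
To see which priority conflicts dominate, the two `.finished` files can be tallied rather than read line by line. The following is a minimal Python sketch, not part of fast-chess or the CI script: the function names `parse_finished` and `tally_differences` are hypothetical, and the line format is assumed to be exactly the one shown in the diff above.

```python
import re
from collections import Counter

# Assumed line format, taken from the diff above:
#   Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by stalemate}
FINISHED = re.compile(r"Finished game (\d+) \((.*?)\): (\S+) \{(.*)\}")


def parse_finished(lines):
    """Map game number -> (result, termination reason)."""
    games = {}
    for line in lines:
        m = FINISHED.search(line)
        if m:
            games[int(m.group(1))] = (m.group(3), m.group(4))
    return games


def tally_differences(cute_lines, fast_lines):
    """Count (cutechess reason, fast-chess reason) pairs for games that differ."""
    cute = parse_finished(cute_lines)
    fast = parse_finished(fast_lines)
    pairs = Counter()
    for game in sorted(cute.keys() & fast.keys()):
        if cute[game] != fast[game]:
            pairs[(cute[game][1], fast[game][1])] += 1
    return pairs


# Small sample copied from the diff above, plus one game that agrees.
cute_sample = [
    "Finished game 281 (sf_1 vs sf_2): 1-0 {White mates}",
    "Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by stalemate}",
    "Finished game 12544 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}",
    "Finished game 21811 (sf_1 vs sf_2): 1/2-1/2 {Draw by stalemate}",
]
fast_sample = [
    "Finished game 281 (sf_1 vs sf_2): 1-0 {White mates}",
    "Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by insufficient mating material}",
    "Finished game 12544 (sf_2 vs sf_1): 1/2-1/2 {Draw by adjudication}",
    "Finished game 21811 (sf_1 vs sf_2): 1/2-1/2 {Draw by adjudication}",
]

for (cute_reason, fast_reason), n in tally_differences(cute_sample, fast_sample).most_common():
    print(f"{n:3d}  cutechess: {cute_reason!r}  fast-chess: {fast_reason!r}")
```

Counting the 16 differing games listed above by hand gives the same grouping: 13 cases of 3-fold repetition vs adjudication, and one each of stalemate vs insufficient mating material, stalemate vs adjudication, and fifty moves rule vs adjudication.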
vondele commented 3 months ago

FYI, the ply counts of all games seem to match. So, it is probably only a question of priorities.

gahtan-syarif commented 3 months ago

https://github.com/Disservin/fast-chess/pull/588

gahtan-syarif commented 3 months ago
< Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by stalemate}
---
> Finished game 282 (sf_2 vs sf_1): 1/2-1/2 {Draw by insufficient mating material}

this one has to be fixed from chesslib side

Disservin commented 3 months ago

> this one has to be fixed from chesslib side

all the functions are public i think, you should be able to just define our own in match.cpp/hpp which has the cutechess order

https://github.com/Disservin/chess-library/blob/master/src/board.hpp#L674

gahtan-syarif commented 3 months ago

> > this one has to be fixed from chesslib side
>
> all the functions are public i think, you should be able to just define our own in match.cpp/hpp which has the cutechess order
>
> https://github.com/Disservin/chess-library/blob/master/src/board.hpp#L674

i think it's better if it's done on the chess lib side so it will be consistent, and stalemate is indeed a more important draw compared to insufficient material

robertnurnberg commented 3 months ago

Agreed. FIDE rules of course allow playing on with insufficient material, while stalemate is game ending.

Disservin commented 3 months ago

mh i somewhat ordered the checks in the chess lib from cheapest to most expensive so i am not sure

Disservin commented 3 months ago

i'd anyway reimplement this here, since otherwise it would rely on the exact order of statements in that function, which is something i'd like to avoid
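
One way to avoid depending on the statement order inside a library function is to keep the priority as explicit data in the match code. The sketch below is in Python for brevity and is not the fast-chess or chess-library implementation: `PositionFlags`, `DRAW_PRIORITY` and `draw_reason` are hypothetical names. The order is only what the diffs in this thread suggest (stalemate before insufficient material, fifty moves rule before 3-fold repetition, every rule-based draw before adjudication); where insufficient material sits relative to the fifty moves rule and repetition is an assumption, since no game in the thread shows it.

```python
from dataclasses import dataclass


@dataclass
class PositionFlags:
    """Hypothetical summary of which draw conditions hold after a move."""
    stalemate: bool = False
    insufficient_material: bool = False
    fifty_move_rule: bool = False
    threefold_repetition: bool = False
    adjudicated_draw: bool = False


# Assumed priority, highest first, inferred from the diffs in this thread.
# The slot of "insufficient_material" relative to the next two entries is a guess.
DRAW_PRIORITY = [
    ("stalemate", "Draw by stalemate"),
    ("insufficient_material", "Draw by insufficient mating material"),
    ("fifty_move_rule", "Draw by fifty moves rule"),
    ("threefold_repetition", "Draw by 3-fold repetition"),
    ("adjudicated_draw", "Draw by adjudication"),
]


def draw_reason(flags):
    """Return the highest-priority draw reason that applies, or None."""
    for attribute, reason in DRAW_PRIORITY:
        if getattr(flags, attribute):
            return reason
    return None


# A position that is both stalemate and insufficient material (like game 282)
# is reported as stalemate under this order.
print(draw_reason(PositionFlags(stalemate=True, insufficient_material=True)))
```

Because the order lives in one list, changing a precedence is a one-line edit and does not depend on how cheap or expensive each check is in the library.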

Disservin commented 3 months ago

> The good news is fast-chess managed the games at 2x the speed of cutechess (16 vs 32min)

At which TC? A while ago there was this discussion https://github.com/Disservin/fast-chess/discussions/212 which listed the following times:

tc=0.2+0.002
Score of eng1 vs eng2: 9966 - 9643 - 391 [0.508] 20000
fast-chess - time 16 min 43.44 sec

Score of eng1 vs eng2: 10510 - 8607 - 883 [0.548] 20000
c-chess-cli - time 4 min 34.75 sec

Score of eng1 vs eng2: 9837 - 9687 - 476 [0.504] 20000
cutechess - time 6 min 45.32 sec

games aren't identical here but the time for fastchess is very different than for the other game managers, could be worth a retest... i suspect the extensive use of mutexes can slow things down at a very short tc
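
For reference, the bracketed values in the quoted results are consistent with the usual score fraction (wins + draws/2) / games, assuming the three counts are wins, losses and draws from eng1's point of view. A short Python sketch that recomputes them and puts the wall-clock times on a common scale; the helper names `score` and `seconds` are made up for this example:

```python
def score(wins, losses, draws):
    """Score fraction for the first engine: a win counts 1, a draw counts 1/2."""
    return (wins + 0.5 * draws) / (wins + losses + draws)


def seconds(minutes, secs):
    """Convert 'M min S sec' to seconds."""
    return 60 * minutes + secs


# (wins, losses, draws) and elapsed time, copied from the results quoted above.
runs = {
    "fast-chess": ((9966, 9643, 391), seconds(16, 43.44)),
    "c-chess-cli": ((10510, 8607, 883), seconds(4, 34.75)),
    "cutechess": ((9837, 9687, 476), seconds(6, 45.32)),
}

baseline = runs["cutechess"][1]
for name, (wld, elapsed) in runs.items():
    print(f"{name:12s} score={score(*wld):.5f} "
          f"time={elapsed:.2f}s ({elapsed / baseline:.2f}x cutechess)")
```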

vondele commented 3 months ago

This is for the setup as in the checking script, i.e. fixed depth 6 vs depth 8, and high concurrency (I forgot the exact value, somewhere in the range 25-32 on a 16c/32t CPU). With fastchess I saw the manager at about 180% CPU and the SFs at around 50-60%; with cutechess the manager was at about 2500% CPU and the SFs at around 6-7%.

With cutechess, most time was in 'system', which usually is IO, or mutex, or similar.

Probably one could use such a benchmark to profile where fast-chess is spending its time.

gahtan-syarif commented 3 months ago

tested it at 0.2+0.002:

-recover -repeat -games 2 -rounds 230 -tournament gauntlet -pgnout newformat.pgn -site "https://tests.stockfishchess.org/tests/view/66872b0ffe7b81f5e163473c" -event "Batch 295: master vs master" -srand 12345 -resign movecount=3 score=600 -draw movenumber=34 movecount=8 score=20 -variant standard -concurrency 1 -openings file=C:\cutechess-1.3.1-win64\books\UHO_Lichess_4852_v1.epd format=epd order=sequential plies=16 start=52569 -engine  cmd="Stockfish dev-20240709-362a77a3.exe" name=New-ee6fc7e38b4aeef44862159215a56d97122f59a0 tc=0.2+0.002 dir="C:/Chess Engines/" option.Hash=16 -engine name=Base-ee6fc7e38b4aeef44862159215a56d97122f59a0 tc=0.2+0.002 cmd="Stockfish dev-20240709-362a77a3.exe" dir="C:/Chess Engines/" option.Hash=16 -each proto=uci option.Threads=1

fc: 70s cutechess: 57s

fc takes 22% more time, but i suspect that the difference is not linear and decreases as test duration increases

vondele commented 3 months ago

I'll have a look at this tonight as well, but I would right now separate out this performance topic from the output differences.

Disservin commented 3 months ago

should be fixed unless i missed some other precedence

vondele commented 3 months ago

One ordering is still missing.

$ grep "Finished game 9313" *fini*
cutechess-cli-out.finished:Finished game 9313 (sf_1 vs sf_2): 1/2-1/2 {Draw by fifty moves rule}
fast-chess-out.finished:Finished game 9313 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}

the pgn diff also indicates two games with different PlyCount... now harder to find which ones, I'll see if I can...

============== Finished game output ================
9313c9313
< Finished game 9313 (sf_1 vs sf_2): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 9313 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
9981c9981
< Finished game 9981 (sf_1 vs sf_2): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 9981 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
14028c14028
< Finished game 14028 (sf_2 vs sf_1): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 14028 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
29313c29313
< Finished game 29313 (sf_1 vs sf_2): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 29313 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
29981c29981
< Finished game 29981 (sf_1 vs sf_2): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 29981 (sf_1 vs sf_2): 1/2-1/2 {Draw by 3-fold repetition}
34028c34028
< Finished game 34028 (sf_2 vs sf_1): 1/2-1/2 {Draw by fifty moves rule}
---
> Finished game 34028 (sf_2 vs sf_1): 1/2-1/2 {Draw by 3-fold repetition}
============== pgn result ==========================
14987,14988c14987,14988
< 68
< 68
---
> 67
> 67
vondele commented 3 months ago
awk '{if (match($0,"Round") || match($0,"White \"") || match($0, "Black \"")) {printf("%s ",$0)}; if (match($0,"PlyCount")) {printf("%s \n", $0)};}' cutechess-cli-out.pgn | sort > cutechess-cli-out.plycountMore
awk '{if (match($0,"Round") || match($0,"White \"") || match($0, "Black \"")) {printf("%s ",$0)}; if (match($0,"PlyCount")) {printf("%s \n", $0)};}' fast-chess-out.pgn | sort > fast-chess-out.plycountMore

yields:

$ diff fast-chess-out.plycountMore cutechess-cli-out.plycountMore
4862c4862
< [Round "12189"] [White "sf_2"] [Black "sf_1"] [PlyCount "67"] 
---
> [Round "12189"] [White "sf_2"] [Black "sf_1"] [PlyCount "68"] 
22642c22642
< [Round "2189"] [White "sf_2"] [Black "sf_1"] [PlyCount "67"] 
---
> [Round "2189"] [White "sf_2"] [Black "sf_1"] [PlyCount "68"] 
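
The same comparison can be done without relying on tag order or on `sort`, by keying each game on its tags. This is a rough Python equivalent of the awk one-liners above, offered as a sketch only: `ply_counts` and `ply_count_differences` are hypothetical helper names, and the parser assumes one tag pair per line and that every game starts with an `[Event ...]` tag, as in the PGN files from both tools.

```python
import re

# One PGN tag pair per line, e.g. [PlyCount "67"]
TAG = re.compile(r'^\[(\w+) "(.*)"\]\s*$')


def ply_counts(pgn_text):
    """Map (Round, White, Black) -> PlyCount for every game in a PGN string."""
    counts = {}
    tags = {}

    def flush():
        # Record the game collected so far, if it carried a PlyCount tag.
        if "PlyCount" in tags:
            key = (tags.get("Round"), tags.get("White"), tags.get("Black"))
            counts[key] = int(tags["PlyCount"])

    for line in pgn_text.splitlines():
        m = TAG.match(line)
        if not m:
            continue  # movetext or blank line
        name, value = m.groups()
        if name == "Event":  # assumed to be the first tag of each game
            flush()
            tags.clear()
        tags[name] = value
    flush()
    return counts


def ply_count_differences(pgn_a, pgn_b):
    """Return {game key: (PlyCount in a, PlyCount in b)} for games that differ."""
    a, b = ply_counts(pgn_a), ply_counts(pgn_b)
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


# Trimmed-down example using only the tags that matter here.
fast_pgn = '[Event "Fast-Chess Tournament"]\n[Round "2189"]\n[White "sf_2"]\n[Black "sf_1"]\n[PlyCount "67"]\n\n1... d6 1/2-1/2\n'
cute_pgn = '[Event "?"]\n[Round "2189"]\n[White "sf_2"]\n[Black "sf_1"]\n[PlyCount "68"]\n\n1... d6 1/2-1/2\n'
print(ply_count_differences(fast_pgn, cute_pgn))
```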

and the corresponding games fc:

[Event "Fast-Chess Tournament"]
[Site "?"]
[Date "2024.07.16"]
[Round "2189"]
[White "sf_2"]
[Black "sf_1"]
[Result "1/2-1/2"]
[SetUp "1"]
[FEN "rnbqk2r/ppppppbp/5np1/8/8/2PPPP2/PP4PP/RNBQKBNR b KQkq - 0 1"]
[GameDuration "00:00:00"]
[GameStartTime "2024-07-16T23:43:27 +0200"]
[GameEndTime "2024-07-16T23:43:28 +0200"]
[PlyCount "67"]
[Termination "adjudication"]
[TimeControl "-"]

1... d6 {+0.67/6 0.001s} 2. d4 {-0.48/8 0.017s} O-O {+0.51/6 0.001s}
3. Bd3 {-0.60/8 0.007s} e5 {+0.81/6 0.001s} 4. Ne2 {-0.58/8 0.005s}
c5 {+0.56/6 0.002s} 5. O-O {-0.77/8 0.031s} Nc6 {+0.47/6 0.005s}
6. dxc5 {-0.73/8 0.005s} e4 {+0.44/6 0.002s} 7. Bxe4 {-0.96/8 0.002s}
Nxe4 {+0.57/6 0.001s} 8. fxe4 {-1.39/8 0.005s} dxc5 {+0.79/6 0.004s}
9. Nf4 {-1.53/8 0.006s} Ne5 {+0.64/6 0.002s} 10. Nd2 {-1.09/8 0.007s}
b6 {+2.24/6 0.001s} 11. c4 {-0.84/8 0.006s} Ba6 {+1.73/6 0.001s}
12. Qe2 {-1.10/8 0.006s} Qc8 {+1.66/6 0.004s} 13. Nd5 {-0.22/8 0.010s}
Qd7 {+1.02/6 0.001s} 14. b3 {+0.44/8 0.004s} Nxc4 {+1.04/6 0.001s}
15. Nxc4 {+0.53/8 0.003s} Bxa1 {+4.06/6 0.000s} 16. Bb2 {+0.91/8 0.000s}
Bxb2 {-0.04/6 0.001s} 17. Qxb2 {+3.85/8 0.004s} f5 {+0.17/6 0.000s}
18. Nf6+ {-0.39/8 0.003s} Rxf6 {+0.71/6 0.000s} 19. Qxf6 {-0.36/8 0.002s}
Rf8 {+2.75/6 0.002s} 20. Qc3 {-0.27/8 0.023s} Qg7 {+0.41/6 0.001s}
21. Qd3 {+0.15/8 0.077s} Bb7 {+0.19/6 0.008s} 22. Nd6 {+0.67/8 0.014s}
Bxe4 {-0.40/6 0.003s} 23. Nxe4 {+0.23/8 0.052s} fxe4 {-0.40/6 0.004s}
24. Rxf8+ {-0.05/8 0.084s} Qxf8 {-0.20/6 0.000s} 25. Qxe4 {-0.02/8 0.004s}
Qd8 {-0.11/6 0.001s} 26. Kf2 {-0.02/8 0.013s} Qd2+ {+0.14/6 0.001s}
27. Kf3 {-0.01/8 0.006s} Kf7 {+0.00/6 0.005s} 28. Qb7+ {+0.05/8 0.031s}
Ke6 {+0.09/6 0.002s} 29. Qe4+ {+0.00/8 0.020s} Kd7 {+0.00/6 0.000s}
30. Qa4+ {-0.01/8 0.008s} Ke6 {-0.09/6 0.001s} 31. Qe8+ {+0.02/8 0.019s}
Kf6 {-0.17/6 0.003s} 32. Qh8+ {+0.02/8 0.002s} Ke6 {+0.20/6 0.000s}
33. Qe8+ {+0.00/8 0.002s} Kf6 {+0.00/6 0.001s} 34. Qf8+ {+0.00/8 0.000s}
Ke6 {-0.02/6 0.001s Draw by adjudication} 1/2-1/2

cc:

[Event "?"]
[Site "?"]
[Date "2024.07.16"]
[Round "2189"]
[White "sf_2"]
[Black "sf_1"]
[Result "1/2-1/2"]
[BlackTimeControl "inf"]
[FEN "rnbqk2r/ppppppbp/5np1/8/8/2PPPP2/PP4PP/RNBQKBNR b KQkq - 0 1"]
[GameDuration "00:00:00"]
[GameEndTime "2024-07-16T23:20:22.149 CEST"]
[GameStartTime "2024-07-16T23:20:21.224 CEST"]
[PlyCount "68"]
[SetUp "1"]
[Termination "adjudication"]
[WhiteTimeControl "inf"]

1... d6 {+0.67/6 0.007s} 2. d4 {-0.48/8 0.022s} O-O {+0.51/6 0.005s}
3. Bd3 {-0.60/8 0.016s} e5 {+0.81/6 0.007s} 4. Ne2 {-0.58/8 0.022s}
c5 {+0.56/6 0.011s} 5. O-O {-0.77/8 0.023s} Nc6 {+0.47/6 0.011s}
6. dxc5 {-0.73/8 0.019s} e4 {+0.44/6 0.006s} 7. Bxe4 {-0.96/8 0.018s}
Nxe4 {+0.57/6 0.004s} 8. fxe4 {-1.39/8 0.016s} dxc5 {+0.79/6 0.004s}
9. Nf4 {-1.53/8 0.012s} Ne5 {+0.64/6 0.006s} 10. Nd2 {-1.09/8 0.009s}
b6 {+2.24/6 0.005s} 11. c4 {-0.84/8 0.011s} Ba6 {+1.73/6 0.005s}
12. Qe2 {-1.10/8 0.008s} Qc8 {+1.66/6 0.005s} 13. Nd5 {-0.22/8 0.008s}
Qd7 {+1.02/6 0.005s} 14. b3 {+0.44/8 0.008s} Nxc4 {+1.04/6 0.004s}
15. Nxc4 {+0.53/8 0.014s} Bxa1 {+4.06/6 0.004s} 16. Bb2 {+0.91/8 0.006s}
Bxb2 {-0.04/6 0.010s} 17. Qxb2 {+3.85/8 0.019s} f5 {+0.17/6 0.005s}
18. Nf6+ {-0.39/8 0.010s} Rxf6 {+0.71/6 0.007s} 19. Qxf6 {-0.36/8 0.013s}
Rf8 {+2.75/6 0.005s} 20. Qc3 {-0.27/8 0.017s} Qg7 {+0.41/6 0.006s}
21. Qd3 {+0.15/8 0.014s} Bb7 {+0.19/6 0.008s} 22. Nd6 {+0.67/8 0.007s}
Bxe4 {-0.40/6 0.012s} 23. Nxe4 {+0.23/8 0.027s} fxe4 {-0.40/6 0.018s}
24. Rxf8+ {-0.05/8 0.029s} Qxf8 {-0.20/6 0.005s} 25. Qxe4 {-0.02/8 0.021s}
Qd8 {-0.11/6 0.007s} 26. Kf2 {-0.02/8 0.009s} Qd2+ {+0.14/6 0.010s}
27. Kf3 {-0.01/8 0.013s} Kf7 {0.00/6 0.007s} 28. Qb7+ {+0.05/8 0.015s}
Ke6 {+0.09/6 0.007s} 29. Qe4+ {0.00/8 0.013s} Kd7 {0.00/6 0.007s}
30. Qa4+ {-0.01/8 0.016s} Ke6 {-0.09/6 0.004s} 31. Qe8+ {+0.02/8 0.014s}
Kf6 {-0.17/6 0.009s} 32. Qh8+ {+0.02/8 0.020s} Ke6 {+0.20/6 0.005s}
33. Qe8+ {0.00/8 0.016s} Kf6 {0.00/6 0.007s} 34. Qf8+ {0.00/8 0.013s}
Ke6 {-0.02/6 0.006s} 35. Qg8+ {0.00/8 0.021s, Draw by adjudication} 1/2-1/2
vondele commented 3 months ago

The game ends at move 34 (fc) or 35 (cc), which is suspiciously close to the movenumber=34 in -draw movenumber=34 movecount=8 score=20.

I'll check the other game. Same there: it also ends at move number 34.

[Event "Fast-Chess Tournament"]
[Site "?"]
[Date "2024.07.16"]
[Round "12189"]
[White "sf_2"]
[Black "sf_1"]
[Result "1/2-1/2"]
[SetUp "1"]
[FEN "rnbqk2r/ppppppbp/5np1/8/8/2PPPP2/PP4PP/RNBQKBNR b KQkq - 0 1"]
[GameDuration "00:00:00"]
[GameStartTime "2024-07-16T23:51:06 +0200"]
[GameEndTime "2024-07-16T23:51:07 +0200"]
[PlyCount "67"]
[Termination "adjudication"]
[TimeControl "-"]

1... d6 {+0.67/6 0.002s} 2. d4 {-0.48/8 0.026s} O-O {+0.51/6 0.004s}
3. Bd3 {-0.60/8 0.008s} e5 {+0.81/6 0.001s} 4. Ne2 {-0.58/8 0.006s}
c5 {+0.56/6 0.002s} 5. O-O {-0.77/8 0.075s} Nc6 {+0.47/6 0.004s}
6. dxc5 {-0.73/8 0.020s} e4 {+0.44/6 0.007s} 7. Bxe4 {-0.96/8 0.023s}
Nxe4 {+0.57/6 0.003s} 8. fxe4 {-1.39/8 0.018s} dxc5 {+0.79/6 0.001s}
9. Nf4 {-1.53/8 0.024s} Ne5 {+0.64/6 0.004s} 10. Nd2 {-1.09/8 0.014s}
b6 {+2.24/6 0.000s} 11. c4 {-0.84/8 0.002s} Ba6 {+1.73/6 0.001s}
12. Qe2 {-1.10/8 0.004s} Qc8 {+1.66/6 0.001s} 13. Nd5 {-0.22/8 0.008s}
Qd7 {+1.02/6 0.002s} 14. b3 {+0.44/8 0.004s} Nxc4 {+1.04/6 0.003s}
15. Nxc4 {+0.53/8 0.008s} Bxa1 {+4.06/6 0.001s} 16. Bb2 {+0.91/8 0.001s}
Bxb2 {-0.04/6 0.002s} 17. Qxb2 {+3.85/8 0.003s} f5 {+0.17/6 0.001s}
18. Nf6+ {-0.39/8 0.004s} Rxf6 {+0.71/6 0.000s} 19. Qxf6 {-0.36/8 0.003s}
Rf8 {+2.75/6 0.001s} 20. Qc3 {-0.27/8 0.008s} Qg7 {+0.41/6 0.001s}
21. Qd3 {+0.15/8 0.014s} Bb7 {+0.19/6 0.002s} 22. Nd6 {+0.67/8 0.003s}
Bxe4 {-0.40/6 0.003s} 23. Nxe4 {+0.23/8 0.024s} fxe4 {-0.40/6 0.001s}
24. Rxf8+ {-0.05/8 0.038s} Qxf8 {-0.20/6 0.000s} 25. Qxe4 {-0.02/8 0.007s}
Qd8 {-0.11/6 0.001s} 26. Kf2 {-0.02/8 0.009s} Qd2+ {+0.14/6 0.002s}
27. Kf3 {-0.01/8 0.025s} Kf7 {+0.00/6 0.019s} 28. Qb7+ {+0.05/8 0.026s}
Ke6 {+0.09/6 0.002s} 29. Qe4+ {+0.00/8 0.013s} Kd7 {+0.00/6 0.001s}
30. Qa4+ {-0.01/8 0.006s} Ke6 {-0.09/6 0.001s} 31. Qe8+ {+0.02/8 0.018s}
Kf6 {-0.17/6 0.002s} 32. Qh8+ {+0.02/8 0.001s} Ke6 {+0.20/6 0.000s}
33. Qe8+ {+0.00/8 0.002s} Kf6 {+0.00/6 0.001s} 34. Qf8+ {+0.00/8 0.000s}
Ke6 {-0.02/6 0.000s Draw by adjudication} 1/2-1/2

Edit: I see that I cycle through the book twice, so it is the same FEN.
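
Since the opening FEN has Black to move at move 1, ply counts and move numbers are shifted by one half-move compared with a game that starts with White. The small Python sketch below only does that bookkeeping; `move_label` is a hypothetical helper, and it makes no claim about which counting convention either tool applies to `movenumber`, which the thread does not establish.

```python
def move_label(ply, start_fullmove=1, black_to_move=False):
    """Return (fullmove number, side) for the ply-th half-move of a game (1-based)."""
    # Half-moves elapsed since White's move of start_fullmove.
    offset = (ply - 1) + (1 if black_to_move else 0)
    side = "black" if offset % 2 else "white"
    return start_fullmove + offset // 2, side


# Start position from the FEN above: "... b KQkq - 0 1" (Black to move, move 1).
for ply in (1, 2, 67, 68):
    number, side = move_label(ply, start_fullmove=1, black_to_move=True)
    print(f"ply {ply:2d} -> move {number} ({side})")
```

With this start position, ply 67 is Black's 34th move (34... Ke6, where the fast-chess games stop) and ply 68 is White's 35th move (35. Qg8+, where the cutechess game stops), which matches the PlyCount values 67 and 68 in the PGNs above.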