niklasf / python-chess

A chess library for Python, with move generation and validation, PGN parsing and writing, Polyglot opening book reading, Gaviota tablebase probing, Syzygy tablebase probing, and UCI/XBoard engine communication
https://python-chess.readthedocs.io/en/latest/
GNU General Public License v3.0
2.46k stars 531 forks source link

Minor tweak suggestions ... #365

Closed ChrisWhittington closed 5 years ago

ChrisWhittington commented 5 years ago
niklasf commented 5 years ago

Thanks for the suggestions. I took the liberty to number the items.

(1) Adding count() would kindof pretend that we can do an efficient count, however the best this library can do is basically sum(1 for _ in iter(lambda: chess.pgn.skip_game(pgn), None)).

(2) That sounds like a bug. Please share a PGN that causes it.

(3) board.transform(chess.flip_horizontal)

(4) It should not (other than captures and en-passant, of course). Please share an example.

(5) See #361, but very likely out of scope for the library.

ChrisWhittington commented 5 years ago

Thanks for quick response.

I can try to find the crashing PGN, but it's somewhere in the middle of Rebel's 2800 plus fraction of what are basically a mass of CCRL and CEGT games. 1.7 million games. Many of which contain illegal moves, somehow. A cleaned up PGN database works fine.

If I had a way to suppress warnings, I could run my way through that database, printing the pgns to console until crash back to system, but with a very high verbose Python warning count filling the screen, that's not really doable.

Will see if I can find a (4)

ChrisWhittington commented 5 years ago
niklasf commented 5 years ago

An example for (6) would also help. It should always be returned as a list of moves (probably most of the time with 1 element).

(2) A quick and dirty way would be import logging; logging.basicConfig(level=logging.CRITICAL). Or chess.pgn.GameCreator could be subclassed to overwrite handle_error.

ChrisWhittington commented 5 years ago

some epds have the move before the delimiter, and some after. some use comma and some semi colon to delimit. some have bm prefix and some not etc. some have a list of moves, some have multiple moves but with commenting in between probably mostof the decoding of weird formats is best left to the programmer, my suggestion was just to be able to parse the move itself.

some epds in e2e4 format:

1rr3k1/2q2pbp/1p1p2p1/n2Pp3/R1P1P3/2NQ2P1/5P1P/2R2BK1 b - - 0 1, bm a5b3 , id lc0 test suitea5b3 Score 1/1 = 100.00%
2rr2k1/1p1bppbp/pBqp1np1/4n3/N1P1P3/1P3P2/P1NQB1PP/2RR2K1 w - - 0 1, bm c2b4 , id lc0 test suitec2b4 Score 2/2 = 100.00%
2B1k2r/4bp1p/pq1p1p2/1pn1p3/4P3/4N3/PP1N1PPP/1K1R1R2 b - - 0 1, bm c5e6 , id lc0 test suitea6a5 Score 2/3 = 66.67%
8/8/8/7p/1k4p1/1P6/2K2P1P/8 w - - 0 1, bm c2b2 , id lc0 test suitec2b2 Score 3/4 = 75.00%
1r3rk1/2q5/3p1pp1/n1pPp1n1/ppP1P3/1P4N1/P1BQ1PKR/7R w - - 0 1, bm d2g5 , id lc0 test suitef2f4 Score 3/5 = 60.00%

and some in Nf3 format

1rr3k1/2q2pbp/1p1p2p1/n2Pp3/R1P1P3/2NQ2P1/5P1P/2R2BK1 b - - 0 1, bm a5b3 , id lc0 test suitea5b3 Score 1/1 = 100.00%
2rr2k1/1p1bppbp/pBqp1np1/4n3/N1P1P3/1P3P2/P1NQB1PP/2RR2K1 w - - 0 1, bm c2b4 , id lc0 test suitec2b4 Score 2/2 = 100.00%
2B1k2r/4bp1p/pq1p1p2/1pn1p3/4P3/4N3/PP1N1PPP/1K1R1R2 b - - 0 1, bm c5e6 , id lc0 test suitea6a5 Score 2/3 = 66.67%
8/8/8/7p/1k4p1/1P6/2K2P1P/8 w - - 0 1, bm c2b2 , id lc0 test suitec2b2 Score 3/4 = 75.00%
ChrisWhittington commented 5 years ago

(4), just realised what it is about is_capture() is not really a bug, more a user-problem(!), documentation to say: 'these tests require an unmoved board status' ...?

this works ...

for move in game.mainline_moves(): if board.is_capture(move)):

this doesn't work

for move in game.mainline_moves(): board.push(move) if board.is_capture(move)):

ChrisWhittington commented 5 years ago

Below are some examples of PGNs that went wrong, but not enough wrong to crash.

PGN parser finds an illegal move, gives a warning, then continues parsing until it finds a (random) legal move which it then adds in at the previous fail point and continues parsing for more. If you then save the resulting PGN, it consists of the original up to the illegal move fail, then whatever moves it manages to add. So PGN != original PGN

I would guess that "wrong enough to crash" is probably after several move fails building up.

Maybe stop parsing at one fail, and skip on to next PGN in file, with original faulty PGN as text dumped somewhere. Would allow a) manual inspection of failure and b) not fail the entire run

Here some examples of truncated and falsely appended PGN because of illegal moves. eg, these games don't actually exist

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2006.09.27"]
[Round "1"]
[White "Zap!Chess Paderborn 64-bit 4CPU"]
[Black "Loop 10.32f"]
[Result "1-0"]
[BlackElo "2812"]
[WhiteElo "2838"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. Be2 e6 7. O-O Be7 8. f4
O-O 9. Kh1 Qc7 10. Be3 Nc6 11. a4 Re8 12. Bf3 Rb8 13. Qd2 Na5 14. b3 e5 15.
Nde2 Bd7 16. fxe5 dxe5 17. Nd5 Nxd5 18. exd5 g6 19. Ng3 f5 20. d6 Bxd6 21. Rad1
Re6 22. Bd5 Be8 23. Bh6 Bf7 24. Kg1 Nc6 25. c3 1-0

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2006.10.23"]
[Round "1"]
[White "Rybka 1.2 32-bit"]
[Black "Rybka 1.1 32-bit"]
[Result "0-1"]
[BlackElo "2853"]
[WhiteElo "2863"]

1. e4 c5 2. Nf3 Nc6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 d6 6. Bg5 e6 7. Qd2 a6 8.
O-O-O h6 9. Be3 Be7 10. f4 Nxd4 11. Bxd4 b5 12. Kb1 Bb7 13. e5 dxe5 14. fxe5
Nd5 15. Ne4 Rc8 16. Bd3 O-O 17. c3 Nb4 18. cxb4 Qxd4 19. Nd6 Qd5 20. Nxc8 Rxc8
21. Rhe1 Bg5 22. Qf2 Qxg2 23. Qxg2 Bxg2 24. Be4 Bh3 25. Rd3 Bg4 26. h3 Bh5 27.
Rd6 Rc4 28. a3 Rxe4 29. Rxe4 Bg6 30. Ka2 Bxe4 31. Kb3 h5 32. a4 a5 0-1

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2006.10.23"]
[Round "1"]
[White "Rybka 2.1 32-bit"]
[Black "Rybka 1.1 32-bit"]
[Result "0-1"]
[BlackElo "2853"]
[WhiteElo "2856"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. Be2 e5 7. O-O Nc6 8. d5
Ne7 9. b4 a5 10. Ba3 axb4 11. Bxb4 Nd7 12. a4 Bh6 13. h3 f5 14. Bd3 Bf4 15. h4
Nf6 16. g3 fxe4 17. Nxe4 Nxe4 18. Bxe4 Bh6 19. c5 Bf5 20. Re1 Bxe4 21. Rxe4 Nf5
22. Qe2 Qd7 23. a5 Rf7 24. cxd6 cxd6 25. h5 Raf8 26. hxg6 hxg6 27. Qd3 Ne7 28.
Nxe5 dxe5 29. Rxe5 Rxf2 30. Rxe7 Qg4 31. Bd6 R2f3 32. Qe4 Be3+ 33. Qxe3 Rxe3
34. Rxe3 Qd4 35. Be5 Qxd5 36. Bc3 Qc5 37. Kg2 g5 38. Re8 g4 39. Re3 Qf5 40. Rc1
Qb1 41. Kg1 Qb4 42. Kg2 Qe4+ 43. Kg1 Qc2 44. Bd4 Qb3 45. a6 Qb1 46. Kh2 bxa6
47. Ba7 Qa2+ 0-1

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2006.10.25"]
[Round "1"]
[White "Rybka 1.1 32-bit"]
[Black "Rybka 2.1 32-bit"]
[Result "0-1"]
[BlackElo "2856"]
[WhiteElo "2853"]

1. d4 Nf6 2. c4 e6 3. Nc3 Bb4 4. Qc2 O-O 5. a3 Bxc3+ 6. Qxc3 b6 7. Bg5 Bb7 8.
f3 d6 9. e3 Nbd7 10. Bd3 c5 11. Ne2 Rc8 12. O-O Ba6 13. b3 h6 14. Bf4 cxd4 15.
exd4 Nd5 16. Qd2 Nxf4 17. Qxf4 d5 18. Rac1 dxc4 19. Bxc4 Bxc4 20. bxc4 e5 21.
dxe5 Qe7 22. Rfe1 Nxe5 23. Nd4 Rfe8 24. Re4 f6 25. Qe3 Kh7 26. Rc3 Qd6 27. Qf2
Re7 28. Qc2 g6 29. Nb5 Qc5+ 30. Qf2 Rd7 31. Qxc5 Rxc5 32. f4 Nxc4 33. Rexc4
Rxb5 34. Rc7 Kg8 35. Rxd7 f5 36. Rc6 a5 37. h4 Rb4 38. a4 0-1

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2007.01.12"]
[Round "1"]
[White "Deep Fritz 10 4CPU"]
[Black "Deep Shredder 10 64-bit 4CPU"]
[Result "1-0"]
[BlackElo "2846"]
[WhiteElo "2840"]

1. d4 e6 2. e4 d5 3. Nc3 Bb4 4. e5 c5 5. a3 Bxc3+ 6. bxc3 Qc7 7. Nf3 Ne7 8. a4
b6 9. Bb5+ Bd7 10. Bd3 Nbc6 11. Ba3 O-O 12. O-O Rfe8 13. Ng5 h6 14. Qh5 Nxe5
15. dxe5 hxg5 16. Qh7+ Kf8 17. Qh8+ Ng8 18. f4 gxf4 19. Bc1 Ke7 20. Qh4+ f6 21.
Bxf4 Qb7 22. Bg6 Bc6 23. Qh8 f5 24. g4 Kd7 25. gxf5 d4 26. f6 gxf6 27. exf6
Nxf6 28. Qxf6 Bd5 29. Bg3 Qc6 30. Qg7+ Re7 31. Rf7 1-0

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2007.02.18"]
[Round "1"]
[White "Naum 2.1 64-bit 4CPU"]
[Black "Deep Shredder 10 64-bit 4CPU"]
[Result "0-1"]
[BlackElo "2846"]
[WhiteElo "2844"]

1. Nf3 d5 2. d4 Nf6 3. c4 e6 4. Nc3 c6 5. e3 Nbd7 6. Qc2 Bd6 7. Bd3 O-O 8. O-O
dxc4 9. Bxc4 b5 10. Be2 Bb7 11. e4 e5 12. Rd1 Qc7 13. g3 Rfe8 14. a3 a5 15.
dxe5 Nxe5 16. Nd4 Bc5 17. Nb3 Bb6 18. Bf4 Qc8 19. Nd4 Qh3 20. f3 Nh5 21. Bxe5
Rxe5 22. Bf1 Qd7 23. Qd3 Qe7 24. f4 Nf6 25. h3 Nxe4 26. Nxb5 Nd2 27. Kf2 Re4
28. Bg2 Re2+ 0-1

[Event "CCRL 40/4"]
[Site "CCRL"]
[Date "2007.02.17"]
[Round "1"]
[White "Naum 2.1 64-bit 4CPU"]
[Black "Deep Fritz 10 4CPU"]
[Result "0-1"]
[BlackElo "2840"]
[WhiteElo "2844"]

1. e4 e6 2. d4 d5 3. Nc3 Bb4 4. e5 c5 5. a3 Bxc3+ 6. bxc3 Ne7 7. Qg4 O-O 8. Bd3
Nbc6 9. Qh5 Ng6 10. Nf3 Qc7 11. O-O c4 12. Be2 f6 13. exf6 Rxf6 14. g3 Bd7 15.
Re1 Raf8 16. Rb1 e5 17. dxe5 Ncxe5 18. Nxe5 Nxe5 19. Be3 b5 20. Qh4 Bc6 21. Bd4
Be8 22. Qh3 Nf3+ 23. Bxf3 Rxf3 24. Qh4 R3f7 25. Re5 Bd7 26. Rbe1 Qc6 27. Bxa7
Bf5 28. Qd4 Be4 29. f4 Ra8 30. Bc5 Rc8 31. Bb4 Qg6 32. Re2 Kh8 33. Rf2 Rf5 34.
Bd6 Rg8 35. Re7 h5 36. Be5 h4 37. Qe3 Rh5 38. Rd2 Rh7 39. Ra7 Qf5 40. Re7 Rh6
41. Qa7 Rg6 42. Qd7 Qh5 43. Qh3 Ra8 44. f5 Bxf5 45. Qg2 Qg5 46. Qxd5 Rf8 47.
Rxg7 Rxg7 48. Qd4 Qd8 49. Qxd8 Rxd8 50. Rd5 Bxc2 51. Rxb5 h3 52. Rb2 Bb3 53.
Kf2 Rd2+ 0-1
ChrisWhittington commented 5 years ago

(1) sum(1 for _ in iter(lambda: chess.pgn.skip_game(pgn), None)) takes about 30 secs for a 1.7 million game database, and takes you to end of database, so it needs to be reloaded. Quite probably needs to be left to the user.

niklasf commented 5 years ago

(1) Yeah ... nothing I can do in general, when given only a stream. pgn.seek(0) can be used to rewind without reopening.

(7) I need this quite frequently myself, but never got around to do it. Now implemented in 65e872995c6380522d3839e60a7235f06899fced. Defaults to 0 1.

Will look into the examples now ...

niklasf commented 5 years ago

(6) First of all the examples are not valid EPDs (seperator should be ;). Anyway: The issue appears to be that moves are not in SAN notation. I believe that's invalid, strictly speaking (https://www.chessprogramming.org/Extended_Position_Description#Operations), but maybe it could be supported nonetheless.

(2) Mhh ... I guess it makes sense to skip to the end of the game, after the first error. Will consider doing that. It doesn't really explain the full crash, though. If only legal moves are picked, it should be impossible to get into an invalid state.

niklasf commented 5 years ago

(2) Now skipping variations with illegal moves (1336012da9ded50965b1ebed2f1d7cdf2a49b30e).

niklasf commented 5 years ago

(6) ... and more relaxed handling of moves in EPDs (e9738d8899d10c137de3611262edb91c6e73254f).