Disservin / chess-library

C++ chess library
https://disservin.github.io/chess-library/
MIT License
71 stars 22 forks source link

fix two PGN parsing bugs #36

Closed robertnurnberg closed 10 months ago

robertnurnberg commented 10 months ago

One bug was discovered by @vondele and relates to '\n' not finishing the reading of a move.

The other bug relates to 0-0 not always being recognized as a castling move, because 0 is not a letter. I fixed this with a bit of a hack: we now check also for - to start a move, and then retrospectively add the missing first character to the move before we send it back.

Some testing would be good.

robertnurnberg commented 10 months ago

Just realized, this may lead to a crash when trying to access move[1] for corrupted pgns with things like 4. - - 5.. Let me know if I should fix this by adding a test on length of move.

I have pushed a fix for this now. Better safe than sorry, and the extra check is only performed for legal moves starting with 0-.

Disservin commented 10 months ago

Thanks ;) I think at some point I anyway need to slightly refactor that code so also be able to parse variations and a better way to skip the movenumbers

robertnurnberg commented 10 months ago

yw :)

You may want to keep pgn examples with line breaks in move list and also with 0-0 castling moves for non-functionality checks for future refactorings. :)

Disservin commented 10 months ago

currently writing test for those and just realized this also ignores the last move

[Event "Batch 10: s20red4c4_t3 vs master"]
[Site "https://tests.stockfishchess.org/tests/view/62df67e48e4fa6ae47266770"]
[Date "2022.07.26"]
[Round "1"]
[White "New-cfe8ce842c"]
[Black "Base-c4a644922d"]
[Result "1/2-1/2"]
[FEN "rn1qk2r/pbpp2pp/4pn2/3P1p2/1bP5/2N2NP1/1P2PP1P/R1BQKB1R w KQkq - 0 1"]
[GameDuration "00:00:34"]
[GameEndTime "2022-07-26T12:12:29.970 CST"]
[GameStartTime "2022-07-26T12:11:55.040 CST"]
[PlyCount "130"]
[SetUp "1"]
[TimeControl "10.931+0.109"]

1. Bg2 O-O 2. O-O
a5 3. Nd4 Na6
Disservin commented 10 months ago

and meh, just realized that because of this it will also try to add 1/2-1/2 as a move (the move would actually be 1-1/2)

robertnurnberg commented 10 months ago

Oh, sorry. Do you have an example pgn where this causes a problem?

Disservin commented 10 months ago
[Event "Batch 10: s20red4c4_t3 vs master"]
[Site "https://tests.stockfishchess.org/tests/view/62df67e48e4fa6ae47266770"]
[Date "2022.07.26"]
[Round "1"]
[White "New-cfe8ce842c"]
[Black "Base-c4a644922d"]
[Result "1/2-1/2"]
[FEN "rn1qk2r/pbpp2pp/4pn2/3P1p2/1bP5/2N2NP1/1P2PP1P/R1BQKB1R w KQkq - 0 1"]
[GameDuration "00:00:34"]
[GameEndTime "2022-07-26T12:12:29.970 CST"]
[GameStartTime "2022-07-26T12:11:55.040 CST"]
[PlyCount "130"]
[SetUp "1"]
[TimeControl "10.931+0.109"]

1. Bg2 {+1.55/16 0.70s} O-O {-1.36/18 0.78s} 2. O-O {+1.84/16 0.42s}
a5 {-1.30/16 0.16s} 3. Nd4 {+1.53/18 0.83s} Na6 {-1.36/16 0.23s}
4. Qc2 {+1.78/16 0.24s} Qe8 {-1.33/16 0.54s} 5. Bd2 {+1.64/20 1.6s}
h6 {-1.51/16 0.75s} 6. Nb3 {+1.76/19 0.88s} exd5 {-1.46/15 0.17s}
7. Nxa5 {+2.07/17 0.34s} Bxa5 {-1.37/18 0.22s} 8. Rxa5 {+2.00/17 0.17s}
Nb4 {-1.50/17 0.42s} 9. Qa4 {+1.83/17 0.23s} c5 {-1.75/17 0.31s}
10. cxd5 {+1.95/16 0.17s} d6 {-1.55/16 0.25s} 11. Rxa8 {+1.87/18 0.29s}
Bxa8 {-1.55/16 0.19s} 12. Qb3 {+1.91/15 0.20s} Qf7 {-1.39/15 0.20s}
13. Bf4 {+1.87/16 0.43s} Rd8 {-1.27/14 0.29s} 14. Rd1 {+1.75/15 0.19s}
g5 {-1.44/15 0.42s} 15. Bd2 {+1.88/17 0.57s} Kg7 {-1.45/15 0.41s}
16. Qa4 {+1.86/17 1.3s} Bb7 {-1.44/14 0.23s} 17. e4 {+1.59/19 1.9s}
Ra8 {-1.50/24 1.9s} 18. Qb5 {+1.96/15 0.10s} fxe4 {-1.27/18 0.30s}
19. Nxe4 {+1.76/17 0.30s} Qxd5 {-1.38/17 0.20s} 20. Qe2 {+1.87/16 0.20s}
Nxe4 {-1.50/18 0.27s} 21. Bxb4 {+1.84/16 0.25s} Qf5 {-1.36/18 0.28s}
22. g4 {+1.76/15 0.21s} Qe5 {-1.49/16 0.24s} 23. Be1 {+1.70/17 0.43s}
Bc6 {-1.05/15 0.21s} 24. f3 {+2.34/17 0.51s} Ba4 {-1.32/17 0.53s}
25. Rc1 {+1.96/18 0.35s} Qd4+ {-1.66/21 1.2s} 26. Kh1 {+1.80/15 0.33s}
Nf6 {-1.71/16 0.17s} 27. Bc3 {+1.95/18 0.55s} Qf4 {-1.74/18 0.19s}
28. Qe7+ {+1.87/14 0.18s} Kg8 {-1.49/18 0.26s} 29. Ra1 {+1.87/11 0.033s}
Rf8 {-1.44/16 0.24s} 30. Qe6+ {+1.79/12 0.055s} Kg7 {-1.43/16 0.39s}
31. h3 {+2.04/15 0.21s} Rf7 {-1.58/19 0.41s} 32. Re1 {+1.94/13 0.033s}
Bd7 {-1.45/17 0.17s} 33. Qd5 {+1.86/14 0.092s} Kf8 {-1.49/17 0.17s}
34. Qa8+ {+2.37/12 0.037s} Ne8 {-1.46/17 0.20s} 35. Qd8 {+2.57/13 0.043s}
Bc6 {-1.56/17 0.27s} 36. Kg1 {+2.37/15 0.12s} Qg3 {-1.67/16 0.42s}
37. Re6 {+2.48/14 0.064s} Ra7 {-1.25/16 0.14s} 38. Be1 {+2.38/14 0.13s}
Qf4 {-1.35/17 0.17s} 39. Bc3 {+2.00/16 0.36s} Qg3 {-1.43/17 0.28s}
40. Be1 {+2.19/14 0.087s} Qf4 {-0.97/17 0.14s} 41. Qb6 {+2.35/12 0.035s}
Qf7 {-0.07/16 0.15s} 42. Re2 {+0.27/15 0.24s} Qb7 {0.00/21 0.32s}
43. Qd8 {+0.09/12 0.062s} Qc7 {0.00/19 0.087s} 44. Qxc7 {+0.06/15 0.11s}
Nxc7 {0.00/15 0.090s} 45. Bg3 {+0.05/14 0.024s} Ra1+ {0.00/16 0.10s}
46. Kh2 {0.00/15 0.097s} Rd1 {0.00/20 0.19s} 47. f4 {0.00/17 0.026s}
Bxg2 {+0.12/19 0.14s} 48. Kxg2 {0.00/18 0.056s} Kf7 {0.00/18 0.21s}
49. f5 {0.00/17 0.074s} Rd3 {0.00/20 0.23s} 50. Kf2 {0.00/20 0.14s}
Nb5 {0.00/20 0.055s} 51. Re6 {+0.07/18 0.073s} h5 {+0.13/15 0.12s}
52. gxh5 {+0.09/13 0.062s} Nd4 {+0.06/14 0.15s} 53. Rg6 {+0.07/14 0.10s}
Nxf5 {+0.06/15 0.087s} 54. Rxg5 {0.00/15 0.053s} Nxg3 {0.00/15 0.11s}
55. Rxg3 {0.00/15 0.12s} Rxg3 {+0.04/21 0.17s} 56. Kxg3 {0.00/23 0.078s}
d5 {0.00/21 0.055s} 57. Kf4 {0.00/27 0.095s} Kg7 {+0.01/23 0.10s}
58. Ke5 {0.00/27 0.093s} d4 {0.00/27 0.085s} 59. Ke4 {0.00/29 0.097s}
Kh6 {+0.01/28 0.097s} 60. b4 {0.00/31 0.11s} cxb4 {0.00/31 0.11s}
61. Kxd4 {0.00/35 0.11s} b3 {0.00/33 0.092s} 62. Kc3 {0.00/35 0.094s}
b2 {0.00/34 0.075s} 63. Kxb2 {0.00/37 0.078s} Kxh5 {0.00/35 0.084s}
64. Kb3 {0.00/37 0.088s} Kh4 {0.00/39 0.083s} 65. Kb4 {0.00/39 0.098s}
Kxh3 {0.00/40 0.099s, Draw by insufficient mating material} 1/2-1/2
Disservin commented 10 months ago

I just pushed one fix which made sure that the move() function is always called for the last move, which would happen for a pgn like this

1. Bg2 O-O 2. O-O
a5 3. Nd4 Na6

and also added some more unit tests (can be run via .\out.exe -ts="PGN StreamParser")

Disservin commented 10 months ago

marked the test which fails with @TODO fix 1/2-1/2 being interpreted as a move and if you just add std::cout << move << " comment:" << comment << std::endl; into the move function for the visitor you can see it being printed.

robertnurnberg commented 10 months ago

Ok, so this needs to be fixed before the version can be bumped in wdl repo and @vondele 's fastpopular. agreed?

Disservin commented 10 months ago

i think so

robertnurnberg commented 10 months ago

I think this would be a nontrivial change. You could introduce a new state read_move_number or something. A design decision, more or less. So I will leave that to you. :)

robertnurnberg commented 10 months ago

Actually, the easiest fix might be to store the previous character in a variable, and then only act on a hyphen if the previous character was a 0. (O are already caught by the notaletter logic.) I might cook something up tomorrow.

Disservin commented 10 months ago

What about 0-1 ?