Vast342 / Clarity

UCI chess engine with NNUE evaluation
GNU General Public License v3.0

v5.0.0 illegal move #46

Closed: tissatussa closed this issue 9 months ago

tissatussa commented 9 months ago

in a 900+3 game Clarity v5.0.0 failed to win for no apparent reason, although it was simply winning : it couldn't mate with just one Rook !?

[screenshot: clarity-5-0-0-illegal-move]

here's the game :

[Event "engine vs engine"]
[Site "Holland"]
[Date "2024.02.17"]
[Round "?"]
[White "Winter v2.04b NN"]
[Black "Clarity v5.0.0 NNUE"]
[Result "1/2-1/2"]
[Termination "illegal move"]
[TimeControl "900+3"]
[Opening "Sicilian"]
[ECO "B23"]
[Variation "Closed, 2...Nc6"]

  1. e4 {+0.17/24 43s} c5 {-0.10/31 45s} 2. Nc3 {+0.21/24 49s} Nc6 {-0.08/33 51s} 3. Nf3 {+0.23/22 47s} e5 {-0.20/35 55s} 4. Bc4 {+0.24/22 26s} d6 {-0.23/34 48s} 5. O-O {+0.24/25 31s} Be7 {-0.15/34 33s} 6. Nd5 {+0.22/22 42s} Nf6 {-0.19/35 38s} 7. Re1 {+0.18/21 35s} Nxd5 {-0.14/34 40s} 8. exd5 {+0.25/21 21s} Nd4 {-0.08/33 24s} 9. Nxd4 {+0.16/23 37s} cxd4 {-0.08/30 19s} 10. a4 {+0.14/22 23s} Qc7 {0.00/32 19s} 11. b3 {+0.16/21 24s} O-O {0.00/34 21s} 12. c3 {+0.27/21 28s} dxc3 {0.00/35 67s} 13. dxc3 {+0.18/20 18s} Bd7 {-0.06/30 16s} 14. Bd2 {+0.23/21 18s} Bf6 {0.00/35 82s} 15. Bb5 {+0.31/20 22s} Bd8 {-0.14/32 16s} 16. Bxd7 {+0.23/22 28s} Qxd7 {-0.11/32 13s} 17. a5 {+0.20/22 17s} Rc8 {-0.16/31 23s} 18. c4 {+0.29/20 27s} f5 {-0.24/31 19s} 19. h3 {+0.34/21 14s} h6 {-0.11/26 15s} 20. Ra2 {+0.31/19 25s} Bh4 {-0.18/28 23s} 21. Qh5 {+0.33/19 24s} Be7 {-0.18/29 19s} 22. Qg6 {+0.44/21 19s} Rf6 {0.00/40 82s} 23. Qg3 {+0.28/20 11s} Kh7 {-0.13/30 13s} 24. Qd3 {+0.29/20 22s} Rg6 {0.00/32 7.2s} 25. b4 {+0.18/19 13s} Bh4 {+0.02/28 11s} 26. Rc2 {+0.34/20 18s} e4 {-0.14/27 17s} 27. Qb3 {+0.45/19 19s} b6 {-0.01/28 17s} 28. axb6 {+0.30/18 10s} axb6 {-0.16/25 5.4s} 29. Bf4 {+0.24/18 18s} Re8 {0.00/30 9.8s} 30. Ra2 {+0.27/17 12s} Bg5 {0.00/26 7.5s} 31. Bxg5 {+0.24/18 17s} Rxg5 {0.00/24 6.3s} 32. Rae2 {+0.22/18 16s} Qf7 {0.00/29 7.4s} 33. f4 {+0.12/19 11s} Rg6 {0.00/31 16s} 34. Rc2 {+0.12/21 15s} Ra8 {+0.31/25 5.9s} 35. Kh2 {0.00/24 8.6s} Qe7 {+0.15/30 10s} 36. Rg1 {0.00/23 8.0s} Qa7 {+0.29/25 5.3s} 37. Qc3 {0.00/25 7.2s} Qa3 {+0.20/25 4.6s} 38. g4 {-0.16/21 9.6s} Rf6 {+0.19/26 5.1s} 39. Re1 {-0.14/21 8.5s} Kg6 {+0.34/25 6.3s} 40. Qd4 {-0.33/20 13s} Qd3 {+0.43/24 6.2s} 41. Rd2 {-0.57/20 8.9s} Qxd4 {+0.34/21 4.5s} 42. Rxd4 {-0.57/22 11s} Ra2+ {+0.29/24 4.7s} 43. Kg3 {-0.40/22 11s} Ra3+ {+0.28/21 4.4s} 44. Kg2 {-0.42/22 11s} h5 {+0.62/22 4.4s} 45. gxf5+ {-0.34/21 5.9s} Kxf5 {+0.90/18 3.5s} 46. Rexe4 {-0.15/20 6.1s} Rg6+ {+0.54/19 3.6s} 47. Kf2 {-0.08/23 6.2s} Rxh3 {+0.80/22 4.8s} 48. 
Re8 {-0.68/20 7.8s} Rh2+ {+0.71/24 4.7s} 49. Kf3 {-0.29/20 5.3s} Rg1 {+0.65/23 5.4s} 50. Rf8+ {-0.97/20 7.8s} Kg6 {+0.54/22 5.3s} 51. f5+ {-0.90/20 9.7s} Kg5 {+0.73/18 3.9s} 52. Rf7 {-0.82/23 5.6s} Rb1 {+0.88/21 5.5s} 53. Rd3 {-1.43/21 9.0s} Rxb4 {+1.10/23 5.7s} 54. Rxg7+ {-2.21/20 4.7s} Kxf5 {+1.27/20 4.6s} 55. Kg3 {-2.14/22 8.8s} Rbb2 {+1.33/23 3.9s} 56. Rf3+ {-1.85/22 6.7s} Ke5 {+1.26/22 3.7s} 57. Re7+ {-2.17/21 4.2s} Kd4 {+1.27/22 3.9s} 58. Rd7 {-1.80/21 8.3s} Rhg2+ {+1.15/24 3.7s} 59. Kh4 {-1.75/17 5.1s} Rg4+ {+1.42/21 6.4s} 60. Kxh5 {-2.77/20 7.9s} Rg8 {+1.44/25 5.4s} 61. Rf4+ {-2.61/22 7.6s} Ke5 {+1.61/22 3.1s} 62. Rh4 {-2.74/27 4.7s} b5 {+1.66/25 3.2s} 63. Kh6 {-3.92/20 7.3s} Rh8+ {+1.68/26 5.3s} 64. Rh7 {-4.78/23 4.5s} Rxh7+ {+1.66/26 3.5s} 65. Kxh7 {-5.12/25 4.8s} b4 {+1.65/25 3.0s} 66. Rh6 {-5.23/25 6.3s} Rd2 {+1.55/24 3.2s} 67. Re6+ {-5.53/27 6.7s} Kd4 {+1.72/23 3.0s} 68. Rxd6 {-5.56/30 4.0s} Kxc4 {+1.74/23 2.8s} 69. Rc6+ {-5.56/30 3.3s} Kxd5 {+1.75/24 4.1s} 70. Rc7 {-5.58/28 4.2s} Kd4 {+1.79/26 3.1s} 71. Rb7 {-6.00/28 3.4s} Kc3 {+1.83/25 2.8s} 72. Kg6 {-6.00/31 6.4s} b3 {+1.89/27 2.9s} 73. Kf5 {-6.00/30 3.5s} Rd4 {+1.85/33 5.1s} 74. Ke5 {-6.13/29 5.4s} Rb4 {+1.80/39 5.3s} 75. Rc7+ {-6.17/28 3.9s} Kd2 {+1.74/32 3.5s} 76. Rd7+ {-6.25/28 3.4s} Ke2 {+1.73/36 5.4s} 77. Rh7 {-6.98/28 3.1s} b2 {+1.71/8 0.017s} 78. Rh2+ {-7.02/24 6.0s} Ke3 {+1.65/34 3.6s} 79. Rh3+ {-17.68/19 5.8s} Kf2 {+1.65/41 4.2s} 80. Rh1 {-M182/22 5.1s} Rb8 {+1.63/49 14s} 81. Rb1 {-6.98/26 5.6s} Ke3 {+1.63/32 2.9s} 82. Kf5 {-17.68/21 3.9s} Kd3 {+1.64/35 5.4s} 83. Ke6 {-17.68/20 3.3s} Kc2 {+1.69/40 7.9s} 84. Rh1 {-M168/22 3.5s} b1=Q {+1.70/27 5.7s} 85. Rxb1 {-M36/28 2.8s} Rxb1 {+1.77/34 2.3s} 86. Kd5 {-M28/32 2.7s} Kd3 {+1.75/30 2.8s} 87. Ke5 {-M26/35 2.7s} Ke3 {+1.73/31 2.6s} 88. Kd5 {-M26/37 3.0s} Rf1 {+1.79/34 5.3s} 89. Kc5 {-M24/40 3.3s} Kf4 {+M33/37 4.2s} 90. Kd4 {-M24/43 3.2s} Rb1 {+1.73/32 3.9s} 91. Kc5 {-M24/43 2.7s} Ke4 {+1.74/33 5.3s} 92. Kd6 {-M22/45 3.0s} Rc1 {+M37/40 2.1s} 93. 
Ke6 {-M22/47 2.7s} Kf4 {+M47/44 4.8s} 94. Kd6 {-M24/46 2.8s} Kf3 {+M43/36 4.8s} 95. Kd5 {-M26/47 2.8s} Rh1 {+1.59/37 3.3s} 96. Kc6 {-M24/47 2.9s} Rd1 {+M37/37 3.1s} 97. Kb5 {-M22/50 3.3s} Rc1 {+M49/43 3.4s} 98. Kb4 {-M20/52 3.4s} Ke4 {+M19/46 2.9s} 99. Kb5 {-M18/53 3.2s} Kd4 {+M9/52 1.9s} 100. Kb6 {-M16/53 2.9s} Kc4 {+M11/35 11s} 101. Kc6 {-M16/54 3.9s} Rd1 {0.00/4 7.2s} 102. Kb6 {-M14/56 3.8s} Kb4 {0.00/4 5.1s} 103. Kc6 {-M14/55 3.9s} Rd2 {0.00/4 4.1s} 104. Kb6 {-M12/57 2.9s, makes an illegal move: a1a1} 1/2-1/2
Vast342 commented 9 months ago

Do you have any more information? logs or something like that? Trying that position locally it simply alternates between finding mate in 5-8 moves.

tissatussa commented 9 months ago

Do you have any more information?

i compiled your v5.0.0 source on Linux Xubuntu 22.04. i set 128 MB Hash and 2 threads. look at the moves it played after capturing the last White piece (Rook) with 85...Rxb1, which reached this position : 8/8/4K3/8/8/8/2k5/1r6 w - - 0 86 .. from here, Clarity v5.0.0 doesn't seem to have a clear mating plan .. according to SF 16 that position is mate-in-15, so with optimal play the game should have ended by move 100 at the latest .. my feeling is that its logic gets 'confused' when so few pieces are left on the board .. in general it has never had any other error or illegal move ..
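
For anyone who wants to retry the position with the same settings, here is a minimal sketch that builds the UCI command sequence. The option names `Hash` and `Threads` are the conventional UCI names and are assumed to be what Clarity exposes; the 5000 ms `movetime` is an arbitrary choice, and `uci_repro_commands` is a hypothetical helper, not part of any engine or GUI.

```python
def uci_repro_commands(fen, hash_mb, threads, movetime_ms):
    """Build the UCI commands needed to analyse a single position."""
    return [
        "uci",
        f"setoption name Hash value {hash_mb}",      # assumed option name
        f"setoption name Threads value {threads}",   # assumed option name
        "isready",
        "ucinewgame",
        f"position fen {fen}",
        f"go movetime {movetime_ms}",
    ]


# Position after 85...Rxb1, copied from the report above.
FEN_AFTER_85_RXB1 = "8/8/4K3/8/8/8/2k5/1r6 w - - 0 86"

for line in uci_repro_commands(FEN_AFTER_85_RXB1, hash_mb=128, threads=2, movetime_ms=5000):
    print(line)
```

Feeding these lines to a freshly started engine process starts from empty tables, so it may not reproduce a fault that depends on state built up earlier in a game.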

logs or something like that?

no log preserved .. i used CuteChess, its log output contains just the UCI strings sent by the engines .. when CuteChess gets 'bestmove a1a1' it shows the message "makes an illegal move: a1a1", there's no crash, CuteChess just decides to end the game (and declares a draw in this case) .. maybe if i had started CuteChess from a terminal (which i didn't) i could have seen more info lines or error output (non-UCI strings) from Clarity .. i have had that experience with other engines (in other cases). Pity ..

Trying that position locally it simply alternates between finding mate in 5-8 moves.

sure .. this bug (?) will be hard to trace ..

Vast342 commented 9 months ago

The move history in the screenshot and the PGN you sent suggests that something happened a few moves earlier that may have caused it to get stuck, since for its last moves it reports a search depth of only 4 and a score of 0.00 (a draw).

Vast342 commented 9 months ago

My current theory is that something happened after move 100 that led to an improper search being performed, and a few depths later it ran out of moves in the Transposition Table and then output the null move (a1a1 is how I store null moves internally).
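
To illustrate why an internal null move can surface as `a1a1`, here is a small sketch. The bit layout (from-square in the low 6 bits, to-square in the next 6, with a1 = 0) is an assumption chosen for illustration and is not taken from Clarity's source; under that layout an all-zero move value prints as `a1a1`.

```python
FILES = "abcdefgh"

NULL_MOVE = 0  # all bits zero: from-square 0 and to-square 0 (assumed layout)


def square_name(sq):
    """Map 0..63 to 'a1'..'h8', assuming a1 = 0 and h8 = 63."""
    return FILES[sq % 8] + str(sq // 8 + 1)


def encode_move(from_sq, to_sq):
    """Pack a move into an integer: bits 0-5 = from, bits 6-11 = to."""
    return from_sq | (to_sq << 6)


def move_to_uci(move):
    """Convert a packed move to UCI coordinate notation."""
    from_sq = move & 0x3F
    to_sq = (move >> 6) & 0x3F
    return square_name(from_sq) + square_name(to_sq)


print(move_to_uci(NULL_MOVE))              # a1a1
print(move_to_uci(encode_move(12, 28)))    # e2e4
```

With such an encoding, any code path that returns an uninitialised or "no move" value ends up printing `a1a1` unless it is caught before output.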

tissatussa commented 9 months ago

good thinking i guess .. i sometimes see other engines output 'a1a1' as well, probably the same kind of coding .. yes, that TT / Hash can be buggy, and hard to detect / prove .. it happens AFTER many moves, then the TT might get corrupted ? HTH


Vast342 commented 9 months ago

What is your CPU / How many nodes per second were the 2 engines getting during the game above? I'm considering trying to recreate this game locally with the 2 engines but I'm not sure if it would be similar enough.

Vast342 commented 9 months ago

I'm also not sure if I can obtain Winter 2.04b

tissatussa commented 9 months ago

..I'm also not sure if I can obtain Winter 2.04b..

i think that's beyond the problem .. it could have been any other engine, it can happen after 85 moves or so, with a full (?) TT with some incorrect logic. In the meantime i may do more matches to see if such error ever happens again .. not sure .. could you think of rewriting some part of the TT logic ? i have no clue ..

How many nodes per second were the 2 engines getting during the game above?

sorry, i didn't mention .. i didn't expect it to happen either ..

$ neofetch

           `-/osyhddddhyso/-`              roelof@roelof-HP-Elite-x2-1012-G2 
        .+yddddddddddddddddddy+.           --------------------------------- 
      :yddddddddddddddddddddddddy:         OS: Xubuntu 22.04.2 LTS x86_64 
    -yddddddddddddddddddddhdddddddy-       Host: HP Elite x2 1012 G2 
   odddddddddddyshdddddddh`dddd+ydddo      Kernel: 5.15.0-71-generic 
 `yddddddhshdd-   ydddddd+`ddh.:dddddy`    Uptime: 4 days, 2 hours, 1 min 
 sddddddy   /d.   :dddddd-:dy`-ddddddds    Packages: 3064 (dpkg), 15 (snap) 
:ddddddds    /+   .dddddd`yy`:ddddddddd:   Shell: bash 5.1.16 
sdddddddd`    .    .-:/+ssdyodddddddddds   Resolution: 1920x1080, 1920x1080 
ddddddddy                  `:ohddddddddd   DE: Xfce 4.16 
dddddddd.                      +dddddddd   WM: Xfwm4 
sddddddy                        ydddddds   WM Theme: Default 
:dddddd+                      .oddddddd:   Theme: Greybird [GTK2/3] 
 sdddddo                   ./ydddddddds    Icons: elementary-xfce-darker [GTK2/3] 
 `yddddd.              `:ohddddddddddy`    Terminal: xfce4-terminal 
   oddddh/`      `.:+shdddddddddddddo      Terminal Font: DejaVu Sans Mono 9 
    -ydddddhyssyhdddddddddddddddddy-       CPU: Intel i5-7200U (4) @ 3.100GHz 
      :yddddddddddddddddddddddddy:         GPU: Intel HD Graphics 620 
        .+yddddddddddddddddddy+.           Memory: 5651MiB / 7828MiB 
           `-/osyhddddhyso/-`
Vast342 commented 9 months ago

I have a few instances of this issue occurring in the past, however I had assumed they were due to my CPU being fully utilised at the time since they were all coincidentally while I was compiling. That does not seem to be the case though. As of now I am unable to reproduce the error, nor do I have any idea what is causing it yet.

Vast342 commented 9 months ago

When playing through your example game and the ones I have on record, Clarity doesn't seem to agree with most of the moves played there, nor did it agree with the scores. This has led me to believe that some part of it is a remnant of a previous game. Through that, I noticed that one of my history tables wasn't getting cleared properly in between games. Thank you for bringing this issue to my attention.
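
A minimal sketch of the kind of reset described here, assuming the engine keeps a transposition table and a history table indexed as `[side][from][to]`. The class, field names and table layout are hypothetical; only the `ucinewgame` command comes from the UCI protocol.

```python
class SearchState:
    """Per-engine search state that must not leak from one game to the next."""

    def __init__(self):
        # history[side][from_sq][to_sq]: move-ordering scores (assumed layout)
        self.history = [[[0] * 64 for _ in range(64)] for _ in range(2)]
        # transposition table: position hash -> stored entry
        self.tt = {}

    def new_game(self):
        """Clear every table that carries information between searches."""
        self.tt.clear()
        for side in self.history:
            for row in side:
                for to_sq in range(64):
                    row[to_sq] = 0


def handle_command(state, line):
    """Dispatch the one UCI command relevant to this sketch."""
    if line.strip() == "ucinewgame":
        state.new_game()


state = SearchState()
state.history[0][12][28] = 500      # pretend an earlier game left a score behind
state.tt[0xABCDEF] = "e2e4"
handle_command(state, "ucinewgame")
print(state.history[0][12][28], len(state.tt))
```

Keeping all such tables behind a single `new_game` method makes it harder to forget one of them when a new table is added later.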

Vast342 commented 9 months ago

I have now created a fix branch and committed my 2 fix ideas, and I am going to leave a test of that branch against itself running overnight to see if there are any illegal moves. If there are none by the time 10000 games have been completed, I am going to deem this issue fixed until there is evidence otherwise.

tissatussa commented 9 months ago

..This has led me to believe that some part of it is a remnant of a previous game..

that's also my feeling .. as i stated : yes, that TT / Hash can be buggy, and hard to detect / prove .. it happens AFTER many moves, then the TT might get corrupted ? In the meantime i let Clarity v5.0.0 play a few more games (15min+3bonus) against several engines and i had no troubles of any kind so far ..

tissatussa commented 9 months ago

..Through that, I noticed that one of my history tables wasn't getting cleared properly in between games..

yes, such things .. good luck finding the bug !

Vast342 commented 9 months ago

In the meantime i let Clarity v5.0.0 play a few more games (15min+3bonus) against several engines and i had no troubles of any kind so far ..

yeah, as always these bugs happen very rarely and with a lot of time in between them.

as of right now the test is at 8944 games out of 10000 and there have yet to be illegal moves.

Vast342 commented 9 months ago

One of my fixes is a backup in case a null move gets returned by the engine. However, since that only covers up the problem rather than fixing it, I am going to do another 10000 game test without it just to make sure that the issue has been fixed fully.
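
A backup of this kind could look roughly like the sketch below. It is not Clarity's code: `choose_bestmove` is a hypothetical helper, moves are plain UCI strings, and checking the result against the legal move list is an added assumption that goes slightly beyond catching only the null move.

```python
NULL_MOVE = "a1a1"  # how the null move appears in UCI notation in this thread


def choose_bestmove(search_result, legal_moves):
    """Return the search result, or a fallback if it is null or not legal.

    Returns None when there is no legal move at all (mate or stalemate).
    """
    if search_result != NULL_MOVE and search_result in legal_moves:
        return search_result
    if legal_moves:
        return legal_moves[0]   # any legal move beats forfeiting the game
    return None


# Illustrative move list; any non-empty list of legal moves behaves the same.
legal = ["d2d6", "b4c4"]
print(choose_bestmove("a1a1", legal))   # falls back to the first legal move
print(choose_bestmove("b4c4", legal))   # a valid result passes through
```

As noted above, this only hides the symptom, which is why testing without the backup is still needed to confirm the underlying cause is gone.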

tissatussa commented 9 months ago

..Through that, I noticed that one of my history tables wasn't getting cleared properly in between games..

so you did correct a code part ? or maybe you found a severe flaw ?

Vast342 commented 9 months ago

I have corrected that, as well as adding a limiter to a feature that is known to cause search size explosions.

Vast342 commented 9 months ago

Both 10000 game tests have finished with no illegal moves, so I am deeming these issues fixed until I discover evidence otherwise.
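
As a rough guide to what a clean 10000 game run can and cannot show, the sketch below computes the largest per-game failure probability that is still consistent with zero failures at a given confidence level. It assumes the games are independent and that each game fails with the same probability, which is a simplification; `failure_rate_upper_bound` is a hypothetical helper name.

```python
def failure_rate_upper_bound(games, confidence=0.95):
    """Upper confidence bound on the per-game failure probability
    after `games` independent games with zero failures.

    Solves (1 - p) ** games = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / games)


bound = failure_rate_upper_bound(10_000)
print(f"95% upper bound on failure rate: {bound:.6f} per game")
print(f"roughly one failure per {1 / bound:.0f} games cannot be ruled out")
```

For 10000 games the bound is close to 3/10000 (the "rule of three"), so a clean run limits how often the bug can occur but does not prove it is gone.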

Vast342 commented 9 months ago

The actual cause of this issue has now been found, and it appears to be that the GUI gave Clarity a negative amount of time to search, and as such Clarity just returned a null move after doing no search at all. I can now confirm that that has been fixed.
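
One way to guard against a negative clock is sketched below, under stated assumptions: the allocation formula (1/20 of the remaining time plus half the increment) is a made-up placeholder, and both function names are hypothetical. The two points being illustrated are clamping the reported time at zero and always completing a depth 1 search before looking at the clock, so that some real move is available.

```python
import time


def search_budget_ms(time_left_ms, increment_ms, floor_ms=1):
    """Time to spend on this move, never zero or negative."""
    usable = max(time_left_ms, 0)                 # treat a negative clock as empty
    budget = usable // 20 + increment_ms // 2     # placeholder allocation formula
    return max(budget, floor_ms)


def iterative_deepening(search_depth, budget_ms, max_depth=64):
    """Run search_depth(1), search_depth(2), ... until the budget is used up.

    `search_depth` is a callable returning the best move found at that depth.
    Depth 1 is always completed, even if the budget has already expired.
    """
    deadline = time.monotonic() + budget_ms / 1000.0
    best = search_depth(1)
    for depth in range(2, max_depth + 1):
        if time.monotonic() >= deadline:
            break
        best = search_depth(depth)
    return best


print(search_budget_ms(-250, 3000))   # negative clock, 3 s increment
print(search_budget_ms(-250, 0))      # negative clock, no increment
```

With the 900+3 time control from the report, the increment alone would still give such a scheme a usable budget when the GUI reports a negative remaining time.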

tissatussa commented 9 months ago

..the GUI gave Clarity a negative amount of time to search..

that's weird ! how did you find out ? You must have been thinking out-of-the-box, i'd never have thought of this .. as you noticed, i gave 3 seconds bonus time, so how could the time be negative ? Is this a bug in (the GUI of) CuteChess ?