Closed tissatussa closed 1 year ago
(indeed) it's all about Pawns.
did you know in some South-East European countries the chess game had names for every pawn? Eg. Armeta, Bertolo, Christoph .. like characters in a play, relating to each other .. and tales and teachings existed .. eg. Pawn uses Christoph often like i prefer him : move only when needed or useful, it serves a good role on its starting square c2/c7 already .. Pawn also tends to not-capture pieces, only when needed or useful .. I have no clue how your code is done, i'm a programmer but no C++ .. i read it though .. i find those SF PSTs and wonder : can they reflect Pawn roles like this ?
another great result : Pawn just defeated Velvet, a strong 3285 engine ! What's happening ?
see https://lichess.org/7XpSRzr9
[Event "Pawn vs engine"]
[Site "Holland"]
[Date "2022.11.04"]
[Round "?"]
[White "Pawn 221103"]
[Black "Velvet v4.1.0"]
[WhiteElo "2700"]
[BlackElo "3285"]
[Result "1-0"]
[ECO "C69"]
[GameDuration "00:27:54"]
[Opening "Ruy Lopez"]
[PlyCount "171"]
[TimeControl "600+3"]
[Variation "Exchange Variation , 5.O-O"]
1. e4 {+0.31/24 26s} e5 {-0.19/21 19s} 2. Nf3 {+0.21/22 17s}
Nc6 {-0.20/20 0.38s} 3. Bb5 {+0.29/25 44s} a6 4. Bxc6 {+0.32/26 41s}
dxc6 {-0.16/22 22s} 5. O-O {+0.29/27 38s} Bg4 6. h3 {+0.26/26 36s} Bh5
7. g4 {+0.38/23 21s} Bg6 {-0.08/23 26s} 8. Nxe5 {+0.37/26 33s} f6
9. Nxg6 {+0.73/21 13s} hxg6 {+0.03/22 7.9s} 10. Qf3 {+0.58/24 30s}
Qd7 {+0.06/22 0.005s} 11. Rd1 {+0.64/24 28s} f5 {-0.20/22 46s}
12. d4 {+0.71/21 16s} O-O-O {-0.18/23 7.5s} 13. Nc3 {+0.72/25 26s} Nh6
14. Bxh6 {+0.61/21 8.4s} Rxh6 {-0.36/22 7.4s} 15. Rd3 {+0.72/24 24s} Bb4
16. gxf5 {+0.76/22 22s} Kb8 {-0.34/24 53s} 17. fxg6 {+0.66/22 12s}
Rxg6+ {+0.39/20 17s} 18. Kh2 {+0.44/28 20s} Rf8 {/23 0.43s}
19. Qe3 {+0.31/26 19s} Rgf6 {+0.69/23 18s} 20. Rf1 {+0.68/23 9.3s}
Rf3 {+0.45/24 10s} 21. Qxf3 {+0.83/24 5.2s} Rxf3 {+0.73/23 9.3s}
22. Rxf3 {+0.78/27 18s} Qxd4 {/24 2.2s} 23. Rd3 {+0.72/29 17s}
Qc4 {+0.20/20 11s} 24. Re1 {+0.70/23 4.5s} Qc5 {+0.26/22 4.9s}
25. Kg2 {+0.65/26 16s} Qe5 {/25 3.9s} 26. a3 {+0.52/25 15s} Bd6 {+0.80/24 19s}
27. Kf1 {+0.47/25 14s} b5 {+0.73/24 19s} 28. Ree3 {+0.46/28 13s}
a5 {+0.68/22 0.65s} 29. Re1 {+0.45/27 13s} Kb7 {+0.78/21 12s}
30. Rde3 {+0.42/28 12s} b4 {+0.89/24 18s} 31. axb4 {+0.44/23 5.2s}
axb4 {+0.67/21 4.9s} 32. Nd1 {+0.39/27 11s} g6 {+0.52/24 41s}
33. b3 {+0.34/24 5.4s} Bc5 {+0.55/22 10s} 34. Rg3 {+0.71/21 3.3s}
Bd4 {0.00/25 41s} 35. Kg2 {+1.07/22 3.3s} Qf4 {0.00/25 10s}
36. Re2 {+1.39/24 5.0s} Be5 {0.00/28 8.4s} 37. Ree3 {+1.42/28 10s}
g5 {0.00/26 32s} 38. Rg4 {+1.55/26 7.0s} Qf6 {0.00/27 2.2s}
39. Re1 {+1.54/28 9.6s} Ka6 {0.00/27 9.6s} 40. Ne3 {+1.70/26 6.6s}
Bf4 {0.00/30 3.1s} 41. Nf5 {+1.64/29 8.9s} Qc3 {-0.43/26 27s} 42. Re2 {8.5s}
Qe5 {/25 2.6s} 43. f3 {+1.83/29 8.1s} Qb5 {0.00/26 7.8s} 44. Rf2 {+2.02/26 2.8s}
Qe5 {0.00/27 8.1s} 45. h4 {+1.87/30 7.8s} gxh4 {/24 2.9s}
46. Nxh4 {+1.97/30 7.5s} Qd6 {-0.57/24 19s} 47. Nf5 {+1.97/26 4.4s}
Qe5 {-0.53/27 5.6s} 48. Re2 {+2.00/28 7.1s} Kb6 {-0.39/24 1.9s}
49. Nh4 {+2.17/24 2.2s} Qb5 {-0.59/25 20s} 50. Kf1 {+2.50/29 6.9s}
Be5 {-1.09/26 19s} 51. f4 {+2.31/29 6.6s} Bc3 {-1.15/24 25s}
52. f5 {+2.54/27 6.4s} c5 {-1.43/22 16s} 53. Nf3 {+2.40/22 2.4s}
Qd7 {-1.37/19 1.9s} 54. Kf2 {+2.34/24 4.7s} c6 {-1.49/23 17s}
55. Rf4 {+2.68/23 2.1s} Bd4+ {-1.89/23 14s} 56. Kg3 {/28 4.1s}
Qg7+ {-2.18/24 15s} 57. Rg4 {3.2s} Be5+ {-2.50/21 8.4s} 58. Nxe5 {+2.65/27 3.0s}
Qxe5+ {-3.33/24 9.3s} 59. Kg2 {+2.83/28 3.0s} Qf6 {-3.59/23 9.5s}
60. Rg3 {+2.70/21 2.1s} Qe5 {-1.65/20 11s} 61. Rf3 {+3.60/25 2.1s}
Qg7+ {-2.74/23 11s} 62. Kf1 {+4.10/28 4.4s} Qf6 {-3.84/21 13s}
63. Ree3 {+4.64/25 2.9s} c4 {-4.17/21 9.2s} 64. bxc4 {+4.82/29 3.3s}
c5 {-4.93/19 5.2s} 65. e5 {+6.03/29 3.1s} Qh4 {-6.48/20 6.4s}
66. f6 {+7.38/31 3.0s} Qxc4+ {-4.89/24 11s} 67. Ke1 {/30 3.0s}
Qh4+ {-3.52/18 3.3s} 68. Ke2 {+7.44/30 3.0s} Qh2+ {/20 0.27s}
69. Rf2 {+6.87/29 3.0s} Qh5+ {-4.91/18 0.25s} 70. Kd2 {+8.17/26 3.0s}
c4 {-7.46/20 9.8s} 71. f7 {+18.66/22 1.5s} c3+ {-9.34/18 1.4s}
72. Ke1 {+22.26/27 4.2s} Qh1+ 73. Ke2 {+22.56/24 2.6s} Qh5+ {/18 0.82s}
74. Rff3 {+24.82/26 3.6s} Qh7 {-15.57/18 4.4s} 75. f8=Q {+35.35/29 3.1s}
Qh2+ {-33.60/19 7.2s} 76. Kd3 {+M21/29 2.6s} Kb5 {-M20/26 0.62s} 77. Qb8+ {3.3s}
Kc5 78. Qc7+ {+M17/34 3.1s} Kb5 79. Qc4+ {+M15/37 3.0s} Ka5 {0.56s}
80. Qc5+ {+M13/40 3.0s} Ka4 {-M12/39 0.67s} 81. Re4 {+M11/45 3.0s}
Qd2+ {-M10/52 0.30s} 82. Kc4 {+M9/115 3.0s} Ka3 {-M8/77 0.73s}
83. Qxb4+ {+M7/56 2.2s} Ka2 84. Qb3+ {+M5/199 1.9s} Ka1
85. Rf1+ {+M3/199 0.027s} Qc1 86. Rxc1# {+M1/199 0.010s, White mates} 1-0
Thanks for the games. The results are in line with the Elo gains in self-play that I tested a few weeks ago:
Date | Elo | +/- |
---|---|---|
2022-08-30 | 369 | 17 |
2022-07-30 | 280 | 15 |
2022-07-15 | 264 | 14 |
2022-06-21 | 191 | 13 |
2022-06-13 | 135 | 12 |
2022-05-24 | 61 | 12 |
2022-05-10 | 9 | 11 |
2022-04-01 | -64 | 12 |
2022-04-17 | -69 | 12 |
2021-10-29 | -101 | 12 |
2021-10-01 | -150 | 12 |
2021-09-15 | -201 | 13 |
2021-08-22 | -370 | 18 |
2021-08-31 | -371 | 18 |
I've been playing with a concept around the PSQT, and the Elo gains are looking promising.
thanks. i'm glad you didn't close this Issue yet .. at this moment i try your newest version -compiled after git clone- against the same 12 engines, again with 10 minutes, to guess Pawn's rating .. i remember Pawn used to open with 1.d4 (also) but now it prefers 1.e4 almost every game, only 1 time it did 1.Nf3 .. note: for these challenges i set no starting FEN and no opening book .. i must admit i wonder why some engines are rated high in the CCRL lists, eg. Cheese tends to play the Alekhine Defence (1.e4 Nf6?!) and gets destroyed every game ..
> I've been playing with a concept around the PSQT, and the Elo gains are looking promising.
your newest version, with PSQTs for Kings, seems a very nice idea .. i just wrote [#13] .. it's not an NN, is it ? Although you 'train' with 8M ..
about self-play : i once stated self-play would not be a good reference and training partner .. i read some discussions : playing against other engines may confront the trainer with unexpected variations, leading to better style improvement ?
i'd like to discuss your PSQT concept, can you elaborate ?
The latest versions of pawn seem to like the French and the Spanish. If an engine does not have an opening book, it relies on the search to make the first moves, and the middlegame heuristics can easily be inaccurate during the opening.
The PSQT is not an NN but can be seen as some sort of input to a neuron in the first input layer of a NNUE. Basically, instead of relying on a fixed PSQT, there are 64 different tables depending on the square our king is at. Also, another set of tables is added that depends on the position of the enemy king, which can be seen as some sort of "attacking" table to the enemy king. Finally, to reduce dimensionality, these defence/attack tables are mirrored if the kings are in files E to H, which allows reducing from 64 to 32 possible king squares. The encoding is similar to what you see in NNUE (see this for reference) but somewhat simplified to `[piece square, piece type, mirrored king square, defence/attack]`, with dimensions `[64, 5, 32, 2]`, which multiplies out to 64 × 5 × 32 × 2 = 20480 features.
As an example, this position (6k1/8/3r4/8/8/8/P7/1K6 w - - 0 1) has the following active features (recall that we mirror the perspective for black since the king is in the G file):
- `[A2, Pawn, B1, Defence]` for White
- `[H2, Pawn, B8, Attack]` for White
- `[E6, Rook, B8, Defence]` for Black
- `[D6, Rook, B1, Attack]` for Black

In practice, the final PSQT is derived from the linear regression problem y=Xw, where X are the trained weights and w are the active features.
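The encoding described above can be sketched in a few lines of Python. This is only an illustration, not pawn's actual code: the function names, the order of the five piece types (assumed here to be the non-king pieces), and the way the four dimensions are flattened into one index are all assumptions. The mirroring rule is inferred from the example position: when the reference king stands on files E to H, both the king square and the piece square are flipped horizontally.

```python
FILES = "ABCDEFGH"
# Assumption: the 5 piece types are the non-king pieces, in this order.
PIECE_TYPES = ["Pawn", "Knight", "Bishop", "Rook", "Queen"]
KINDS = ["Defence", "Attack"]  # own-king table vs. enemy-king table
NUM_FEATURES = 64 * len(PIECE_TYPES) * 32 * len(KINDS)  # 20480


def parse_square(name):
    """Turn a name like 'A2' into 0-based (file, rank)."""
    return FILES.index(name[0].upper()), int(name[1]) - 1


def square_name(file, rank):
    return f"{FILES[file]}{rank + 1}"


def active_feature(piece_sq, piece_type, king_sq, kind):
    """Build [piece square, piece type, mirrored king square, kind].

    king_sq is the own king for 'Defence' and the enemy king for 'Attack'.
    """
    pf, pr = parse_square(piece_sq)
    kf, kr = parse_square(king_sq)
    if kf >= 4:  # king on files E..H: mirror both squares left-right
        pf, kf = 7 - pf, 7 - kf
    return (square_name(pf, pr), piece_type, square_name(kf, kr), kind)


def feature_index(feature):
    """Flatten a feature into 0..20479 (hypothetical memory layout)."""
    piece_sq, piece_type, king_sq, kind = feature
    pf, pr = parse_square(piece_sq)
    kf, kr = parse_square(king_sq)
    assert kf < 4, "king square must already be mirrored to files A..D"
    square = pr * 8 + pf           # 0..63
    king = kr * 4 + kf             # 0..31
    piece = PIECE_TYPES.index(piece_type)
    return ((square * 5 + piece) * 32 + king) * 2 + KINDS.index(kind)


def psqt_score(weights, features):
    """With 0/1 features, the dot product is the sum of active weights."""
    return sum(weights[feature_index(f)] for f in features)


# The example position: White Kb1, Pa2; Black Kg8, Rd6.
example = [
    active_feature("A2", "Pawn", "B1", "Defence"),  # White pawn, own king
    active_feature("A2", "Pawn", "G8", "Attack"),   # White pawn, enemy king
    active_feature("D6", "Rook", "G8", "Defence"),  # Black rook, own king
    active_feature("D6", "Rook", "B1", "Attack"),   # Black rook, enemy king
]
```

In a real evaluation the Black features would be scored from Black's point of view and subtracted from White's; `psqt_score` here only shows the lookup-and-sum step.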
AFAIK, this idea is not new, since NNUEs (at least SF's) also train a PSQ term. The main "advantage" of this approach is that it simply replaces one term of the classical evaluation instead of replacing it entirely with a black box net.
> The latest versions of pawn seem to like the French and the Spanish.
so, this happens when Pawn is Black .. until this moment i only let Pawn play White, so i don't know its Black preference .. does 1.e4 suit Pawn's style ? Or can it play the first White move randomly from a set like e4, d4, Nf3 ? That would be nice then.
what about the other piece types ? Or does the trained PSQT idea only apply to the King ?
It generally prefers 1.e4, but I've seen it play all three, depending on version, threads, hash, depth, etc.
In theory, these tables can depend on more pieces, but one quickly runs into the curse of dimensionality when doing so. In this respect, I would probably first try to make these PSQTs tapered, or use some sort of bucketing for different material configurations.
well said, thanks! Indeed settings lead to other openings, i also encountered that.
> different material configurations
your wording is clear but abstract .. as far as i can understand and imagine an engine could 'strive' for configurations / positions which suit its style ? Eg. avoid isolated / double pawns, etc.
By different material configurations I mean a different number of pieces on the board. For instance, if several pieces have been traded, the king safety aspect of the tables is likely less important.
> ..I mean a different number of pieces on the board..
it's just counting pieces ? to me that seems too general .. what about their positions and relations (capture / cover / block) ? Or is this aspect 'solved' by pruning the variations until some depth ? I'm not a chess engine programmer, just doing software and chess .. i enjoy your kind explanations though.
btw. i'm almost done with a second run : the newest Pawn played several new games (only with White) against strong engines and it seems it can beat 2900+ and not lose against 3000+ .. soon i will send those games.
Here's one nice game which shows Pawn's style, i think. Pawn played against engine Kohai v1.0 (with rating 2970, see https://github.com/MichaelB7/Kohai-Chess - no longer maintained) .. Pawn suddenly played 3.Bc4 (the Italian game) and not its regular 3.Bb5 (the Spanish game = Ruy Lopez) .. then Pawn played the very aggressive Fried Liver attack, resulting in a fearless battle where White sacrifices a piece but it drives the opponent King into the middle of the board .. the Fried Liver is a well known variation, very sharp, i remember it's winning for White, but i don't recall how .. in the first left diagram Pawn played Bb3, it's difficult to keep the initiative and the attack, an opening book may be useful now (but i disable it) .. here the moves d4 and O-O are alternatives .. it resulted in the second right diagram, where Pawn played Qxf8! to keep complications on the board, but letting Black capture Qxg2, so the game remained dynamic .. it ended in a draw.
ZIPped pgn : pawn_kohai_fried_liver.zip (contains clock times)
[Event "engine vs engine"]
[Site "Holland @ https://lichess.org/T3Yhj1P1 "]
[Date "2022.11.08"]
[White "Pawn 221106 dynPSQTk"]
[Black "Kohai v1.0"]
[Result "1/2-1/2"]
[ECO "C57"]
[GameDuration "00:35:01"]
[Opening "Two knights defense"]
[PlyCount "307"]
[TimeControl "600+3"]
[Variation "Fegatello attack"]
1.e4 e5 2.Nf3 Nc6 3.Bc4 Nf6 4.Ng5 d5 5.exd5 Nxd5 6.Nxf7 Kxf7 7.Qf3+ Ke6 8. Nc3 Nb4 9.Bb3 Bc5 10.d4 Bxd4 11.Nxd5 Nxd5 12.c3 Rf8 13.Bxd5+ Qxd5 14.Qxf8 Qxg2 15.cxd4 Qxh1+ 16.Ke2 Qe4+ 17.Be3 exd4 18.Qe8+ Kd5 19.Qd8+ Kc6 20.Rc1+ Kb5 21.Qxd4 Bg4+ 22.Kd2 Qxd4+ 23.Bxd4 Rd8 24.Ke3 Rd7 25.Bxa7 b6 26.Bb8 c5 27.f3 Re7+ 28.Kf2 Rf7 29.Rc3 Be6 30.a3 Bd5 31.Rd3 Be4 32.Rb3+ Kc6 33.Kg3 Bc2 34.Re3 Kb5 35.Bd6 Rd7 36.Bf4 g6 37.Bg5 Bd1 38.Bf6 Rd5 39.Bg7 Rg5+ 40. Kf2 Rf5 41.Bh8 Rf8 42.Bc3 Kc4 43.Kg3 Rf7 44.Bh8 Kb5 45.Kg4 Ka4 46.Kg3 b5 47.Rd3 Bb3 48.Be5 b4 49.Rd8 bxa3 50.Ra8+ Kb5 51.Rxa3 Bd1 52.Rc3 Rd7 53.Ra3 Be2 54.Rb3+ Kc6 55.Bf4 Rf7 56.Rc3 Rb7 57.Be5 Rd7 58.Bf4 Kb5 59.Rb3+ Ka4 60.Re3 Bc4 61.Re8 Ba6 62.Be3 c4 63.Ra8 Kb5 64.Rf8 Bb7 65.Bf4 Kc5 66.Be5 Bc6 67.Bc3 Re7 68.Kf4 Re2 69.h4 Rf2 70.Kg3 Rf1 71.Rf7 h5 72.Rf6 Rh1 73.Bd2 Kb5 74.Be3 Rf1 75.Bf2 Be8 76.Re6 Bf7 77.Re7 Bd5 78.Rg7 Rd1 79.Rxg6 Ka4 80. Be3 Kb3 81.Rb6+ Kc2 82.Kf2 Rd3 83.Ke2 Rd1 84.Rh6 Rh1 85.Rxh5 Bf7 86.Rh7 Bg6 87.Rb7 Rxh4 88.Kf2 c3 89.bxc3 Kxc3 90.Rc7+ Rc4 91.Rd7 Rh4 92.Rd6 Bd3 93.Bg5 Rh1 94.Bf6+ Kd2 95.Rd5 Rf1+ 96.Kg2 Re1 97.Rc5 Re6 98.Bg5+ Ke2 99. Kg3 Kf1 100.Rd5 Be2 101.Ra5 Rd6 102.Ra1+ Rd1 103.Rxd1+ Bxd1 104.f4 Bc2 105.Kf3 Bd1+ 106.Ke4 Bc2+ 107.Ke5 Ke2 108.Kd5 Kf3 109.Bh6 Bg6 110.Ke5 Bc2 111.Bg7 Bd3 112.Bf6 Ke3 113.Bh4 Bh7 114.Bg5 Kf3 115.Be7 Bg6 116.Bc5 Bc2 117.Ba3 Ke3 118.Bb2 Kf3 119.Bc3 Bd3 120.Bd4 Kg4 121.Bc5 Kf3 122.Bb6 Bc2 123.Ba7 Bg6 124.Bb8 Ke3 125.Kf6 Bh7 126.Bd6 Ke4 127.Kg5 Bf5 128.Bb4 Bd7 129.Be7 Be6 130.Bc5 Bd7 131.Bf8 Be6 132.Bg7 Bd7 133.Be5 Bb5 134.Bc3 Bd7 135.Bf6 Bh3 136.Bd8 Be6 137.Ba5 Bh3 138.Bd2 Bf5 139.Kh5 Bh3 140.Kg6 Bf5+ 141.Kg5 Bh3 142.Bc1 Bf5 143.Be3 Be6 144.Ba7 Bh3 145.Bf2 Bd7 146.Bg3 Bh3 147.Bh2 Bf5 148.Kh5 Kf3 149.Kh4 Bc8 150.Bg3 Bd7 151.Bh2 Bf5 152.Bg3 Bd7 153.Bh2 Bf5 154.Bg3 1/2-1/2
> it's just counting pieces ? to me that seems too general .. what about their positions and relations (capture / cover / block) ? Or is this aspect 'solved' by pruning the variations until some depth ? I'm not a chess engine programmer, just doing software and chess .. i enjoy your kind explanations though.
If different weights are assigned to each piece, the sum of the weights is a very good indicator of the game phase. This is how pawn and almost any classical engine interpolate between middlegame and endgame evaluation terms. The search should handle tactical ideas of the position.
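As a rough sketch of that interpolation, assuming the commonly used phase weights of 1 for knights and bishops, 2 for rooks and 4 for queens (pawn's real weights and scaling may differ, and the names below are made up for illustration):

```python
# Assumed phase weights per piece letter; pawns and kings count as 0.
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
# Full material for both sides: 2 * (2 knights + 2 bishops + 2 rooks + 1 queen).
MAX_PHASE = 2 * (2 * 1 + 2 * 1 + 2 * 2 + 1 * 4)  # 24


def game_phase(fen):
    """Sum the piece weights found in the board field of a FEN string."""
    board = fen.split()[0]
    phase = sum(PHASE_WEIGHTS.get(ch.upper(), 0) for ch in board)
    return min(phase, MAX_PHASE)  # promotions could exceed the maximum


def tapered(middlegame, endgame, phase):
    """Blend two scores: phase == MAX_PHASE is pure middlegame, 0 pure endgame."""
    return (middlegame * phase + endgame * (MAX_PHASE - phase)) / MAX_PHASE


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
rook_ending = "6k1/8/3r4/8/8/8/P7/1K6 w - - 0 1"
print(game_phase(start), game_phase(rook_ending))
```

With a single rook left, the phase is small, so an evaluation term blended this way is dominated by its endgame value. It really is just counting weighted pieces; piece relations are left to the other evaluation terms and to the search.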
i finished my session in CuteChess, letting Pawn play 64 games with White against several engines, 10 minutes per player per game, plus 3 seconds bonus after each move.
Most opponent engines are strong, their rating is from the CCRL lists. Pawn seems to score very well ! I guess its Elo is even 2850+ .. here's the ZIPped pgn file : pawn_221106_games_10min3sec.zip
1868 Rustic Alpha v3.0.2 1-0
2615 Zappa v1.1 1-0
2652 Ghost v3.1 1-0
2656 Phalanx XXV 1-0
2685 Knightx v3.5 1-0
2690 Leorik v2.2 1-0
2696 Booot v4.15.1 1-0
2716 K2 v0.992 dev 1-0
2745 Spike v1.2 Turin 1-0
2757 Discocheck v4.2.1 1-0
2759 Fridolin v4.0 1-0
2785 Octochess 1-0
2791 RuyDos v1.1.11 1-0
2800 Coiled v1.1 noNNUE 1-0
???? Coiled v1.1 NNUE 1-0
2823 Murka v3 1/2-1/2
2844 Weiawaga v5.0.0 NNUE 1/2-1/2
2845 Olithink v5.10.1 1-0
2850 Cheese v3.01 1-0
2850? Toga II v1.2.1a 1-0
2872 Gaviota v1.0 1/2-1/2
2891 Spark v1.0 1-0
2894 EXchess v7.97 beta 1-0
2897 Atlas v3.91 1-0
2899 Discocheck v5.2 1-0
2900 SCTR v1.1f 0-1
2901 Tomitank v5.1 1-0
2914 SF Tinapa v1.01 0-1
2916 Dirty CUCUMBER 1-0
2922 Toga II v4.01 0-1
2928 Crafty v25.4 1-0
2953 Hakkapeliitta v3.0 1-0
2970 Kohai v1.0 1/2-1/2
2976 PeSTO v2.210 1/2-1/2
2979 Deuterium v2020.1.38.5 1-0
2979 Godel v7.0 1-0
2996 FabChess v1.16 1-0
3002 Frozenight v5.1.0 NN 1-0
???? Frozenight v6.0.0-dev NN 1/2-1/2
3006 Asymptote v0.8.0 1/2-1/2
3006 Topple v0.8.1 1/2-1/2
3013+ Gogobello v3.0 0-1
3038 Elektro v1.2 1/2-1/2
???? Elektro v2.0 1/2-1/2
???? Elektro v2.0 DR 1-0
3055 BlackMamba v2.0 1-0
3056 Vajolet v2.6.2 1-0
3073 Mr Bob v1.1.0 1/2-1/2
3097 Beef v0.3.6 1-0
3104 Cheng v4.42 1-0
3112 IvanHoe 999946f 1/2-1/2
3136? Toga III v0.3.12 1/2-1/2
3138 Demolito 0-1
3202 Schooner v2.2 1/2-1/2
3321 Komodo v12.1.1 0-1
3334 Pedone v3.1 0-1
3373? LC0 v0.26.1 0-1
3373? LC0 v0.28.2A 1/2-1/2
3373? LC0 v0.28.2C 1-0
3433 Combusken v2.0.0 1/2-1/2
???? Chess-At-Nite v10.2.23 1-0
???? Eggnog 221106 1-0
???? IIchess r19 1-0
???? WukongJS v1.5a 1-0
[ i'm on Xubuntu 22.04 ]
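For anyone who wants to turn a result list like the one above into a single number, here is a minimal sketch of a performance-rating estimate. It assumes the standard logistic Elo expectation and searches for the rating whose expected total score matches the actual score. The function names are hypothetical, the three sample rows are taken from the list above, and rating lists generally use more elaborate tools, so treat the output as a rough indication only.

```python
def expected_score(rating, opponent):
    """Standard logistic Elo expectation for one game."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def performance_rating(results, lo=0.0, hi=5000.0):
    """Estimate a rating from (opponent_rating, score) pairs.

    score is 1 for a win, 0.5 for a draw, 0 for a loss. The estimate is the
    rating at which the summed expected score equals the actual score,
    found by bisection on the interval [lo, hi].
    """
    total = sum(score for _, score in results)
    if total <= 0 or total >= len(results):
        raise ValueError("undefined for 0% or 100% scores")
    for _ in range(100):
        mid = (lo + hi) / 2
        if sum(expected_score(mid, opp) for opp, _ in results) < total:
            lo = mid  # estimate too low: expected score below actual
        else:
            hi = mid
    return (lo + hi) / 2


# Three rows from the list: Murka (draw), SCTR (loss), Tomitank (win).
sample = [(2823, 0.5), (2900, 0.0), (2901, 1.0)]
print(round(performance_rating(sample)))
```

Opponents marked `????` have to be left out, and with one game per opponent and White in every game the uncertainty is large.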
Closing some old issues
Your new Pawn version plays very well, congrats! Not long ago i wrote several issues while testing Pawn .. it seems evolved ! Remember, in those days i made a script to estimate its strength, which was about 2350 .. lately i tested again, with your newest version (compiled by git clone), and it seems over 2700 !? I attached a ZIPped pgn file Pawn_games_10min3sec.zip with 12 games, 10 minutes + 3 sec, against some engines with rating about 2700, according to the list http://ccrl.chessdom.com/ccrl/4040/rating_list_all.html -- there you can find links to those engines and their source code.
Pawn plays White in all games. It wins almost all battles in a style i admire : gladly sacrificing material for initiative and attack !
i want to show one example position, to illustrate Pawn style :
Pawn 221103 -vs- Weiawaga v5.0.0 NNUE
White to move, position after 20...Nh4
7r/1pp1kpp1/2p4p/3bP3/6Pn/1PR4P/r1P2P2/2BRN1K1 w - - 0 21
here Pawn played Rxd5! after 20 seconds with eval +0.88
i was wondering if other (very strong) engines would give the same best move .. well, it seems only Pawn STRONGLY prefers this exchange sacrifice .. when i let Pawn think for a very long time (in SCID or Nibbler) using many MPV, it keeps preferring Rxd5 but not thAt strong .. this position seems a nice test case concerning playing style (thus pruning mechanism, PSTs .. i guess).
here are some statistics:
no clue how this could help you improve Pawn, but i can pick other positions like this from the pgn .. what do you say ? sure, all this is just an indication .. maybe let Pawn play more and other games, with different (time) settings, to better determine its rating .. also my hardware must be considered : i use just a notebook, rather modern, and i set 128 MB Hash and 2 threads for an engine, but i have no clue about optimal performance settings ..