Closed SzotsGabor closed 3 years ago
Interesting, my tests showed that (well, with self play):
https://github.com/amanjpro/zahak/pull/49#issuecomment-814260987
https://github.com/amanjpro/zahak/pull/49#issuecomment-814261970
Hmm... most likely the problem is with self-play. I think it is time for me to expand the pool of engines that I test Zahak against.
I would be grateful to see PGNs, thanks a lot
I read at CCC that the real difference is about 70 % of what self play shows.
In my tournament Zahak is on 17/68. I attach the PGN.
------ Original message ------ From: "Amanj Sherwany" @.> To: "amanjpro/zahak" @.> Cc: "Gabor Szots" @.>; "Author" @.> Sent: 2021.04.09. 15:48:42 Subject: Re: [amanjpro/zahak] +250-300? (#51)
Well, I'm not sure I have sent the PGN. I can't see it here.
You cannot send attachments by email to a github comment, unfortunately :(
Maybe this way.
Thank you a lot :)
FYI, I also measure a great difference in self play.
Score of Zahak_0.3.0-x64 vs Zahak_1.0.0-x64: 1 - 13 - 6 [0.200]
... Zahak_0.3.0-x64 playing White: 0 - 6 - 4 [0.200] 10
... Zahak_0.3.0-x64 playing Black: 1 - 7 - 2 [0.200] 10
... White vs Black: 7 - 7 - 6 [0.500] 20
Elo difference: -240.8 +/- 159.1, LOS: 0.1 %, DrawRatio: 30.0 %
20 of 20 games finished.
This was played at a 30s+0.2s time control.
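The reported Elo difference is consistent with the standard logistic Elo model applied to the score fraction. A minimal sketch, assuming that model (the helper name `elo_diff` is made up for illustration and is not part of any tool used in this thread; the error margin is not reproduced here):

```python
import math


def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a W/L/D record under the logistic Elo model.

    Hypothetical helper for illustration only.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # score fraction in [0, 1]
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Zahak_0.3.0 vs Zahak_1.0.0: 1 win, 13 losses, 6 draws -> score 0.200
print(round(elo_diff(1, 13, 6), 1))  # -240.8
```

With only 20 games the point estimate is very noisy, which is what the wide +/- in the output above reflects.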
My explanation is passed pawns. I know I can easily end up pruning promotions or passed-pawn moves, because my move ordering doesn't care much about them. This might explain why Zahak can crush weaker engines (the old version) but struggles to convert against engines of equal strength, since those games usually require good endgame play. Just a theory, not proven yet.
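To make the theory concrete, here is a rough sketch of what "caring about" promotions and passed-pawn pushes in move ordering could look like. This is not Zahak's code: the `Move` type, the bonus values, and the board representation (enemy pawns as a set of `(file, rank)` pairs, both 0..7) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

Square = Tuple[int, int]  # (file 0..7, rank 0..7), hypothetical representation


@dataclass(frozen=True)
class Move:
    piece: str                      # "P", "N", "B", "R", "Q" or "K"
    to_file: int
    to_rank: int
    is_capture: bool = False
    promotion: Optional[str] = None


def is_passed(file: int, rank: int, white: bool, enemy_pawns: Set[Square]) -> bool:
    """A pawn is passed if no enemy pawn is ahead of it on its own or an adjacent file."""
    for ef, er in enemy_pawns:
        ahead = er > rank if white else er < rank
        if abs(ef - file) <= 1 and ahead:
            return False
    return True


def order_score(move: Move, white: bool, enemy_pawns: Set[Square]) -> int:
    """Illustrative ordering score; the bonus values are arbitrary."""
    score = 0
    if move.promotion:
        score += 900
    if move.is_capture:
        score += 500
    if move.piece == "P" and is_passed(move.to_file, move.to_rank, white, enemy_pawns):
        advance = move.to_rank if white else 7 - move.to_rank
        score += 100 + 10 * advance  # more advanced passers are searched earlier
    return score


def order_moves(moves, white: bool, enemy_pawns: Set[Square]):
    """Sort moves so that the highest-scoring ones are searched first."""
    return sorted(moves, key=lambda m: order_score(m, white, enemy_pawns), reverse=True)
```

Moves that are searched early are less likely to be hit by late-move pruning or reductions, which is the link to the theory above; whether this is what actually hurts Zahak is unproven.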
So, this is really strange. I have been experimenting with passed pawns, and I am currently running a match (still going on):
Rank  Name                       Elo  +/-  Games  Wins  Losses  Draws  Points  WWins  WLoss.  WDraws  BWins  BLoss.  BDraws
0     zahak_dev                  -27   53    130    46      56     28    60.0     25      25      15     21      31      13
1     baislicka                  289  185     22    16       1      5    18.5      8       1       2      8       0       3
2     Achillees                  191  157     22    14       3      5    16.5      7       1       3      7       2       2
3     gopher_check                64  141     22    11       7      4    13.0      7       3       1      4       4       3
4     zahak-darwin-amd64-1.0.0   -64  131     22     6      10      6     9.0      4       4       3      2       6       3
5     vice                       -70  136     20     5       9      6     8.0      3       3       4      2       6       2
6     rustic                    -213  194     22     4      16      2     5.0      2       9       0      2       7       2
Started game 133 of 1200 (zahak_next vs zahak-darwin-amd64-latest)
Looking at gopher-check: even though it is supposed to be 100 Elo stronger than Achillees and Baislicka, it does much worse than both of them. Not sure what to make of this, really.
I'll have results tomorrow, and will update here
As promised, I came back with my final numbers for the above match. gopher-check still does worse than the "lower rated" engines, which is interesting/strange to me. Even vice (which is rated around 2000) does terribly against Zahak. I'll be working on Zahak to find the reason behind it.
Rank  Name                       Elo  +/-  Games  Wins  Losses  Draws  Points  WWins  WLoss.  WDraws  BWins  BLoss.  BDraws
0     zahak_next (PR #52)        -14   18   1200   461     508    231   576.5    251     236     113    210     272     118
1     baislicka                  189   49    200   133      34     33   149.5     73      13      14     60      21      19
2     Achillees                  166   49    200   129      40     31   144.5     64      22      14     65      18      17
3     gopher_check               109   43    200   104      43     53   130.5     57      15      28     47      28      25
4     zahak-darwin-amd64-1.0.0   -26   40    200    60      75     65    92.5     37      32      31     23      43      34
5     vice (v 1.1)              -173   48    200    37     129     34    54.0     16      59      25     21      70       9
6     rustic (1 alpha 2)        -179   53    200    45     140     15    52.5     25      69       6     20      71       9
Finished match
And according to bayeselo:
ResultSet>readpgn zahak_games/passed-pawns-1.pgn
1200 game(s) loaded
ResultSet>elo
ResultSet-EloRating>mm
00:00:00,00
ResultSet-EloRating>exactdist
00:00:00,00
ResultSet-EloRating>ratings
Rank  Name                       Rating      Δ    +    -      #       Σ     Σ%     W     L     D     W%     =%   OppR
---------------------------------------------------------------------------------------------------------------------
1     baislicka                    3282    0.0   41   39    200   149.5   74.8   133    34    33   66.5   16.5   3090
2     Achillees                    3264   18.4   40   39    200   144.5   72.2   129    40    31   64.5   15.5   3090
3     gopher_check                 3196   67.7   37   36    200   130.5   65.2   104    43    53   52.0   26.5   3090
4     zahak_next                   3090  106.2   16   16   1200   576.5   48.0   461   508   231   38.4   19.2   3102
5     zahak-darwin-amd64-latest    3066   24.5   34   35    200    92.5   46.2    60    75    65   30.0   32.5   3090
6     vice                         2911  154.4   38   40    200    54.0   27.0    37   129    34   18.5   17.0   3090
7     rustic                       2890   21.1   41   43    200    52.5   26.2    45   140    15   22.5    7.5   3090
---------------------------------------------------------------------------------------------------------------------
Δ = delta from the next higher rated opponent
# = number of games played
Σ = total score, 1 point for win, 1/2 point for draw
Zahak 2.0.0 should fulfil the promise.
Hi Amanj,
This might not be very useful to you but still.
You claimed a 250-300 Elo improvement by v1.0.0 over v0.3.0. Therefore, for my current tournament I selected opponents in the 2160-2200 range (to be on the safe side). However, even this selection has proved too strong. At the time of writing, Zahak has a score of 6 out of 30. I watched some of the games. It is hard to point to a salient problem. It seems to me that one of the problems is in endgames with passed pawns. Also, the depth reached is somehow less than that of most of the opponents. Zahak seems to fall for traps.
I may send you the PGN if you are interested. You can find my e-mail address in my CCC profile.
Best regards, Gabor
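For reference, a claimed Elo gain can be translated into the score it predicts with the standard logistic Elo model. A minimal sketch, assuming that model (`expected_score` is a hypothetical helper name, not part of any tool mentioned in this thread):

```python
def expected_score(elo_advantage: float) -> float:
    """Expected score fraction for a player who is `elo_advantage` Elo stronger.

    Uses the logistic Elo model; hypothetical helper for illustration only.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_advantage / 400.0))


# The claimed 250-300 Elo improvement of v1.0.0 over v0.3.0
for gain in (250, 300):
    print(gain, round(expected_score(gain), 3))
```

Under this model a 250-300 Elo gain corresponds to the new version scoring roughly 81-85 % against the old one head to head. It says nothing about how much of that gain carries over to games against other engines.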