kz04px / 4ku

A UCI compatible chess engine that fits into 4,096 bytes.
MIT License

4KU2 rated 2000 !? #23

Closed tissatussa closed 1 year ago

tissatussa commented 1 year ago

Congrats on the newer 4KU2 ! Lately i compiled and tested the version "18", where you implemented some eval functions, and it plays very well against 2000 rated engines .. i let 4KU2 play Black in several 10 min + 3 sec games, using CuteChess on Linux (no books, no pondering, 128 MB Hash, 2 Threads). Here's the ZIPped PGN: 4KU2_games.zip

Elo   Opponent (White)           Result
1827  Apollo v1.2.1                0-1
1868  Rustic Alpha v3.0.2          0-1
1882  Princhess v0.8.0             0-1
1888  Surprise v4.3                0-1
1907  Claudia v0.5.1               0-1
1918  Presbyter v1.3.0             0-1
1925  Heracles v0.6.16             0-1
1932  Quokka v2.1                  0-1
1939  Dabbaba v6.52                1-0
1950  Monik v2.2.7                 0-1
1954  Wowl v1.3.8                  0-1
1968  ZetaDVA 0310               1/2-1/2
1990  ALChess v1.84                0-1
1990  Sapeli v2.1                  0-1
1992  Fatalii v0.4.0 alpha         0-1
2008  Leonidas v8.3              1/2-1/2
2026  Cupcake v1.1c                1-0
2027  ArabianKnight v1.55          0-1
2032  Poor Little Pinkus (PLP)     1-0
2033  Clunk_v1.2                 1/2-1/2

4KU2 plays good chess - i see that, being a club player .. it develops naturally, it prefers to let pieces be captured, it exchanges pieces only when needed .. it seems to have plans to place the pieces and use the pawns, in a logical harmonious way .. it can sacrifice pawns when needed .. it may guard pieces by attacking those of the opponent .. also the endgame is played rather well, in most cases - in a few games 4KU2 was lost (-3 / -4) according to the opponent's eval, but managed to win.

so, compared to many other well-known (?) amateur engines 4KU2 performs very well, i estimate its rating around 2000.
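
As a rough cross-check on that estimate, a performance rating can be computed from the table above. This is a minimal sketch, assuming the common logistic Elo model and taking the listed ratings at face value; the names `Game`, `elo_diff` and `performance_rating` are made up for illustration and come from neither 4ku nor CuteChess.

```cpp
#include <cmath>
#include <vector>

// One game from the tested engine's point of view.
struct Game {
    int opponent_rating;  // rating of the opponent, as listed in the table
    double score;         // 1.0 = win, 0.5 = draw, 0.0 = loss
};

// Rating difference implied by a score fraction under the logistic Elo model
// (an assumption of this sketch). Only defined for 0 < fraction < 1.
double elo_diff(double fraction) {
    return -400.0 * std::log10(1.0 / fraction - 1.0);
}

// Performance rating: average opponent rating plus the implied difference.
double performance_rating(const std::vector<Game>& games) {
    double rating_sum = 0.0;
    double points = 0.0;
    for (const Game& g : games) {
        rating_sum += g.opponent_rating;
        points += g.score;
    }
    const double n = static_cast<double>(games.size());
    return rating_sum / n + elo_diff(points / n);
}
```

Since 4KU2 played Black, every 0-1 in the table is a win for it: 14 wins, 3 draws and 3 losses. Fed into `performance_rating`, those 20 results give a figure above 2000, but 20 games is far too few for a tight estimate.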

Note the few Black games with the Berlin Defence, a variation of the Spanish game 'Ruy Lopez' :

  1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O Bd6 5. d4 Nxd4 6. Nxd4 a6

WeirdBerlinVariation

the moves 4...Bd6 and also 4KU2's solution 6...a6 are weird, but seem playable .. in general 4KU2 plays the openings rather well and consistently.

kz04px commented 1 year ago

Nice to see the results from your test, thanks.

Please note that 4ku2 does not (currently) parse UCI setoption. To set options such as hash size, you would have to edit the value directly in the source. I admit this is not very user friendly, but it's done in the interest of keeping the file size down.

const int MAX_TT_SIZE = 2000000;
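
For comparison, this is roughly what parsing the standard UCI `setoption name Hash value <MB>` command could look like. It is a minimal sketch and not 4ku's code: `parse_hash_mb` is a hypothetical helper, and it matches the option name `Hash` exactly for brevity.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not part of 4ku: reads a line of the form
// "setoption name Hash value <MB>" and returns the size in megabytes.
// Returns -1 for any other line or for a missing / non-positive value.
int parse_hash_mb(const std::string& line) {
    std::istringstream in(line);
    std::string cmd, name_kw, name, value_kw;
    int mb = 0;
    if (!(in >> cmd >> name_kw >> name >> value_kw >> mb)) {
        return -1;  // too few tokens, or the value is not a number
    }
    if (cmd != "setoption" || name_kw != "name" ||
        name != "Hash" || value_kw != "value") {
        return -1;  // some other command or option
    }
    return mb > 0 ? mb : -1;
}
```

An engine would still have to convert megabytes into a number of table entries by dividing by its entry size, which is not stated in this thread.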

tissatussa commented 1 year ago

it's done in the interest of keeping the file size down

i respect that .. CuteChess also does not show any output during evaluations .. i mentioned hash size, this is a general setting when engines play each other in CuteChess, just like ponder (CuteChess: "thinking in opponent's time"), so an engine that can do such setoption will set the concerning value, in this case 4KU2 will use its own hash settings .. it has no books, hash nor pondering, so these are fair.

btw. i recall the famous (?) "5K Org" contest, years ago, which challenged programmers to create some nice web app below 5K source code .. then some people tried to build a chess engine within that limit, and someone got a prize for that : the code is still maintained as P4WN at https://github.com/douglasbagnall/p4wn and https://p4wn.sourceforge.net .. it's all javascript, heavily scrambled, so it fits into 5K, and it has graphics and mouse events .. it calculates very fast (within 1 sec) and plays rather well!

the concept you're bringing here, minimalism, is good .. for years i have used this slogan as footer of all my emails : simple is not always best, but best is always simple .. by coincidence i was fascinated by your new 4KU2 and i started the "manual tournament" yesterday .. now i really feel 4KU2 is promising .. it seems these days you are integrating the basic well-known chess engine modules to build a new simple engine, and the result is just one small .cpp file which can play chess! Hooray !

and can you supply a version which can do position fen ... ? :-)

tissatussa commented 1 year ago

you might be strongly focussed on the results of self play and other statistics, and do many fast games, but i'm interested in the way an engine plays .. are you a chess player? what is your rating? mine is about 1850 (in Holland) .. for me, one special case is the c-pawn : it can do wonders when advancing but does a good job overall on its starting square also .. it seems 4KU2 knows how to treat the c-pawn! Did you examine the games? The one against Dabbaba should have been a draw : 4KU2 messed it up at last in a simple, utterly drawn endgame .. you should not let 4KU2 continue searching for a win when nothing's there .. nevermind :-)

kz04px commented 1 year ago

can you supply a version which can do position fen ... ?

It's certainly possible to create a second version that will handle FEN strings, provide info strings, and other quality of life improvements. If that will ever actually happen or not, I don't yet know.
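
To give an idea of what FEN handling involves, here is a minimal sketch that splits a FEN string into its fields and expands the piece-placement field into 64 squares. The function names are hypothetical and this is not how 4ku does (or would do) it; a real engine would also have to interpret castling rights, the en passant square and the move counters.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a FEN string into its space-separated fields
// (placement, side to move, castling, en passant, halfmove, fullmove).
std::vector<std::string> fen_fields(const std::string& fen) {
    std::istringstream in(fen);
    std::vector<std::string> fields;
    std::string field;
    while (in >> field) fields.push_back(field);
    return fields;
}

// Expands the placement field into 64 characters, rank 8 first and the
// a-file first within each rank. '.' marks an empty square.
// Returns an empty string if the field is malformed.
std::string expand_placement(const std::string& placement) {
    const std::string pieces = "pnbrqkPNBRQK";
    std::string board;
    int file = 0;
    int ranks = 1;
    for (char c : placement) {
        if (c == '/') {
            if (file != 8) return "";  // previous rank was incomplete
            file = 0;
            ++ranks;
        } else if (c >= '1' && c <= '8') {
            const int n = c - '0';     // run of n empty squares
            board.append(static_cast<std::size_t>(n), '.');
            file += n;
        } else if (pieces.find(c) != std::string::npos) {
            board.push_back(c);
            ++file;
        } else {
            return "";                 // unknown character
        }
        if (file > 8) return "";       // rank overflow
    }
    return (ranks == 8 && file == 8) ? board : "";
}
```

With this layout, index 0 is a8 and index 63 is h1, so the square for file `f` (0 = a) on rank `r` (1 to 8) sits at index `(8 - r) * 8 + f`.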

are you a chess player? what is your rating?

Very rarely and very casually. I'm not particularly good at the game. 4ku2 would beat me easily.

Did you examine the games?

Not fast games used for testing, but I do occasionally run the engine on lichess here and examine those games.

tissatussa commented 1 year ago

Not fast games used for testing

i mean the games of my ZIP .. they are 10m+3s, this is not "fast" .. this way 4KU2 can calculate properly, up to a normal depth.

kz04px commented 1 year ago

There are some patch ideas here, but ultimately everything has to test stronger before it can be merged. The gain also has to be worth the number of bytes added.

tissatussa commented 1 year ago

Now 4KU's rating might be around 2400 ! I did several games 10m+3s with Simplex v0.9.8 (2403), TJchess v1.3 (2435) and good old Sjeng v11.2 (2450) .. and 4KU can beat them all ! (but not in all games).

tissatussa commented 1 year ago

..There are some patch ideas here, but ultimately everything has to test stronger..

regarding our Issue i want to show this position from a 4KU game: rnbqkbnr/ppp2ppp/8/3p4/8/5N2/PPPP1PPP/RNBQKB1R w KQkq - 0 4

4KU-dislikes-d4

Black is a (.jar) engine rated about 2556 .. it plays a French Defence and 4KU responds well, but then 4KU seems to dislike d2-d4 and it gets into 'closed' positions, lacking space without compensation and (even worse) piece harmony .. such games are lost from the very start !?

using the same time control, 4KU does not always play the same opening variations, but this position shows me 4KU does not value the board center properly - i guess that's my best way to explain the decisions it makes, see the blue and green arrows : 4KU plays c4 and sometimes Bd3, both moves have far less eval than the normal (?!) d2-d4 .. now Black plays d5-d4 himself ..

could your PSQTs be involved ?

tissatussa commented 1 year ago

in the meantime i managed to use cutechess GUI for examining 4KU in this position .. i use a .pgn 'openingbook' file having just 1 game up to this position .. this way cutechess GUI seems to feed the UCI command 'position startpos moves [..]' to 4KU, and when i set playing times around 1 hour per game, 4KU prefers c2-c4 for almost 236 sec, after which its best move is played : d2-d4 ! But the UCI info pane still shows c2-c4 at its last depth .. so, it seems 4KU changed its mind at the last moment .. maybe because the evals of both c2-c4 and d2-d4 are +0.14 at the final depth reached !?

tissatussa commented 1 year ago

so, being a good chess player or not, can we conclude 4KU should seriously consider d2-d4 by recognising the d4 square is crucial for future center claims ? The question "d2-d4 or not d2-d4" arises .. and if yes, why not now ?

kz04px commented 1 year ago

It does seem that d2d4 appears at least briefly, but I don't know why 4ku seems to prefer the inferior move c2c4. I suppose after c2c4 d5d4 white has a backward pawn on d2, but this eval feature has been tested before quite recently to no gain in strength. Perhaps piece mobility is another idea here, but from what I hear it would cost a lot of space if added.

Personally I rather dislike how often f1d3 is showing up, as blocking in your central pawns like that is probably a bad idea most of the time.

info depth 1 score cp 48 time 0 nodes 35 pv b1c3
info depth 2 score cp 24 time 0 nodes 465 pv d1e2
info depth 3 score cp 60 time 6 nodes 4719 nps 786500 pv f1d3
info depth 4 score cp 21 time 14 nodes 10820 nps 772857 pv f1e2
info depth 5 score cp 76 time 24 nodes 26012 nps 1083833 pv f1e2
info depth 6 score cp 46 time 37 nodes 50199 nps 1356729 pv f1e2
info depth 7 score cp 68 time 85 nodes 135438 nps 1593388 pv f1e2
info depth 8 score cp 29 time 405 nodes 684226 nps 1689446 pv f1d3
info depth 9 score cp 33 time 695 nodes 1170654 nps 1684394 pv f1d3
info depth 10 score cp 33 time 1707 nodes 2893159 nps 1694879 pv f1d3
info depth 11 score cp 50 time 6777 nodes 11407940 nps 1683331 pv c2c4
info depth 12 score cp 27 time 18181 nodes 30744323 nps 1691013 pv d2d4 <~~~ here
info depth 13 score cp 27 time 39742 nodes 67473860 nps 1697797 pv f1d3
bestmove c2c4
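
To track which first move the engine prefers at each depth, output like the above can be picked apart with a few lines of code. This is a minimal sketch with hypothetical names (`InfoLine`, `parse_info`); it only reads the `depth`, `score cp` and `pv` tokens and ignores everything else on the line.

```cpp
#include <sstream>
#include <string>

// Fields of interest from one UCI "info" line.
struct InfoLine {
    int depth = -1;             // search depth, -1 if the line has none
    int score_cp = 0;           // centipawn score, valid only if has_cp
    bool has_cp = false;        // true when a "score cp <n>" pair was found
    std::string first_pv_move;  // first move after "pv", empty if absent
};

// Scans the whitespace-separated tokens of a line and fills an InfoLine.
// Lines that are not "info" lines simply yield the default values.
InfoLine parse_info(const std::string& line) {
    InfoLine info;
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {
        if (tok == "depth") {
            in >> info.depth;
        } else if (tok == "score") {
            std::string kind;
            in >> kind;
            if (kind == "cp" && (in >> info.score_cp)) info.has_cp = true;
        } else if (tok == "pv") {
            in >> info.first_pv_move;  // only the first PV move is kept
            break;
        }
    }
    return info;
}
```

Run over the log above, this makes it easy to list the depth at which each first move shows up, for example d2d4 at depth 12.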

tissatussa commented 1 year ago

one of the games a newer 4KU version played with White was against a Scandinavian Defence by Simplex v0.9.8 .. it has a normal -non blitz- CCRL rating of 2400 but plays very badly .. 4KU easily won the game, despite a poor Kf1 and thus no connected Rooks, but 4KU is pretty and simply harmonious .. how can an engine like Simplex play so poorly ? It seems to lack all basic chess understanding, esp. doing many moves with the Queen.

simplex_high_rating

here's the game, as an animation with some music .. enjoy the arena :)

https://user-images.githubusercontent.com/1109281/206944270-96db5ef6-a94a-4b4f-83dd-203da2fdfb7b.mp4

tissatussa commented 1 year ago

but this eval feature has been tested before quite recently to no gain in strength. Perhaps piece mobility is another idea here, but from what I hear it would cost a lot of space if added.

nice, you investigated this .. how do you test an eval feature ? in this case it's mostly about pawn structure / movement and eg. backward pawns, no need to consider the movements of the other pieces .. but such pawn structure feature could take a lot of space also, i suppose .. might changing the new simple PSQTs be an idea ? I have no clue, just brainstorming ..

tissatussa commented 1 year ago

more food for thought

i once opened a GitHub Issue for another engine, concerning an endgame position which might easily end with only 2 Knights and the Kings .. many engines evaluate this as clearly winning, while they're calculating and pruning, but we all know the KkNN game is always drawn .. so the engine should be aware of (this) basic "material relationship" .. Note: in chess, some KkNNp (the losing team has 1 pawn) positions exist where the pawn is just on the right file and the game can be won because of tempo !

8/p5pp/1pk5/5p2/P1nn4/2NN3P/5PPK/8 w - - 0 1

read the referred GitHub Issue at https://github.com/bagaturchess/Bagatur/issues/16

4kNightsPos

The 4N position, White to move. It's an endgame from a fixed starting position, which was composed to induce all kinds of complications, while 'zugzwang' occurs because each player has only two knights and some pawns. Black has an extra pawn and should be able to win this position, but White can draw in many ways. A few basic chess principles are involved in this position : it shows how tempo, material and the threat of promotion intertwine.

I think this position does not deserve searching in great depth, better evaluate a wide spectrum of variations, including all kinds of sacrifices : maybe offer knights for pawn(s), especially the last black pawn, because even when black keeps the 2 knights he can not win without any pawn left on the board .. the engine should know this chess fact in this position !
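
The "material relationship" idea can be sketched as a cheap check on the piece-placement field of a FEN. This is only an illustration with hypothetical names, not code from 4ku or Bagatur: it flags king plus at most two knights against a bare king, and it deliberately returns false as soon as the defending side has anything besides its king, since (as noted above) a defending pawn can change the result.

```cpp
#include <cctype>
#include <string>

// Material of one colour, kings excluded.
struct Material {
    int knights = 0;  // number of knights
    int others = 0;   // pawns, bishops, rooks and queens together
};

// Tallies one colour's material from a FEN placement field.
// Upper-case letters are White, lower-case letters are Black.
Material tally(const std::string& placement, bool white) {
    Material m;
    for (char c : placement) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (!std::isalpha(uc)) continue;  // digits and '/'
        const bool is_white = std::isupper(uc) != 0;
        if (is_white != white) continue;
        const char p = static_cast<char>(std::tolower(uc));
        if (p == 'k') continue;
        if (p == 'n') ++m.knights; else ++m.others;
    }
    return m;
}

// True when one side has a bare king and the other has only its king
// plus at most two knights: mate cannot be forced in that case.
bool knights_cannot_force_mate(const std::string& placement) {
    const Material w = tally(placement, true);
    const Material b = tally(placement, false);
    const bool w_bare = w.knights == 0 && w.others == 0;
    const bool b_bare = b.knights == 0 && b.others == 0;
    const bool w_knights_only = w.others == 0 && w.knights <= 2;
    const bool b_knights_only = b.others == 0 && b.knights <= 2;
    return (w_bare && b_knights_only) || (b_bare && w_knights_only);
}
```

An evaluation could return a draw score when this check is true, so that trading the last pawn for a knight shows up as a drawing resource in the search rather than as a lost position.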

tissatussa commented 1 year ago

regarding our Issue i want to show this position from a 4KU game: rnbqkbnr/ppp2ppp/8/3p4/8/5N2/PPPP1PPP/RNBQKB1R w KQkq - 0 4

The newest (# 58) 4KU plays d2-d4 in this position ! I think this is an important improvement. See this screenshot (i forced the position by the opening book option in CuteChess) :

4ku_plays_d4

tissatussa commented 1 year ago

your newest versions (up to #82) have a rating of about 2500, i guess .. i test in CuteChess with 128 MB Hash and 10m+3s .. its openings are different, with White it mainly plays 1.e4 and sometimes Nf3 or Nc3 or even e3 ! I never encountered 1.d4 but when forcing this by a 1-move opening book, 4KU just transposes into its main setups with Nf3 and e3 .. then indeed Bc1 is stuck for the time, that's a well known downside, 4KU almost never plays the Bishop out-of-the-pawn-chain, as it's called, and then e3 .. it prefers the initiative and can do sacrifices for that.

Btw. do those coming TCEC games have opening themes ?