Open tissatussa opened 3 days ago
Amazing! Thank you :) I'm actually developing my own tournament runner and plan to automatically pull the latest version of each chess engine and constantly update its Elo, as well as record loads of useful information (average thinking time, nodes per second, depth searched, etc.). It will be up on osccel.com (Open Source Computer Chess Engine League), pronounced /oʊ sɛl/ (oh-sell), when it's done, hopefully over the weekend.
Maybe you'd like to use it for your own experiments if it works for you; I'd be able to add the features you like.
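For the "constantly updating its Elo" part, the standard incremental Elo update is probably the simplest starting point. A minimal sketch; the K-factor of 16 is my assumption, not anything osccel.com has decided:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 16.0):
    """Update both ratings after one game; score_a is 1.0, 0.5 or 0.0."""
    ea = expected_score(rating_a, rating_b)
    rating_a += k * (score_a - ea)          # A gains what it overperformed
    rating_b += k * ((1.0 - score_a) - (1.0 - ea))  # zero-sum counterpart
    return rating_a, rating_b
```

For example, two 2400-rated engines playing one decisive game would move to roughly 2408 and 2392 with K=16; a league would typically shrink K as an engine accumulates games.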
V1.1.0 is about 30 Elo stronger; the only major change was the increased network size, though I hope that with the next training round tonight it'll gain even more, since there is so much more data! ~1.5bn positions
> Amazing! Thank you :)
It's rare I spend that much 'effort' on one engine .. but in this case I also updated my engine archive a bit, using Sapling as a kind of reference, but as high as 132 I didn't expect - must be fun ..
> I'm actually developing my own tournament runner and plan to automatically pull the latest version of each chess engine and constantly update its Elo, as well as record loads of useful information (average thinking time, nodes per second, depth searched, etc.). It will be up on osccel.com (Open Source Computer Chess Engine League), pronounced /oʊ sɛl/ (oh-sell), when it's done, hopefully over the weekend.
I once created some terminal scripts in Python to display search data when solving STS bm & am puzzles. You're creating all code in dot-net / C#? Pity, I'm only on Linux, and have trouble compiling such M$ code ..
> Maybe you'd like to use it for your own experiments if it works for you; I'd be able to add the features you like.
Your assets give no problems here!
> V1.1.0 is about 30 Elo stronger; the only major change was the increased network size, though I hope that with the next training round tonight it'll gain even more, since there is so much more data! ~1.5bn positions
How do you create / select the training data? And what about those many STS bm / am puzzles? Are they valuable? How? Can we prove anything, or is it all statistics & self-play? I tend to find positions to prove something, like "find the move within X seconds".
> It's rare I spend that much 'effort' on one engine .. but in this case I also updated my engine archive a bit, using Sapling as a kind of reference, but as high as 132 I didn't expect - must be fun ..
Honestly, I've been blown away by your support; you've really energized me to make the engine the best it can be!
> I once created some terminal scripts in Python to display search data when solving STS bm & am puzzles. You're creating all code in dot-net / C#? Pity, I'm only on Linux, and have trouble compiling such M$ code ..
Yeah, though the new system will probably have a decent amount of Python too. If you're ever interested, I'd be happy to help you get dotnet compilation working on Linux! It should be fully supported, so it's probably just a build option you're missing.
> How do you create / select the training data?
There is a 'datagen' function you can run in Sapling which outputs a set of training data in bullet format. In a nutshell, it plays the first 9 moves randomly, then plays itself and records the evaluation and result of all quiet positions, searching to a fixed number of nodes. I usually generate around 1.5 billion positions to train a new network at the moment. I actually had to buy some new hardware for this, because it's a lengthy process, usually taking 6 servers around 2 days for a new net! Though it's getting good enough now that I can start to re-use data from the previous net. I then feed that data into a program called 'bullet trainer', which runs on the GPU to generate the weights. From what I can tell most people don't bother with this; they just use the 'Leela' data set, which is absolutely amazing - but it doesn't give your engine a unique play style.
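Sapling's actual datagen lives in its C# source; purely to illustrate the selection logic described above (random opening plies, fixed-node self-play, keep only quiet positions), here is a rough Python sketch. Every callable passed in (`search`, `is_quiet`, `legal_moves`, `push`, `is_terminal`) is a hypothetical stand-in for the real engine calls, and the backfilling of the final game result into each record is omitted:

```python
import random


def selfplay_records(search, is_quiet, legal_moves, push, is_terminal,
                     start, random_plies=9):
    """Sketch of a datagen loop: play `random_plies` random moves, then
    self-play with a fixed-node `search`, recording (position, eval) for
    every quiet position. `search(pos)` returns (best_move, eval_cp)."""
    pos = start
    for _ in range(random_plies):                 # randomized opening
        pos = push(pos, random.choice(legal_moves(pos)))
    records = []
    while not is_terminal(pos):
        move, eval_cp = search(pos)               # fixed node count
        if is_quiet(pos):                         # skip tactical positions
            records.append((pos, eval_cp))
        pos = push(pos, move)
    return records                                # result backfill omitted
```

In the real pipeline each record would also carry the game's W/D/L outcome, written once the game ends, before being packed into the bullet trainer's input format.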
> And what about those many STS bm / am puzzles? Are they valuable? How? Can we prove anything, or is it all statistics & self-play? I tend to find positions to prove something, like "find the move within X seconds".
Can you explain what you mean by this? What are 'STS bm / am puzzles'?
Nice info, thanks.
> Can you explain what you mean by this? What are 'STS bm / am puzzles'?
https://www.chessprogramming.org/Strategic_Test_Suite .. the 'WAC' file is famous; here's a newer version I once found: wac.zip. 'am' means Avoid Move and 'bm' means Best Move. Look at the EPD syntax.
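Those EPD records are simple to pull apart: the first four FEN fields, then semicolon-terminated opcodes like `bm Qg6;` or `am Qxb2;`. A minimal Python sketch of a parser (it does not handle every corner of the EPD spec, e.g. semicolons inside quoted strings):

```python
def parse_epd(line: str):
    """Parse one EPD line into (position, opcodes).

    `position` is the first four FEN fields joined; `opcodes` maps names
    like 'bm' (best move), 'am' (avoid move), 'id' to their values.
    """
    fields = line.strip().split(None, 4)          # 4 FEN fields + the rest
    position = " ".join(fields[:4])
    opcodes = {}
    if len(fields) > 4:
        for op in fields[4].split(";"):
            op = op.strip()
            if not op:
                continue
            name, _, value = op.partition(" ")
            opcodes[name] = value.strip().strip('"')
    return position, opcodes
```

A test harness would then compare the engine's chosen move against `bm` (or check it differs from `am`) within the time limit; note `bm` can list several acceptable moves, so splitting the value on whitespace before comparing is safer.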
> If you're ever interested, I'd be happy to help you get dotnet compilation working on Linux! It should be fully supported, so it's probably just a build option you're missing.
Yes, I would really like to be able to use dot-net / C# compilation here .. indeed I must be missing something simple .. it's about conflicting dotnet versions, not in the PATH, or both .. I don't know .. and 'msbuild' is executed but doesn't exist .. I wish M$ had never set foot on our base :-)
About STS: https://github.com/fsmosca/STS-Rating
> https://www.chessprogramming.org/Strategic_Test_Suite .. the 'WAC' file is famous; here's a newer version I once found: wac.zip. 'am' means Avoid Move and 'bm' means Best Move. Look at the EPD syntax.
That's awesome! I didn't know that existed, but I'll give the suite a go tonight. Anything else you know like that, please share - it's really useful.
> Yes, I would really like to be able to use dot-net / C# compilation here
Well, I don't know if you're on Discord, but I'd be happy to jump on a call with you to help get it working; it should take 10 mins max.
> Anything else you know like that, please share - it's really useful.
Well, I wrote about 'Patricia', see https://github.com/Adam-Kulju/Patricia , a new and very aggressive engine .. the author mentions Stefan Pohl's EAS tool at https://www.sp-cc.de/eas-ratinglist.htm , which was a great help for him .. I'm not familiar with this tool, but I guess you'll appreciate it.
About STS: there are many .epd files with such 'puzzles' .. not all are relevant any more, because many are from an older age when computers were much slower, so I guess you should select and judge them .. I gathered many of those files; I can ZIP some for you.
And I will mention the tool 'analyse-pgn', which I once found at https://github.com/mrdcvlsc/analyse-pgn .. see also an Issue there from me (I'm tissatussa) .. I remember this tool is OK, but I changed some code to suit my needs .. it's not for those .epd 'puzzles' but just to let an engine analyse a game, like the LiChess graphs (but analyse-pgn only gives textual output).
> Yes, I would really like to be able to use dot-net / C# compilation here

> Well, I don't know if you're on Discord, but I'd be happy to jump on a call with you to help get it working; it should take 10 mins max.
About my problems compiling dot-net code: I can use Visual Studio Code on Linux, but I don't understand that program .. I like to use terminal based scripts and commands to compile. I have several dot-net versions (I remember 6, 7 and 8), but they're scattered around my OS file tree, probably by wrong installations / PATHs etc. .. some C# packages having a .sln file require a specific dot-net version to compile, and I often get many (different) errors and warnings I can't solve. I'm not new to Linux - I dare and can do extensive hacks under-the-hood - but this chapter is a hard one for me. And I don't like M$, I prefer (your) assets. I appreciate your help, but let's freeze this subject for now. I know Discord, but I'm not "on it" regularly .. though I sometimes got some great help there.
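For the record, the usual fix for several scattered SDKs is to run `dotnet --list-sdks` to see what's actually installed, then pin a project to one SDK by dropping a `global.json` next to the `.sln`. The version below is only an example; use one that `--list-sdks` reports on your machine:

```json
{
  "sdk": {
    "version": "8.0.100",
    "rollForward": "latestFeature"
  }
}
```

With that in place, `dotnet build MySolution.sln` from the terminal should pick the pinned SDK instead of whichever one happens to win in the PATH, and you never need Visual Studio Code or `msbuild` directly.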
Another one (you got me triggered) is my rather old (2021) Issue regarding the Bagatur engine, 'develop NN', see https://github.com/bagaturchess/Bagatur/issues/16 .. it's a long one, with much info; not all may be relevant to you, but it's a nice read with lots of info & questions & ideas .. then and now, Bagatur is a special creation, written in Java, but I no longer bother to compile & use it, because it consists of a bunch of files and settings to run it - the worst config I've seen so far ..
That 4N-position is interesting; read my description and comments in that Issue .. I just took that FEN and did a quick test with Sapling v1.1.1: it has no problem winning with Black, even against the strongest engines, like SF 17.
And I just saw https://github.com/KierenP/Halogen/pull/573/commits/a224b9e9ced3f48a2bea96d7a772528a8673f04a .. it's just one of the many updates of the Halogen engine (NN, rating 3460) .. it uses 'fastchess', which seems to be a well-known and respected tool for engine development .. here an .epd file is processed .. I didn't work with fastchess though .. if you didn't know this tool exists, it might be another help.
Fascinated ..
While testing the recent Sapling versions (up to v1.0.5, see also my #4 and #6), I estimate its current rating at 3100. At first I picked opponent engines rated from 2500 and up, but it became clear Sapling is much stronger: it won almost all those games. So here are 132 5m+3s games (13 with Black and 119 with White) against engines rated 2900 up to 3400: here Sapling will show its limits ..
download PGN : sapling-132-games-5m+3s.zip
(when 'ECO' is missing : the game had a custom starting position)
Most engines are 3000+ -- I also let a few really weak ones play; they obscure this list a bit ..
Ratings aren't shown in this table, although that info would make the list more relevant. I use the CuteChess GUI for tournaments (and normally just single games), and this program lacks the feature to give engines a rating - at least, this is unclear to me. When doing a tournament in CuteChess (GUI), a Result List is shown which has a column called 'Elo' .. how is this value calculated? Also see my recent Issue https://github.com/cutechess/cutechess/issues/824 on their GitHub page - at this moment there's no reaction. Am I missing something?
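I don't know CuteChess's internals, but tournament tools typically derive such an 'Elo' column as a rating difference implied by the score percentage, by inverting the logistic Elo expectancy curve. A sketch of that calculation:

```python
import math


def elo_diff(score_fraction: float) -> float:
    """Rating-difference estimate implied by a score fraction (0 < s < 1),
    inverted from the logistic Elo expectancy: s = 1 / (1 + 10^(-d/400))."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)
```

So a 50% score maps to +0 Elo and a 75% score to roughly +191 Elo versus the opposition; whether CuteChess applies exactly this formula (or adds error bars) is something only its source can confirm.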
Anyhow, here's the result of a 5m+3s Gauntlet tournament: Sapling v1.0.5 playing White against 30 (mostly) equal and stronger opponents, from the start position. I added their ratings to the Result List. (Sapling won where Points is 0.0)
download PGN : Sapling-v1.0.5-Gauntlet-30x-5m3s.zip
I remember some calculation exists to determine the Gauntlet engine's rating from such a result list when all opponent ratings are known? I guess v1.0.5 is 3100+.
What about v1.1.0? Is it much stronger? Does it play differently?