Timmoth / Sapling

A strong dotnet UCI Chess engine - My leaf nodes are growing
https://iblunder.com
Apache License 2.0

improper UCI option Hash #4

Closed by tissatussa 1 month ago

tissatussa commented 1 month ago

you didn't properly code the UCI option 'Hash' :

$ ./Sapling_linux_x64-v1.0.2-asset 
uci
id name Sapling 1-0-2
id author Tim Jones
option name Threads type spin default 1 min 1 max 1024
option name Ponder type check default false
option name Hash type spin default 287
uciok

it should have a min and max (as well as the default). The current version cannot be used with CuteChess (GUI) : the Hash value can't be set (it skips to zero when adjusting it).
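A minimal sketch of how a GUI-side check for this could look, assuming the standard UCI option syntax (`option name ... type spin default ... min ... max ...`). The function names `parse_uci_option` and `missing_spin_bounds` are hypothetical and are not taken from CuteChess or Sapling:

```python
def parse_uci_option(line: str) -> dict:
    """Split a UCI 'option' line into its fields.

    Sketch only: combo options with repeated 'var' entries are not handled.
    """
    tokens = line.split()
    if tokens[:2] != ["option", "name"]:
        raise ValueError(f"not a UCI option line: {line!r}")
    keywords = {"type", "default", "min", "max"}
    fields = {}
    key, values = "name", []
    for tok in tokens[2:]:
        if tok in keywords:
            # A keyword closes the previous field and opens a new one.
            fields[key] = " ".join(values)
            key, values = tok, []
        else:
            values.append(tok)
    fields[key] = " ".join(values)
    return fields


def missing_spin_bounds(line: str) -> list:
    """Return which of default/min/max are absent from a spin option."""
    opt = parse_uci_option(line)
    if opt.get("type") != "spin":
        return []
    return [k for k in ("default", "min", "max") if k not in opt]


print(missing_spin_bounds("option name Hash type spin default 287"))
print(missing_spin_bounds("option name Threads type spin default 1 min 1 max 1024"))
```

Run against the output above, the Hash line is reported as lacking `min` and `max`, while the Threads line passes.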

Timmoth commented 1 month ago

Thank you for alerting me! I will investigate and get an updated version out asap

tissatussa commented 1 month ago

> ..get an updated version out..

yes, it must be easy to fix, i could have done it .. i guess just adding those min and max will make CuteChess happy, that's all.

Timmoth commented 1 month ago

@tissatussa I've just pushed a new version, which includes a fix for the Hash UCI option. It seems to work for me; would you mind testing it again to see if it's fixed for you, please?

https://github.com/Timmoth/Sapling/releases/tag/Sapling-1.0.3

tissatussa commented 1 month ago

@Timmoth yes it works now ! Today i let Sapling v1.0.2 play many 5m+3s games against several others, using CuteChess GUI .. i estimate its rating 2600+

Timmoth commented 1 month ago

Amazing! Cheers for that :) Very interesting, hopefully I'll be able to push those numbers up in the coming weeks (v1.0.3 should be around 20 Elo stronger!). Do you host the results of your tournaments anywhere online where I can see them?

Also do you know if it also fixed https://github.com/Timmoth/Sapling/issues/5 for you?

tissatussa commented 1 month ago

> ..Do you host the results of your tournaments anywhere online I can see?

today i was somehow impressed by your engine and i did a gauntlet-by-hand : just one-by-one 5m+3s games against engines rated 2200 - 2750, climbing. Sapling won most of them, until Fruit v2.3 and others. So i guess 2700+ is more accurate.

It has a weird style, really, and i like it .. hard to describe : very often it does not do what i was expecting, time after time .. it constantly wants to exchange one advantage for another, not taking the bite but threatening more .. and it reaches rather high depth - congrats!

Here are 31 games : sapling-v1.0.2&3-against-others-5m3s.zip

The last 11 start from the FEN rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPPBPPP/RNBQK1NR b KQkq - 1 2 :

[image: sicilian-Be2]

recently i saw this opening : 2. Be2 ?! against the Sicilian Defence .. i remember f4 should be played, and d3, but i'm not sure about the role of Nf3 here .. Sapling plays 3.c3!? after 2...d6 but 3.Nf3 when 2...Nc6 happens. The 3.c3 games are nice but weird to me : also Be3-f3 is done and leads to unknown complexities.

> Also do you know if it also fixed #5 for you?

yes !

btw. a pity for me : Sapling is dotnet and i have trouble compiling that, being on Linux .. so i use your asset

tissatussa commented 1 month ago

about the Hash default : why 287 ? Normal numbers are 256 and 128 and 512 etc. ?

Timmoth commented 1 month ago

That's amazing @tissatussa thank you for taking an interest in Sapling I really appreciate it!

WRT unique playing style, I can think of three contributing factors:

  1. The evaluation network was trained on self-play data generated starting from random weights, rather than on someone else's data (such as Leela's), which many engines use.
  2. The engine was developed in dotnet, which requires quite different optimisation strategies that influence the overall architecture, compared to engines commonly developed in C++ / Rust.
  3. The engine has quite an aggressive set of pruning strategies. I found that more aggressive strategies gave better results early in development, and that has kind of stuck.

The hash size defaults to 287 because internally the transposition table must have a size (number of elements) that is a power of two. This is due to the way the hash function works (the table index is obtained by masking a fixed number of bits of a position's Zobrist hash). This number must then be multiplied by the number of bytes needed to store a transposition entry, which gives a slightly unnatural default size. When you specify a precise transposition size such as 256 MB, the actual memory used will be close to, but not exactly, 256 MB. If that was confusing and you'd like me to explain more, I'd be happy to!
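As a rough illustration of that sizing rule, here is a minimal sketch. It assumes the table rounds down to the largest power-of-two entry count that fits the requested size; the entry sizes of 16 and 18 bytes and the function names are hypothetical, not Sapling's actual values or code:

```python
def tt_layout(requested_mb: int, entry_bytes: int) -> tuple[int, int]:
    """Return (entry_count, bytes_used) for a power-of-two sized table.

    Assumption: round DOWN to the largest power of two that fits the budget.
    """
    budget = requested_mb * 1024 * 1024
    max_entries = budget // entry_bytes
    if max_entries < 1:
        raise ValueError("requested size is smaller than one entry")
    # Largest power of two that is <= max_entries.
    entries = 1 << (max_entries.bit_length() - 1)
    return entries, entries * entry_bytes


def tt_index(zobrist_hash: int, entries: int) -> int:
    """Mask the low bits of the hash; only valid when entries is a power of two."""
    return zobrist_hash & (entries - 1)


for entry_bytes in (16, 18):  # hypothetical entry sizes
    entries, used = tt_layout(256, entry_bytes)
    print(entry_bytes, entries, used / (1024 * 1024))
```

With a 16-byte entry the request for 256 MB is met exactly, while with an 18-byte entry the same request ends up using 144 MB, which is why the memory actually used rarely matches a "round" requested number.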

tissatussa commented 1 month ago

i've never encountered 287 as a default .. and i think the user shouldn't have to bother with this .. we can set any number, but maybe you should mention how to set it (optimally), or offer some pre-defined amounts in a selectbox like 287, 574 (?) etc. Just my idea ..

tissatussa commented 1 month ago

> ..and that's kind of stuck.

that part i don't understand .. what do you mean ?

tissatussa commented 1 month ago

i like the aggressive approach .. do you know the Patricia engine ? It's "the most aggressive bunny" or so : https://github.com/Adam-Kulju/Patricia .. your engine is "only" v1.0.x but it gives no errors and plays strong, i guess training the network (that way) is most of the work ? Interesting !

Timmoth commented 1 month ago

I'll happily conform to the norm with this stuff, so I will do the math internally and set the default to 256. I may be wrong about this, but I don't think many engines are able to respect the requested hash size exactly. I think the memory used will always be rounded down or up by some amount (depending on whether the desired TT size is exactly divisible by their TT element size).

What I meant by that is that I had more success early in development by using aggressive pruning techniques, as in it was frequently able to beat itself when it was more eager to prune the search tree. This may not equate to a more aggressive play style, but it does mean that it will become fixated on a certain chain of moves and search it more deeply, as opposed to searching a broader set of moves at a shallower depth.
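A back-of-the-envelope sketch of that trade-off, assuming the usual simplification that a search visiting `branching` moves per position to depth `d` costs about `branching ** d` nodes. The node budget and branching factors below are made-up illustration values, not measurements from Sapling:

```python
import math


def reachable_depth(node_budget: float, branching: float) -> float:
    """Depth d such that branching ** d == node_budget (simplified model)."""
    return math.log(node_budget) / math.log(branching)


budget = 10_000_000  # hypothetical number of nodes searched per move
for branching in (3.0, 2.0):  # hypothetical effective branching factors
    print(branching, round(reachable_depth(budget, branching), 1))
```

Under this model, pruning harder (a lower effective branching factor) buys extra depth from the same node budget, at the cost of looking at fewer alternatives in each position.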

Patricia seems super interesting, i may reach out to the dev and ask a few questions, thanks for the link!

P.S. training the network is 0.0000001% of the work, I just leave it playing against itself for hours. It mostly just requires a lot of compute power & patience! But from what I've heard, using your own dataset to train the network is a large contributing factor to your engine's behaviour / personality. If you train using someone else's dataset, you can expect your engine to evaluate positions the same way, producing a very similar play style, the only difference then being how deep you can search.

tissatussa commented 1 month ago

Wow, that's nice info !

Are any "HCE rules" still involved ? Or is the eval only determined by the NN ? If a bit HCE, which rules ?

tissatussa commented 1 month ago

and what about my position ? Could it ever be an idea to train "one section of the NN" to play against the Sicilian Defence this way ? Same for (rather many) other 2-move setups from the opening. Is that realistic ?

tissatussa commented 1 month ago

another engine doing 'aggressive' moves : https://github.com/amchess/Alexander/releases/tag/2.0 .. it says :

> Removed opening variety and replaced with variety which (..) - psychological more risky - inviting the engine to a game with psychological/human sacrifices : strictly speaking incorrect, but almost impossible to refute.

Timmoth commented 1 month ago

> Wow, that's nice info !
>
> Are any "HCE rules" still involved ? Or is the eval only determined by the NN ? If a bit HCE, which rules ?

Unfortunately not, I got rid of the HCE before moving to NNUE. There is something special about HCE, though. Lynx is a particularly strong dotnet engine that I know of that uses HCE.

Timmoth commented 1 month ago

> and what about my position ? Could it ever be an idea to train "one section of the NN" to play against the Sicilian Defence this way ? Same for (rather many) other 2-move setups from the opening. Is that realistic ?

Very interesting train of thought. The current architecture I'm using defines 'output buckets', about which I couldn't find much information online. Essentially, it alters the weights associated with certain features (features being synonymous with the rules you'd define in an HCE) depending on certain conditions. In my case the condition is the material value on the board, so the evaluation will change dynamically from the opening to the endgame.

There is also a concept of input buckets, which I've yet to implement, that further customises the network to the current position. I'm not aware of any that can identify and adapt to the specific opening, though. May be worth some pondering!
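To make the output-bucket idea more concrete, here is a minimal sketch in which the material on the board selects which set of output weights is applied to the hidden layer. The piece values, the bucket count of 8, the bucket formula, and all names are hypothetical, chosen only for illustration, and are not Sapling's actual scheme:

```python
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}  # hypothetical values
NUM_BUCKETS = 8    # hypothetical bucket count
MAX_MATERIAL = 78  # both sides at full strength with the values above


def material_value(fen_board: str) -> int:
    """Sum piece values over the piece-placement field of a FEN string."""
    # Kings, digits, and '/' separators contribute 0.
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in fen_board)


def output_bucket(fen_board: str) -> int:
    """Map material to a bucket: 0 = near-empty board, NUM_BUCKETS - 1 = opening."""
    material = min(material_value(fen_board), MAX_MATERIAL)  # cap after promotions
    return material * NUM_BUCKETS // (MAX_MATERIAL + 1)


def evaluate(hidden, bucket_weights, bucket_biases, fen_board: str) -> float:
    """Apply only the output weights belonging to the selected bucket."""
    b = output_bucket(fen_board)
    return sum(h * w for h, w in zip(hidden, bucket_weights[b])) + bucket_biases[b]


print(output_bucket("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"))
print(output_bucket("4k3/8/8/8/8/8/8/4K3"))
```

The same hidden-layer activations are thus scored with different weights in the opening and in the endgame, which is how the evaluation can change as material comes off the board.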

Timmoth commented 1 month ago

> another engine doing 'aggressive' moves : https://github.com/amchess/Alexander/releases/tag/2.0 .. it says :
>
> Removed opening variety and replaced with variety which (..) - psychological more risky - inviting the engine to a game with psychological/human sacrifices : strictly speaking incorrect, but almost impossible to refute.

Very interesting! Keep sending any engines you think are interesting, I'm very eager to take a look!

tissatussa commented 1 month ago

Eg. here are 7 "start positions" when the engine plays White and (always?) opens with 1.e4 : the opponent can play many good moves, i picked 7 familiar ones and imagined a reply : from those positions a dedicated NN should be trained !? When Black plays another move (like 1...Nf6) the NN logic should play just as well, probably indirectly entering known grounds.

Caro-Kann : e4 c6 : Bc4

[image: caro-kann]

French : e4 e6 : c4

[image: french]

Pirc : e4 d6 : a3

[image: pirc]

Nimzo : e4 Nc6 : Nf3

[image: nimzo]

Modern : e4 g6 : h4

[image: modern]

Sicilian : e4 c5 : Be2

[image: sicilian]

Equal : e4 e5 : Nc3

[image: equal]

This could be a real setup, but for now it's an example to illustrate my idea. Probably it's not bad i know little about NN's and their training .. easily makes me think out-of-the-box ..

Timmoth commented 1 month ago

> Eg. here are 7 "start positions" when the engine plays White and (always?) opens with 1.e4 : the opponent can play many good moves, i picked 7 familiar ones and imagined a reply : from those positions a dedicated NN should be trained !? When Black plays another move (like 1...Nf6) the NN logic should play just as well, probably indirectly entering known grounds.
>
> [...]

The idea of a custom NN for different openings is certainly an interesting concept to explore. My current understanding is that it shouldn't make much difference, if any, and that it's actually the role of an opening book to get the board into a state with enough entropy for the engine to take hold. Since the engine only uses the NN to produce a static evaluation on leaf nodes (hopefully at a depth of >25), it should be way out of the opening by the point the NN is used.

Thinking conceptually it could be possible to alter the pruning depending on the opening though I'm not actually sure how that would work!

Timmoth commented 1 month ago

P.S. I just released a new version with some rather large refactoring of the main data structures and the memory management. It should make at least a +30 Elo difference!

tissatussa commented 1 month ago

> I just released a new version..

yes, and i'm still letting Sapling v1.0.x play against increasingly stronger engines, now it seems Sapling can even beat 3000+ most of the time .. so must adjust my rating estimation once more :-)

Timmoth commented 1 month ago

Please keep me posted on your findings!

tissatussa commented 1 month ago

> Thinking conceptually it could be possible to alter the pruning depending on the opening though I'm not actually sure how that would work!

well, try it .. Sapling would be the first engine with such concept !?

as soon as i have completed my gauntlet-by-hand (mostly 5m+3s games) i will post those games here.

tissatussa commented 1 month ago

100 games, most of them 5m+3s from the starting position (except when no ECO is given) : sapling-v1.0.x-games.zip

it seems Sapling v1.0.x can even beat 3000+ ! (but not always)

[image: Sapling-v1 0 x-games]

tissatussa commented 1 month ago

Also Sapling v1.0.1 recently joined a tournament, see https://chessengines.blogspot.com/2024/09/hypnos-220824bis-and-private19sf0xe.html :

[image: 2024 09 17 NewEnginesTest2]

Timmoth commented 1 month ago

@tissatussa That's great! I'm about to release a new version with an increased net size (hidden layer from 768 to 1024 neurons), either tonight or tomorrow. Given v1.0.4 is considerably stronger than v1.0.1, I'd be interested to see how 1.0.5 compares!

Hopefully these new releases will stabilise in the next week, but since I'm getting great feedback from yourself and the community, there is still lots to tweak!

tissatussa commented 1 month ago

> ..Given v1.0.4 is considerably stronger than v1.0.1 i'd be interested to see how 1.0.5 compares!

when using an increased net size, i would call this version v1.1.x if it's really stronger, i will set up a gauntlet with some 3000+ engines. And i will probably start a new Issue for it (i closed this one).