lithander / Leorik

Leorik is a strong, open-source UCI chess engine written in C#
MIT License

Set engine strength #13

Open MasJonathan opened 8 months ago

MasJonathan commented 8 months ago

Is there a way to set the strength of the engine so that it plays at a lower Elo?

lithander commented 8 months ago

Version 3 has a UCI parameter called "Temperature". If you set it to X, the engine applies a random bonus anywhere between 0 and X to each root move. This makes it more likely to play a move that is up to X worse than the best move, while still preventing completely terrible moves like blundering a queen.
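To illustrate the idea, here is a minimal Python sketch of root-move selection with a random bonus. It is not Leorik's actual code (Leorik is written in C#): the function name `pick_root_move`, the dictionary of scores and the choice of a uniform distribution for the bonus are all assumptions made for illustration.

```python
import random

def pick_root_move(scores, temperature, rng=random):
    """Return the move whose score plus a random bonus is highest.

    scores: dict mapping a move to its evaluation (hypothetical input format).
    temperature: upper bound X of the bonus, in the same unit as the scores.
    Assumption: the bonus is drawn uniformly from [0, temperature].
    """
    return max(scores, key=lambda move: scores[move] + rng.uniform(0, temperature))

# Hypothetical root scores: two close candidates and one blunder.
scores = {"e4": 30, "d4": 25, "Qxh7": -900}
print(pick_root_move(scores, temperature=50))
```

With a bonus of at most 50, either of the two close moves can be chosen, but the move rated 900 points lower can never catch up.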

It's a new feature so if you try it let me know if it works for you! :)

MasJonathan commented 8 months ago

Hello, thank you for your answer, but I'm not sure I fully understand. If Temperature=X and X is in centipawns, it seems that Temperature=1000 would completely destroy the engine's evaluation. Can you elaborate on the unit of Temperature and how you estimate how much it weakens the engine, please?

lithander commented 8 months ago

My above Elo estimates are based on self-play matches against the Temperature=0 version.

If all moves get a random bonus in the range [0..1000] then it is very unlikely that the worst move gets a bonus of 1000 while the best move gets a bonus of zero. They'll be much closer together in practice.
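A quick way to put numbers on this is to look at just two moves. Assuming both bonuses are independent and uniform on [0, X] (the thread only says "random", so uniformity is an assumption), the difference of the two bonuses follows a triangular distribution, and the worse move overtakes the better one with probability (1 - gap/X)^2 / 2. The helper names below are hypothetical.

```python
import random

def overtake_probability(gap, temperature):
    """Chance that the worse of two moves ends up ahead after the bonuses.

    gap: non-negative eval difference between the better and the worse move.
    Assumes independent bonuses, uniform on [0, temperature].
    """
    if temperature <= 0 or gap >= temperature:
        return 0.0
    return (1 - gap / temperature) ** 2 / 2

def simulate(gap, temperature, trials=200_000, seed=1):
    """Monte Carlo check of the closed form above."""
    rng = random.Random(seed)
    hits = sum(
        rng.uniform(0, temperature) - rng.uniform(0, temperature) > gap
        for _ in range(trials)
    )
    return hits / trials

for gap in (0, 100, 500, 900):
    print(gap, overtake_probability(gap, 1000), simulate(gap, 1000))
```

Under these assumptions, a move that is 500 worse wins the comparison one time in eight at Temperature=1000, and the chance drops off quadratically as the gap approaches 1000. With more than two root moves the numbers change, since the worse move then has to beat every other candidate at once.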

Also, we assume "centipawns" as the evaluation unit, but it's hard to calibrate an engine's eval for that because the value of a pawn is not very well defined. Compared with other engines, Leorik is a bit of a drama queen, meaning that it will rate an unbalanced position with higher absolute values than most. So you could say Leorik's evaluation is not in centipawns but in half-centipawns (an exaggeration!), and then Temperature=1000 is again less destructive than what is intuitive.

In the end, I can only tell you what the Temperature parameter does and roughly how it affects the play, but I didn't "calibrate" it in a way that lets you set Elo levels directly. You can weaken the engine using the Temperature parameter, but you can't say at what level the weakened engine plays without actually measuring it against other engines.

MasJonathan commented 8 months ago

Yes, I see. So to change the order of two moves, the random bonus of the weaker move has to be greater than the difference in evals plus the random bonus of the stronger move.

So let's say we have only 3 possible moves rated 10, 50 and 100. With Temperature=40, the best the weakest move can do is 10+40 = 50, which only ties the second move at 50+0, and the move rated 100 stays ahead whatever bonuses are drawn. In this example, the last move can't become first. And even with Temperature=100, each move is offset by 50 on average, which keeps the order most of the time, right? It seems pretty good! It's just hard to figure out precisely how much it weakens the engine.
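The three-move example can be checked with a short simulation. This is only a sketch: `pick_frequencies` is a hypothetical helper, the bonus is assumed to be uniform on [0, Temperature], and the scores are assumed to be distinct.

```python
import random
from collections import Counter

def pick_frequencies(scores, temperature, trials=100_000, seed=42):
    """Estimate how often each score wins after adding a random bonus.

    Assumes distinct scores and a bonus uniform on [0, temperature].
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        noisy = [s + rng.uniform(0, temperature) for s in scores]
        counts[scores[noisy.index(max(noisy))]] += 1
    return {s: counts[s] / trials for s in scores}

for temperature in (40, 100):
    print(temperature, pick_frequencies([10, 50, 100], temperature))
```

With Temperature=40 the move rated 100 is picked every time, because 50+40 is still below 100. With Temperature=100 it is still picked in the large majority of trials, and the move rated 10 wins only rarely.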

Do you think a deep-learning approach trained on games and the Elo of the players would be an efficient way to build a tool that can estimate the Elo of a game, a player or an engine?