py-stockfish / stockfish

Integrates the Stockfish chess engine with Python (Official fork)
https://py-stockfish.github.io/stockfish/
MIT License
29 stars 8 forks source link

get_top_moves() not giving weaker moves with lower elo #35

Closed hwuebben closed 1 year ago

hwuebben commented 1 year ago

Greetings,

I'm trying to limit the strength of the engine by setting a target elo like so: engine = stockfish.Stockfish(parameters={'UCI_Elo': elo, "UCI_LimitStrength": True})

but the engine still seems to play perfectly even if I set elo to 100 (or 1350).

hwuebben commented 1 year ago

Is this a version problem? Is there any combination of py-stockfish and stockfish version where this parameter works?

johndoknjas commented 1 year ago

Hi @hwuebben, in the stockfish library (currently what's on PyPI), boolean parameters are represented as strings. So if you do this, it should work:

engine = stockfish.Stockfish(parameters={'UCI_Elo': elo, "UCI_LimitStrength": "true"})

This is rather unintuitive, and recently here in the py-stockfish repo, we updated these parameters to be actual booleans. So if you were to use the models.py file from our master branch and import it directly in your project, the code you gave should work fine. Sooner or later we should be given access to push this py-stockfish repo to PyPI, so eventually this will be fixed for the library on that end too.

hwuebben commented 1 year ago

Hello @johndoknjas, thanks for the reply. I tried giving "UCI_LimitStrength" as string but it didn't change anything. I also tried engine.update_engine_parameters({"UCI_Elo": elo,"UCI_LimitStrength": 'true'}) and engine.set_elo_rating(elo)

best regards

kieferro commented 1 year ago

@hwuebben Could you share your entire code which calls the engine and also includes the evaluation part (like get_eval or get_top_moves)? This would make it easier to reproduce the problem.

hwuebben commented 1 year ago

@kieferro The entire code is a bit difficult but here is the short version:

engine = stockfish.Stockfish(r'stockfish-windows-2022-x86-64-avx2.exe', parameters={'UCI_Elo': 300, "UCI_LimitStrength":'true', 'Debug Log File':'fishie.log'})

...

engine.set_fen_position(fen)
engine.get_evaluation()

...

engine.set_fen_position(fen)
moves = engine.get_top_moves(3)

...

I did also log engine.get_parameters() and correctly stayed like this throughout the game:

{'Debug Log File': 'fishie.log', 'Contempt': 0, 'Min Split Depth': 0, 'Ponder': 'false', 'MultiPV': 1, 'Skill Level': 20, 'Move Overhead': 10, 'Minimum Thinking Time': 20, 'Slow Mover': 100, 'UCI_Chess960': 'false', 'UCI_LimitStrength': 'true', 'UCI_Elo':300, 'Threads': 1, 'Hash': 16}

I also tried different SF Versions

I attach the stockfish log file. fishie.log

Edit: I can see in the log that 1350 is the minimum elo but it doesnt play like that either. I compared several games with lichess stockfish and they match very closely.

johndoknjas commented 1 year ago

Hi @hwuebben, could you try giving it this fen: 1r1qrbk1/2pb1pp1/p4n1p/P3P3/3P4/NB4BP/6P1/R2QR1K1 b - - 0 1. This is the one I'm using to test if a weaker setting is being used.

hwuebben commented 1 year ago

@johndoknjas

import stockfish

engine1 = stockfish.Stockfish(
    r'stockfish_14_win_x64\stockfish_14_x64.exe',
    parameters={'UCI_Elo': 1350, "UCI_LimitStrength": "true", 'Debug Log File': 'fishie1.log'})
engine2 = stockfish.Stockfish(
    r'stockfish_14_win_x64\stockfish_14_x64.exe',
    parameters={'Debug Log File': 'fishie2.log'})

engine1.set_fen_position('1r1qrbk1/2pb1pp1/p4n1p/P3P3/3P4/NB4BP/6P1/R2QR1K1 b - - 0 1')
engine2.set_fen_position('1r1qrbk1/2pb1pp1/p4n1p/P3P3/3P4/NB4BP/6P1/R2QR1K1 b - - 0 1')
print(engine1.get_evaluation())
print(engine2.get_evaluation())
print(engine1.get_top_moves(3))
print(engine2.get_top_moves(3))

{'type': 'cp', 'value': -4} {'type': 'cp', 'value': 0} [{'Move': 'd7c6', 'Centipawn': 12, 'Mate': None}, {'Move': 'd7f5', 'Centipawn': 44, 'Mate': None}, {'Move': 'f6h5', 'Centipawn': 125, 'Mate': None}] [{'Move': 'd7c6', 'Centipawn': -28, 'Mate': None}, {'Move': 'd7f5', 'Centipawn': 0, 'Mate': None}, {'Move': 'f8b4', 'Centipawn': 86, 'Mate': None}]

different evaluations but top 2 moves are the same

hwuebben commented 1 year ago

Here the same prints with Stockfish 15.1 (above was 14):

{'type': 'cp', 'value': 9} {'type': 'cp', 'value': 0} [{'Move': 'd7c6', 'Centipawn': 0, 'Mate': None}, {'Move': 'd7f5', 'Centipawn': 34, 'Mate': None}, {'Move': 'f6h7', 'Centipawn': 65, 'Mate': None}] [{'Move': 'd7c6', 'Centipawn': 0, 'Mate': None}, {'Move': 'd7f5', 'Centipawn': 31, 'Mate': None}, {'Move': 'd7c8', 'Centipawn': 98, 'Mate': None}]

here even the evaluations are almost equal

johndoknjas commented 1 year ago

@hwuebben Yeah looks like the full strength version is being used in both cases, since d7c6 is the strongest move. This is curious, not sure why it is.

hwuebben commented 1 year ago

@johndoknjas can you reproduce the problem or is it just me?

hwuebben commented 1 year ago

@johndoknjas @kieferro okay so I just check this:

print(engine1.get_evaluation())
print(engine1.get_best_move())
print(engine1.get_top_moves(3))

print(engine2.get_evaluation())
print(engine2.get_best_move())
print(engine2.get_top_moves(3))

and output was:

{'type': 'cp', 'value': 9} d7f5 [{'Move': 'd7c6', 'Centipawn': -9, 'Mate': None}, {'Move': 'd7f5', 'Centipawn': 33, 'Mate': None}, {'Move': 'f6h7', 'Centipawn': 57, 'Mate': None}] {'type': 'cp', 'value': 0} d7c6 [{'Move': 'd7c6', 'Centipawn': -1, 'Mate': None}, {'Move': 'd7f5', 'Centipawn': 32, 'Mate': None}, {'Move': 'd7c8', 'Centipawn': 77, 'Mate': None}]

seems like limit strength isn't working with get_top_moves.

johndoknjas commented 1 year ago

@hwuebben Nope, for some reason I can't reproduce the problem on my end.

cgreening commented 1 year ago

I think I'm seeing the same thing. I'm using:

    stockfish.set_elo_rating(elo)

I don't think it's having any effect.

johndoknjas commented 1 year ago

@hwuebben @kieferro Ah ok, I was testing get_best_move(), not get_top_moves(). get_top_moves indeed gives me an issue, even with the current models.py of the py-stockfish repo here. This issue is because of how we parse the stockfish output:

image

This is the output from the command line Stockfish, when setting elo to 1350 and calling get_top_moves(5). The five lines starting with "info depth" represent each of the top 5 moves Stockfish is considering (at its standard strength). However, then at the last line for picking a bestmove, stockfish randomly decides whether to pick a weak move, due to its elo being 1350 (how random depends on how low the elo I guess). In this screenshot's particular run, the f8a3 move it picked wasn't even in the top 5.

The way our get_top_moves function works is by reading the top n moves, starting with "info depth". However, what should be done when the elo is lowered (and UCI_LimitStrength = True) is reading the "bestmove" line given by Stockfish. This can be fixed, but the problem is that get_top_moves doesn't really work when wanting to find multiple moves at a low elo. Stockfish only outputs one "bestmove" line, so we don't know its 2nd, 3rd, etc "bestmoves" at the low elo.

So I guess we'd have to decide how to deal with calls to get_top_moves in this scenario. Should we only allow the num_top_moves param to be 1 when the elo is lowered?

cgreening commented 1 year ago

Yep, I had a chat with the stockfish guys on Discord and came to the same conclusion.

I think maybe it's just a case of documenting the behaviour? So get_top_moves is the top moves irrespective of skill or elo and get_best_move is the best move for the current skill level.

It would be interesting to understand how the best move is selected based on elo - but that would involve digging into the stockfish code...

johndoknjas commented 1 year ago

@cgreening Yeah I think that would make sense to do.

hwuebben commented 1 year ago

@cgreening @johndoknjas at the very least a Warning should be thrown when attempting to call get_top_moves with UCI_LimitStrength set.

johndoknjas commented 1 year ago

@hwuebben I considered doing that, but an issue might be if the user wants to know what full strength SF's top moves are, and then what weaker SF chooses for its "bestmove". In this case they should be allowed to call the function.

In #38 I've updated the readme though, to make it clear that get_top_moves only returns full strength moves.

hwuebben commented 1 year ago

@johndoknjas sure but a warning is just a warning. You can suppress it if you are aware of the issue or you can just ignore it

johndoknjas commented 1 year ago

@hwuebben Fair enough, I'll add this to the PR for get_top_moves, and also get_evaluation (similar story there where SF doesn't output weaker setting evaluations).