SebLague / Chess-Challenge

Create your own tiny chess bot!
https://www.youtube.com/watch?v=Ne40a5LkK6A

Confused about MakeMove and UndoMove #234

Closed: Markusgp closed this 1 year ago

Markusgp commented 1 year ago

Hello!

I've implemented Minimax with Alpha-Beta pruning to determine the best moves; however, my bot sometimes does very silly things, and I am a bit puzzled about board.MakeMove().

If I call my recursive Minimax function after a call to board.MakeMove(m), will it actually receive the updated board state as input, or will it still see the board as it was when it was passed in as a parameter?

So for example this code which is inside the Minimax function:

```
board.MakeMove(m);
(float eval, Move _) = Minimax(board, depth - 1, false, alpha, beta);

// Checking if the evaluation of this move is better than the current maxEval
if (eval > maxEval)
{
    maxEval = eval;
    bestMove = m;
}
board.UndoMove(m);
```

Will my Minimax call at line 2 of the code snippet see the new board state after board.MakeMove(m)? I am using this approach to evaluate all the possible child positions from a given board state, and if the board state isn't properly updated, this will of course fail completely.

Thanks a lot! Fun challenge so far :)

TheSecurityDev commented 1 year ago

MakeMove should update the board object, so yes it should work.
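
For what it's worth, you can verify the mutation yourself. A minimal check, assuming your version of the API exposes GetFenString() and that the position has at least one legal move:

```
// Confirm that MakeMove mutates the board in place and UndoMove restores it.
string before = board.GetFenString();
Move m = board.GetLegalMoves()[0];

board.MakeMove(m);
Console.WriteLine(before != board.GetFenString()); // True: same object, new state

board.UndoMove(m);
Console.WriteLine(before == board.GetFenString()); // True: state fully restored
```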

penguinsrepic commented 1 year ago

Make sure that it is actually your move in the position you're testing: once you call MakeMove, it becomes the opponent's turn, and you have to search their reply before checking your own further moves. If you're already doing that, then I'm not sure what the issue could be.
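
One way to see the turn flip directly (a sketch, assuming the API's IsWhiteToMove property):

```
bool sideBefore = board.IsWhiteToMove;

board.MakeMove(m);
// It is now the opponent's turn, so the recursive call made here must
// minimize for them (or negate the score, in negamax form).
bool sideAfter = board.IsWhiteToMove; // always !sideBefore

board.UndoMove(m);
```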

mcthouacbb commented 1 year ago

What "silly things" does your bot do? Is it hanging pieces/tactics, or does it just make a random-looking move that doesn't actually lose any material.

Markusgp commented 1 year ago

If Minimax is maximizing, then for each possible move in the position I call board.MakeMove(m), evaluate all the child positions by calling Minimax recursively, and call board.UndoMove(m) before the alpha-beta pruning step.

If Minimax is minimizing, I do the same.

So these two scenarios are the only times I make calls to board.MakeMove() and board.UndoMove().

In general it makes quite good moves, but then maybe 20% of the time makes a complete blunder which just outright hangs a piece of any value.

Shouldn't minimax with alpha-beta pruning at a depth of 5 be able to spot an immediate recapture on the next move, even if the heuristic is just a sum of piece values over the board?

Rotkip commented 1 year ago

> If Minimax is maximizing, then for each possible move in the position I call board.MakeMove(m), evaluate all the child positions by calling Minimax recursively, and call board.UndoMove(m) before the alpha-beta pruning step.
>
> If Minimax is minimizing, I do the same.
>
> So these two scenarios are the only times I make calls to board.MakeMove() and board.UndoMove().
>
> In general it makes quite good moves, but then maybe 20% of the time makes a complete blunder which just outright hangs a piece of any value.
>
> Shouldn't minimax with alpha-beta pruning at a depth of 5 be able to spot an immediate recapture on the next move, even if the heuristic is just a sum of piece values over the board?

Maybe it's because of the EvilBot? I thought that minimax "thinks" that the opponent also chooses the best moves. I don't really know a lot about it.

Markusgp commented 1 year ago

@Rotkip

Minimax is actually so GigaChad that even a sneaky EvilBot should get pwned even with a weak heuristic.

Let's say you're playing white. Minimax will iterate over all the possible moves you can make as white, and then also determine the best responses black can make to any of those potential moves. After evaluating all these different possibilities to some depth, it will (if implemented correctly) always choose the move for white which goes down the path of the best possible future moves for white and the worst future moves for black.

So it's really quite smart, and won't get confused by a bot which plays suboptimally; in fact, it will play even better against one.

RedBedHed commented 1 year ago

Assuming your Minimax is good, you are probably suffering from the horizon effect.

Theoretical Minimax is indeed gigachad. It gives the definitive perfect evaluation of any position in the Chess game tree. It will find the principal variation and unequivocally prove the value of a position.

However, Minimax in practice is not so gigachad.

Because there are more nodes in the game tree than there are atoms in the universe (and enough unique positions that, counted as atoms, they would make up a large planet), Minimax search for chess must be depth-limited.

So there is a horizon: a ply that the search cannot see past. We must approximate the value of the nodes here with a "static" evaluation. The values we backpropagate are estimates of what a deeper search would show... but the only true tactical evaluation is a deeper search, and static evaluation, try as it might, cannot see past the horizon. The search, then, cannot prove the value at the root, and there is no hope of finding the definitive principal variation (otherwise chess would be solved like tic-tac-toe and no one would be writing chess engines).

But not all is lost. Although it is completely unreliable at loud positions (positions where tactical moves can be made), static evaluation is somewhat reliable at quiet positions (positions where no tactical "attack" moves are available). A linear sum of material and positional weights can describe the value of these positions with an acceptable level of accuracy.

So we make it our goal to only evaluate quiet positions.

How do we do this? The answer is Quiescence Search. The idea is very simple: stand pat with a static evaluation. If that evaluation fails high, just return beta. If it raises alpha, bet that the evaluation is good and set alpha to it; otherwise keep betting on the original alpha. Then try out ONLY attack moves to improve alpha further (which is now either the static eval or the original alpha). If there are no attack moves, or if we can't do better than alpha, just return alpha. Otherwise return the improved value between alpha and beta, or beta in the case of a fail high. This is called the fail-hard implementation. You can easily change it to fail soft.

Here is the basic pseudocode: https://www.chessprogramming.org/Quiescence_Search

Try it out! It should fit in the 1024 chars. You will also want to add code that lets Qsearch escape check (try out all moves at a node when in check), as well as handle draws and checkmates.
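
If it helps, here is roughly how that pseudocode translates to the challenge's Board API, in negamax form. This is only a sketch: Evaluate is a hypothetical static eval scored from the side to move's perspective, and the capturesOnly flag is what my version of GetLegalMoves takes; adjust to your API.

```
// Fail-hard quiescence search: only captures are searched, so the
// evaluation finally settles at a quiet position.
int Quiesce(Board board, int alpha, int beta)
{
    int standPat = Evaluate(board);   // static eval, side to move's perspective
    if (standPat >= beta)
        return beta;                  // fail high: opponent will avoid this line
    if (standPat > alpha)
        alpha = standPat;             // bet on the static eval

    foreach (Move move in board.GetLegalMoves(capturesOnly: true))
    {
        board.MakeMove(move);
        int score = -Quiesce(board, -beta, -alpha); // flip perspective
        board.UndoMove(move);

        if (score >= beta)
            return beta;              // fail high
        if (score > alpha)
            alpha = score;            // improved lower bound
    }
    return alpha;                     // best of stand-pat and captures
}
```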

RedBedHed commented 1 year ago

Also, if it helps, most people use Negamax instead of Minimax, because it is much easier to write and understand.

In the Negamax context, at any node, alpha is the minimum score that the current player is guaranteed while beta is the maximum score that the opponent will allow the current player to achieve. It is very much a window wherein lies the true score of the node. This is why, when we fail high we can say that the node is too good for the current player and abandon it. We have proven that the current player has ways to avoid this node.
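
A minimal sketch of what that looks like in code (same assumed Board API, with the hypothetical Evaluate/Quiesce from the previous comment; the mate score is a placeholder):

```
// Negamax with alpha-beta: one function serves both players, because the
// score is always from the side to move's perspective.
int Negamax(Board board, int depth, int alpha, int beta)
{
    if (board.IsInCheckmate())
        return -100000;                       // side to move is mated
    if (board.IsDraw())
        return 0;
    if (depth == 0)
        return Quiesce(board, alpha, beta);   // resolve tactics at the horizon

    foreach (Move move in board.GetLegalMoves())
    {
        board.MakeMove(move);
        int score = -Negamax(board, depth - 1, -beta, -alpha);
        board.UndoMove(move);

        if (score >= beta)
            return beta;                      // too good: opponent avoids this node
        if (score > alpha)
            alpha = score;                    // new guaranteed minimum
    }
    return alpha;
}
```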

RedBedHed commented 1 year ago

Also, Rotkip is right: Minimax assumes that both players play optimally to its depth, which can be exploited. However, at 30 plies with extremely good NNUE "static" eval, no one, no one can beat Stockfish (food for thought lol).

RedBedHed commented 1 year ago

To play optimally in Chess is to minimize one's own maximum loss.

Markusgp commented 1 year ago

@RedBedHed That's incorrect.

Minimax will iterate through all possible moves to a certain depth. Therefore it will also consider the "random" or "suboptimal" moves that MIN can play to that depth and assign a value (win chance for MAX) to each branch. If MIN plays one of the suboptimal moves which MAX had disregarded because it obviously made things EVEN worse for MIN, MAX will just be super happy about MIN's extremely poor decision-making skills and win even harder down that branch.

The whole point of Minimax is for MAX to always maximize their outcome with regard to MIN's possible responses, and for MIN to do the opposite. If MIN plays a suboptimal or random move, MAX will have a response which is guaranteed to be at least as good as the one it would have had if MIN had played the optimal move instead.
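
A toy tree, with exact leaf values so there is no horizon to worry about, illustrates the guarantee:

```
// MAX picks between moves A and B. MIN's replies to A are worth {3, 7}
// to MAX; MIN's replies to B are worth {5, 9}.
int valueA = Math.Min(3, 7); // 3: MIN's best reply to A
int valueB = Math.Min(5, 9); // 5: MIN's best reply to B

// Minimax plays B, guaranteeing at least 5. If MIN then blunders into the
// reply worth 9, MAX gets 9 >= 5: MIN deviating from its best move can
// only help MAX (given exact values at the leaves).
```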

(Thanks by the way for the great response about horizon effects and quiescence search, I will look into it!)

Markusgp commented 1 year ago

[screenshot of code]

Am I mistaken or does this show that the MakeMove function doesn't give me a new board state?

RedBedHed commented 1 year ago

I see you, but what I said is not incorrect. What you said is true of theoretical Minimax, not of depth-limited Minimax. It is a well-known exploit for beating shallow-depth chess engines... playing the unexpected but advantageous (not totally random) move. Another exploit is Kasparov's idea of tempting the shallow search with a material sacrifice.

RedBedHed commented 1 year ago

> [screenshot of code]
>
> Am I mistaken or does this show that the MakeMove function doesn't give me a new board state?

It does give a new board state. Sharpy (my bot) uses the function and plays a decent game with it.