Closed Zomis closed 1 year ago
Out of the three options, only replay of moves seem to make sense right now. However, a problem with this is that currently, replays are performed in suspending functions which makes it very hard to use forked games as part of rules.
Luckily, AlphaBeta AIs is part of the server code which makes it easy to run inside runBlocking
There is a slight conflict between "If you do this, you have to be able to do X later" and "Replay of moves"...
In order to look into the future, I copy the entire game and replays all moves. So I look into the past to look into the future, which will look into the past, which will look into the future, which will look into the past, which will look into the future, which will look into the past, which will look into the future...
This chain needs to be broken somehow, preferably by not looking into the future while looking into the past. Can the whole requires
part be skipped, and preferably also the forking itself, since we know that it is okay to do these actions because they have already been made? Can we wrap the entire forkGame
call in a forks
section or something, that only gets invoked if game is not being replayed?
In some cases regarding "If you do this, you have to be able to do X later", copying the entire game is unnecessary and copying only the relevant parts of the model to evaluate that is enough.
However, as a generic case, it's good if copying the entire game is still supported, so that's what I am aiming for right now.
One bug I encountered was that some of the forked games were not stopped. That has now been fixed.
In my test scenario I added a game where you can change a simple int value, which starts at a random number from -20 to 20. In order to be allowed to increase by 10, the forked game - which adds an increase of 40 - must have a value of less than 50. So in practice, the value needs to be below 10 in order to be allowed to increase by 10.
One edge case I found is that a coroutine was kept running. Which happened because the game had caused an exception to be thrown. Which happened because the action definition DSL statement was added twice. This happened because the list of action DSLs was not cleared in between, which it usually is. Which happened because one action was not allowed to be performed. Which happened because a clone didn't have the same starting randomness value as the original. Which happened because the replay hadn't been fully performed. Which happens when replays contain no actions. So the root cause is that replays are not correctly performed when they are not containing actions.
And the root cause mentioned above happens because of this statement in ReplayingListener
:
if (moves >= replayMoveCount) return
The edge-case is, of course, that moves == 0
and replayMoveCount == 0
.
Now at least the tests are stable, however, a new issue has appeared: When a game is forked, it doesn't take the most recently made action into consideration. This is because the code to send the feedback for lastAction
is in the start of GameFlowImpl.nextAction()
, which is executed after the DSL which creates the fork.
One possibility is to make a ActionPerforming
(or ActionAllowed
?) step, which is created and sent as soon as possible.
However, a bad thing about that approach is that all the replayable state stored in between steps will not be copied, the reason why the feedback is sent in the beginning of nextAction()
in the first place is because of #272
A simpler solution to the problem of not performing the last move is to send it the most recently performed move, which hasn't had its feedback sent yet but has been stored for future sending.
This will have the consequence of not fully replaying anything that is done after the fork happens, such as:
game.value += replayable.int("stepLoop") { Random.Default.nextInt(-2, 2 + 1) }
However, does it matter? I think it is okay, since it's impossible to copy a randomness that hasn't happened yet.
I was for a while thinking about if we should have a check to check if we're forked (The Good Place reference!! 😄), but I don't think that's needed.
The mess continues!
Randomness from start of game was moved to the beginning of a step
call, as the fork happens before the call to nextAction
.
However, when it is placed there the randomness from onStart
is not part of the replay data.
It would help significantly if the randomness from onStart
was treated in a more similar way to the randomness from performing actions.
Solution, naturally, was to treat the randomness from onStart in a more similar way to the randomness from performing actions.
Fixed thanks to The Big Refactoringâ„¢ #302
I broke the AlphaBeta AI's... That needs to be fixed.
AlphaBeta AI's were getting stuck while evaluating the best move. Around 8-10 forks were created but they just didn't proceed anywhere beyond the first feedback being sent. I assume there was nothing listening for that.
Getting rid of all runBlocking
statements helped with this.
Now however, it makes some illegal actions while replaying. For example:
java.lang.IllegalStateException: IllegalAction(actionType=play, playerIndex=0,
parameter={Pos 1, 1; Size 0, 0; Played by X. Parent is {Pos 0, 0; Size 3, 3; Played by NONE. Parent is null}})
was not replayed correctly after 2 actions. Full replay data is
ReplayData(gameType=DSL-TTT, playerCount=2, config=Configs([Config(m: 3 of class class kotlin.Int), Config(n: 3 of class class kotlin.Int), Config(k: 3 of class class kotlin.Int)]), initialState={},
actions=[
ActionReplay(actionType=play, playerIndex=0, serializedParameter=Point(x=1, y=1), state={}),
ActionReplay(actionType=play, playerIndex=1, serializedParameter=Point(x=0, y=0), state={}),
ActionReplay(actionType=play, playerIndex=0, serializedParameter=Point(x=0, y=2), state={})
])
AlphaBeta AI mess continued:
The illegal moves seem to happen because the same action is passed twice to the game. This happened because of a coroutine mess, but the root case seems to have been that replays was only incrementing the position
counter on ActionPerformed
, if it was moved to right before sending the action, that was fixed.
Then another problem appeared: Forks could not perform actions because of Illegal Action. This happens because the fork hasn't fully replayed all the steps yet. The fork says that it has a replay data of a move being made at (1, 1), and yet the fork game board is empty with no moves having been made.
The Replay class were using one variable: position
for two different things: actionsSent, actionsPerformed.
Splitting them up helped a bit but some edge-cases were still left, especially with more deep AlphaBeta AIs.
Re-working replays to only use one coroutine for sending all actions helped a lot, instead of launching a new coroutine every time.
AlphaBeta AIs are a lot slower when copying the entire game in this new way - instead of simply copying the model like they did before. But now they have proven to me that the copying and replaying works.
There exists a method like this in the
Game<T>
interface:fun copy(copier: (source: T, destination: T) -> Unit): Game<T>
This however forces each caller to define how to copy the game. It would be beneficial for multiple reasons to have a common solution for this.
Possible implementations can use:
In the first two options, a map needs to be kept to map old objects (e.g. players and references) to new ones. Remember the mistake from #274