magefree / mage

Magic Another Game Engine
http://xmage.today
MIT License
1.8k stars 749 forks source link

DEV: Game engine improvement dicussion #9619

Open Alex-Vasile opened 1 year ago

Alex-Vasile commented 1 year ago

(I am making a separate issue to discuss and collect all of the game engine improvement work since #8973 is not the right place for it).

Problem

Game engine cannot current "rollback" a state. All it can do is store copies of the current state in memory and load them back if needed. This has two problems:

  1. It's incredibly slow and inefficient. It's also part of the reason why games with AIs can significantly slow down. The AI copies the state 10s-1,000s of times every time it gets priority.
  2. This copying can cause the whole server to crash in easy to reach scenarios (see #9302).

Cause

The way that rolling back currently "undoes" an action is by copying the state before the action is performed, and if an undo is needed, to restore that copied state. In order to avoid this copying, the actions (e.g. an effect resolving) would need to have an "undo" method. However without the full history of the game, such an undo is not possible.

Proposed change

Change the game engine that instead of keeping only the current state, it keeps track of the entire history of the with all of the information required to replay the game entirely. What I mean by this is that the new state of the game should be serializable so that it can be saved to disk, and then re-loaded at any point in the future (with the same version of XMage) and the game can be picked up from any point (either to continue with whoever has priority next, or to roll it back all the way to the first turn) and continune playing.

The operation of all methods/classes which interact with the new game state would function as follows: old state + operation -> new state + side effect (displaying things to screen or writing to file).

In order for this to work, I currently think the following changes are necesary:

  1. Change the datastructure of the GameState object to keep track of the history of all entries. This has to be done in a way that keeps track of actions performed in the game. E.g. each action occuring in the game (tapping a land, player dying, state based action, object put on stack) has a counter state before it occured and one after. So that rolling back to the start of the turn involves removing all changes which occured after the counter representing the start of the turn.
  2. All objects must be refactored so that the game state contains their internal state. The objects themselves can still have state, but it should only be thought of as a cache for the values contained inside the GameState.

Links:

  1. Delta Compression
  2. Design Patterns for Games StackOverflow
JayDi85 commented 1 year ago

There is no need to do extra work and complicate the code. The xmage engine cannot be rewritten from state engine to commands engine. See our discussion in the https://github.com/magefree/mage/issues/8973#issuecomment-1262711843

JayDi85 commented 1 year ago

BTW AI performance problem is not a copy problem -- it's logic and decisions optimization problem. No need to take all possible actions to execute (e.g. no need to simulate every possible use case) -- it must filter and optimize a possible actions or remember and take previous calculations.

Alex-Vasile commented 1 year ago

BTW AI performance problem is not a copy problem -- it's logic and decisions optimization problem. No need to take all possible actions to execute (e.g. no need to simulate every possible use case) -- it must filter and optimize a possible actions or remember and take previous calculations.

It's true. Both the cost of each computation, and the number of computations for the AI, and for the engine are issues. The AI is the worst case example since it uses the state copying the most, but the same issues show up for regular gameplay

See work I was trying as part of that in #9225 to add a way to check if two objects are equivalent for de-duplicating.

Alex-Vasile commented 1 year ago

There is no need to do extra work and complicate the code. The xmage engine cannot be rewritten from state engine to commands engine. See our discussion in the #8973 (comment)

I opened this to continue the discussion here since there was a lot going on there.

But to that discussion. Nothing there shows why it cannot be done.

awjackson commented 1 year ago

Magic is a vastly more complex game than the ones discussed in those StackOverflow threads, with an open-ended and continuously growing list of unique game pieces (cards). Most of the "rules" of the game are in fact defined by those game pieces themselves rather than by a central rulebook (more of the CR consists of instructions for interpreting the text on cards than of "rules" per se). Cards can be stateful in arbitrary ways (e.g. "as {this} enters the battlefield, choose a {color, type, object, anchor word}") and can refer to previous states of the game in arbitrary ways (e.g. abilities that only work if {arbitrary event} has occured N times in {arbitrary span of time}).

Alex-Vasile commented 1 year ago

Magic is a vastly more complex game than the ones discussed in those StackOverflow threads, with an open-ended and continuously growing list of unique game pieces (cards). Most of the "rules" of the game are in fact defined by those game pieces themselves rather than by a central rulebook (more of the CR consists of instructions for interpreting the text on cards than of "rules" per se). Cards can be stateful in arbitrary ways (e.g. "as {this} enters the battlefield, choose a {color, type, object, anchor word}") and can refer to previous states of the game in arbitrary ways (e.g. abilities that only work if {arbitrary event} has occured N times in {arbitrary span of time}).

Firstly my original statement was not correct and I've amended it. It's not that the objects (e.g. an ability) can't have a state, but that the state should be stored inside the GameState.

As for the rest of the comment, I agree that Magic has a more complex set of rules than the other types of games mentioned. But I don't see why that means that a different approach is not possible.

JayDi85 commented 1 year ago

Nothing there shows why it cannot be done.

Because there are a million places (literally) where game state data changes. You want to process all that changes for write history.

Example:

  1. You want to take a GameState, doubles inner fields to x2 to store a change history for each field;
  2. You want to take each ability/effect and rewrite it to remove all inner fields from ability/effect's class to GameState object;
  3. And it's only part of the work (example: AI, inner game engine classes, etc).

But all this has already been implemented. There is even serialization, including writing to disk with potential recovery to continue the game. It's was a game replay feature in old days (it's was buggy and disabled due GameState copy and other miss fields problem).

Conclusion:

So too much work and too small benefits. Some useful and related features instead:

Alex-Vasile commented 1 year ago

So i'll reply in two parts.

1. The argument for needing changes to game engine. I disagree with the overall argument that there are no benefits to improving the performance of the game engine.

Conclusion:

  • What's new for users? Nothing. You can implement replays and game continues by current engine.

The major benefit for the players are increased game speed, especially for big boards (and commander games in particular), or deal with the class of performance problems like #9302 which can bring down the entire server.

  • What's new for developers? Nothing. Moreover it's reduce opportunities for developers, see above about ability/effect rewrites. Even more -- you adds a new layer for a potential bugs (if something miss from game history logic).

I don't understand why it would reduce opportunities for the developers. But acknowledge that any changes will require developer time (which is in short supply and already spread thin given the size of the project).

  • What's new for server? Maybe. Or maybe not. All that "history magic" for all changeable data needs server's resources too. Can it gives more performance? Maybe. Or maybe not (example: ManaOptions from #9233 -- it gives boost in specific use case, but the overall performance remained the same, see tests times in travis build history). Can it gives a traffic optimization? Yes. It's a big problem for xmage and the MAIN performance issue for users in online games (see #2812) -- but it can be implemented and optimized with current GameState.

9233 Did indeed provide only minor benefits to the overall performance. That's because the issue was not often the bottleneck in game speed. It's why I eventually gave up on fixing the double-for loop used in removeFullyIncludedVariations.

I find it hard to believe that performance would not increase, but it's true that it's not guaranteed. That being said, #9233 is not a good comparison since the issues I am discussing are the bottlenecks. You can see the bottlenecks for yourself by getting a trial version of IntelliJ ultimate profiling the test suite. Or a game started on a local server.

As for #2812 being the main performance issue for online games, I am extremely skeptical. Based on some initial testing, the networking (especially the zipping and unzipping that goes along with the transfer) does use up a lot of reasources, but it is not as big as the resource usage caused by game state copying. And it quickly becomes a much smaller % as soon as a single AI is involved, it's why AI games are not enabled on the beta server anymore...

That being said the bandwith usage is extremely large, and it alone justifies a change. That being said, I think that #8806 should be dealt with before moving to #2812 since the optimization should be done on the network code that will be used going forward, rather than on the one that we wish to replace

So too much work and too small benefits. Some useful and related features instead:

  • json game logs for third party systems and AI learning: #4515;
  • game replays: #4515;
  • serializable drafts/tourneys to load and continue in any times (serializable games are also possible, but more complex): #1769;
  • network optimization to send only changed data for clients: #2812

I agree that the things on this list are important. Personally, the performance is more important to me since I enjoy playing wider decks and multiplayer formats (and hosting local games).

Alex-Vasile commented 1 year ago

2. The argument for the one approach I argued for above.

Because there are a million places (literally) where game state data changes. You want to process all that changes for write history.

Example:

  1. You want to take a GameState, doubles inner fields to x2 to store a change history for each field;
  2. You want to take each ability/effect and rewrite it to remove all inner fields from ability/effect's class to GameState object;
  3. And it's only part of the work (example: AI, inner game engine classes, etc).

But all this has already been implemented. There is even serialization, including writing to disk with potential recovery to continue the game. It's was a game replay feature in old days (it's was buggy and disabled due GameState copy and other miss fields problem).

It's true that the approach I initially proposed would have a large refactoring cost. Perhaps even too high given the very large benefits that I think it would provide (I still plan to quantify the current state after #9640).

I have thought about it some more, and in light of the comment here I am cosidering a different approach. The real issue with the state copying is that many of the objects being copied are never used. Most are not even queried, much less modified. So an alternative approach would be as follows, inspired by the Collections.unmodifiableList() functionality in the Java standard library.

Taking an Ability as the example. Create a wrapper class CopyOnWriteAbility which has 2 internal states:

  1. A variable storing an Ability.
  2. A boolean indicating if the ability refers to the original ability that CopyOnWriteAbility was instantiated with (i.e. is a shallow copy), or if the ability is a copy of that inidial abiltiy (i.e. is a deep copy).

CopyOnWriteAbility then implements all of the same methods as Ability, in which it simply passes along the call to the ability and returns any results.

Changes needed for this:

  1. Changing the way that abilities access internal data so that method calls which change the internal state can be seperated from those which do not change state. (This can be enforced by having the getter methods return unmodifiable versions of their return values)
  2. Writing CopyOnWriteWrappers for each ability. These will be drop in replacements for the existing class usages, so the entire engine does not have to be re-written, and the approach can be tested/implemented one class at a time.
Alex-Vasile commented 1 year ago

Let's talk numbers.

I ran test_PalaceJailer1. This test involves casting palace jailer, targeting a Silvercoat lion with the ETB ability, and letting the ability resolve. The player also becomes the monarch. The test ends at the beginning of combat on the first turn. This is a pretty simple test, and is representative of a Turn 2/3 play (the board is very small, play 1 spell, and not much interaction).

As part of the profiling, I had the test run 3,000 times. Over those runs, the program as a whole allocated/dealocated ~22GB of ram while test_PalaceJailer1 itself (without any of the setup code, e.g. creating the deck reseting the game before each run) used up 14 GB.

14 GB over 3000 runs is a 4.67 MB/run. Which doesn't seem like a lot, but lets look at what that ram is being used for. I'm attaching a screenshot of the flamegraph that you can generate with IntelliJ if you profile the code.

Memory

The width of the image represents 100% of test_PalaceJailer1's memory usage, and the width of each function's is proportional to it's ram usage. I have overlaid two colors on the flame graph.

The red areas represent the memory usage by GameImpl.bookmarkState (i.e. by the copying of the whole game state). And the blue areas represent the memory usage by Battlefield.reset.

Those two function calls easily make up 80% of the total memory usage. Most of it is not needed.

Look at the Batlefield.reset.

reset

Most of that copying is not needed, at least not at deep copies. In the majority of cases, most permants have the same stats as on the cards, and even when something does change, it isn't every ability. Let's zoom in on one of those Battlefield.resets. The biggest sources here are WhiteManaAbility.copy(), PlayLandAbility.copy(), and SpellAbility.copy(). How often are those changed? The basic mana abilities could propably be made static.

The game is spending gygabytes of RAM copying things that we can know did not change and do not need a deep copy.

Merlingilb commented 1 year ago

I am sorry if someone already posted this idea but while reading the original post about needing a history for better rollbacks I had this idea in mind and wanted to share it with you.

The problem of rollbacking (or undoing) a state and/or redoing operations that lead to the same state is something, that is used in databases a lot. Most modern databases have a redo/undo log with which they can undo or redo operations with quiet good performance. The idea I had was to use a database for the gamestate. A pro could be that you don't need to implement a complex redo/undo mechanism. A con could be that the database as game state could be slower than the current game state.

I hope my idea wasn't a complete waste of time and if there are flaws in that please tell me. I am very interested in this topic.