magefree / mage

Magic Another Game Engine
http://xmage.today
MIT License

Documentation about developing an AI player? #10154

Open gcoter opened 1 year ago

gcoter commented 1 year ago

Hi there, I would like to develop a new AI player. Is there any documentation about this? I looked for it on the Wiki but could not find anything. Thanks in advance!

JayDi85 commented 1 year ago

Unfortunately, there is no detailed documentation on how the AI works. All you can do is take the current implementation and try to modify it.

XMage is a server-side app, with all game and rules logic on the server side only. So there are two possible points to inject an AI:

What can you do:

The faster way to start with your own AI code:

gcoter commented 1 year ago

Hi @JayDi85, thank you so much for this detailed answer! I will give it some thought and take a look at ComputerPlayer7, as it is apparently the current reference for AI implementation :slightly_smiling_face:

gcoter commented 1 year ago

Why is it called ComputerPlayer7? Is it like the 7th version? What about the other classes? Are they deprecated?

JayDi85 commented 1 year ago

Why is it called ComputerPlayer7? Is it like the 7th version? What about the other classes? Are they deprecated?

Yes, it's like the 7th version, but some older versions were removed from the code.

XMage uses the Player interface to work with a game's players. So:

Non AI players:

Base AI players:

Inner AI players:

Tests AI players:

Real AI players:

gcoter commented 1 year ago

Hi @JayDi85, thank you very much for the information, I think I understand now :slightly_smiling_face:

gcoter commented 1 year ago

Ideally, I would like to take a base class which implements the basic logic (which is necessary for the AI to work with the game) and override a subset of methods which are purely related to strategy. However, I struggle to identify this subset.

For instance, I understand that chooseMulligan can be used to change the strategy to keep or not the starting hand cards. choose on the other hand looks more generic and I struggle to understand its meaning.

In your opinion, what are the methods which are really related to strategy (i.e. their implementation offers degrees of freedom to change the behavior of the AI)? Which ones on the contrary should not be overridden (i.e. they are necessary for the class to work with the game)?

Is there some documentation about the different methods, like choose for instance (what is it supposed to do? what is it supposed to return?)?

Sorry I ask many questions, I find the project very interesting but I need some help to get started :sweat_smile:

JayDi85 commented 1 year ago

There are two different and important blocks of code/logic for the AI:

I don't know what you want to do with AI. My suggestions:

JayDi85 commented 1 year ago

All choose methods must set up targets in the Target param (all game objects have a unique identifier, e.g. UUID id, so it is used for targets/choices). The game engine checks it and doesn't allow incorrect data to be set. But there are possible bugs and rare use cases, so the AI is able to "hack" it and set data like "one creature attacks multiple targets", see example: #7434

JayDi85 commented 1 year ago

You can look at TestPlayer.java for the most important game methods (search for @Override). There are a few basic methods. You can use the same approach in a new AI -- if something goes wrong, then skip and call the super method:

(screenshot: the overridden methods in TestPlayer.java)
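As a generic illustration of that "try custom logic, fall back to super" pattern: the classes and method signatures below are invented stand-ins for illustration only, not the real mage Player API.

```java
import java.util.List;

// Hypothetical stand-in for a base AI player (e.g. ComputerPlayer7's role).
class BaseComputerPlayer {
    // Default (stable) decision logic: never mulligan.
    public boolean chooseMulligan(List<Integer> handManaValues) {
        return false;
    }
}

// Experimental subclass: override one strategy method, keep the rest.
class ExperimentalAiPlayer extends BaseComputerPlayer {
    @Override
    public boolean chooseMulligan(List<Integer> handManaValues) {
        try {
            // Custom strategy: mulligan hands whose average mana value is too high.
            double avg = handManaValues.stream()
                    .mapToInt(Integer::intValue)
                    .average()
                    .orElse(0);
            return avg > 3.5;
        } catch (RuntimeException e) {
            // If the experimental logic fails, fall back to the stable default.
            return super.chooseMulligan(handManaValues);
        }
    }
}
```

The point is that only the strategy method changes; everything required for the engine to function stays inherited from the base class.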

gcoter commented 1 year ago

Ok, thank you very much :slightly_smiling_face:

Here is some context about what I would like to do. I am wondering whether we could introduce Machine Learning to develop more powerful AIs for MTG. I work in this field and I am aware of several successes like AlphaGo (and its more advanced versions), AlphaStar, and some others which basically use Reinforcement Learning to improve upon classical MCTS. MTG is quite a challenging game, but I think that, at least in theory, we could use similar techniques.

In order to do that, I need to define 3 things:

  1. The state of the game (i.e. all the information that the AI knows at a given stage in the game)
  2. The action space (i.e. all the possible actions at a given stage in the game)
  3. A reward system (i.e. a measure of how well the AI is doing, it could simply be something like "-1" if it lost the game and "1" if it won or something more complex)

Going straight for a complete RL-based AI is very ambitious; a more realistic strategy is to take one existing AI implementation and define sub-problems to solve. For instance, it could be learning a score for the pickCard method as you proposed. I think I would like to start with something "simple" like this.

I also have one technical requirement: I must be able to run thousands of simulated games in order to collect data and train the AI. Most state-of-the-art RL algorithms are based on self-play (the AI plays against itself and learns from it). Do you think it would be possible to do something like this? For instance, would it be possible to write code which simulates thousands of games of ComputerPlayer7 playing against itself and records all the states and actions in a file?

If it was possible, then I could use this data for training and I could create a subclass of ComputerPlayer7 which calls the trained model in pickCard instead of using the scores of RateCard.java for instance.

And then, if all of this is a success, we could extend this to other sub-problems, creating additional models :slightly_smiling_face:
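The data-collection loop described above can be sketched in miniature. This is not mage code: the "game" here is a random placeholder, and the class and record names are invented; in practice each episode would be a real simulated game between two ComputerPlayer7 instances, with the terminal win/loss reward (+1/-1) back-filled into every recorded transition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SelfPlayCollector {
    // One recorded transition: an encoded state, the chosen action, the final reward.
    record Transition(String state, String action, int reward) {}

    static List<Transition> playOneGame(Random rng) {
        List<Transition> log = new ArrayList<>();
        // Placeholder "game": a few random decisions, then a win or a loss.
        for (int turn = 1; turn <= 3; turn++) {
            String state = "turn=" + turn;
            String action = rng.nextBoolean() ? "attack" : "pass";
            log.add(new Transition(state, action, 0)); // reward unknown until game end
        }
        int finalReward = rng.nextBoolean() ? 1 : -1; // +1 win, -1 loss
        // Back-fill the terminal reward into every transition of the episode.
        List<Transition> labeled = new ArrayList<>();
        for (Transition tr : log) {
            labeled.add(new Transition(tr.state(), tr.action(), finalReward));
        }
        return labeled;
    }

    static List<Transition> collect(int games, long seed) {
        Random rng = new Random(seed);
        List<Transition> dataset = new ArrayList<>();
        for (int g = 0; g < games; g++) {
            dataset.addAll(playOneGame(rng));
        }
        return dataset;
    }
}
```

The collected dataset could then be serialized to a file and used offline to train a model for a sub-problem like pickCard scoring.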

JayDi85 commented 1 year ago
  1. The state of the game (i.e. all the information that the AI knows at a given stage in the game)

Already has it:

  2. The action space (i.e. all the possible actions at a given stage in the game)

Already has it:

  3. A reward system (i.e. a measure of how well the AI is doing, it could simply be something like "-1" if it lost the game and "1" if it won or something more complex)

Already has it:

I must be able to run thousands of simulated games in order to collect data and train the AI. Most of the state-of-the-art RL algorithms are based on self-play (the AI plays against itself and learn from it)

Already has it. See LoadTest for examples (e.g. test_TwoAIPlayGame_Multiple):

Do you think it would be possible to do something like this? For instance, would it be possible to write a code which simulates thousands of games of ComputerPlayer7 playing against itself and record all the states and actions in a file

You can inject your code to collect that data; see the client-side example with GameView and appendJsonEvent. The same goes for a server-side AI (but with other places to inject, like boolean priorityPlay()).

There is a very important thing about GameState and GameView objects -- you can't apply some commands and transform one version of a GameState into another. The game engine can make multiple changes to the GameState in one game cycle. See the discussion here: #9619. One average game can generate 30-80 MB of GameView data (99% of the data is unchanged between game cycles, so it could be optimized someday by diff logic instead of full snapshots).

XMage uses thousands of unit tests to run and test real games. So if you need a server-side AI and its data, games can be simulated without a real server. If you don't need client-side AI/data, then I recommend running your games through the test framework instead of a real server.
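The "diff logic instead of full snapshots" idea mentioned above can be sketched generically: store only the keys whose values changed since the previous snapshot. This is not mage code; the flat string-map representation is a deliberate simplification (a real GameView is a nested object graph, and removed keys would also need handling).

```java
import java.util.HashMap;
import java.util.Map;

class SnapshotDiff {
    // Return only the entries of `curr` whose values differ from `prev`.
    // Note: entries deleted from `curr` are not reported in this sketch.
    static Map<String, String> diff(Map<String, String> prev, Map<String, String> curr) {
        Map<String, String> changed = new HashMap<>();
        for (Map.Entry<String, String> e : curr.entrySet()) {
            if (!e.getValue().equals(prev.get(e.getKey()))) {
                changed.put(e.getKey(), e.getValue());
            }
        }
        return changed;
    }
}
```

If 99% of the state really is unchanged between game cycles, logging such diffs instead of full snapshots would cut the 30-80 MB per game down by roughly the same factor.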

gcoter commented 1 year ago

Indeed, everything seems to be already there, thanks again :slightly_smiling_face: I guess I will start by running test_TwoAIPlayGame_Multiple to see how it works and then adapt it to collect data.

I tried running test_TwoAIPlayGame_One, I get an error:

ERROR 2023-04-04 21:33:18,554 monitor_6139: Unknown callback: SHOW_USERMESSAGE, [Join Table, You (ai_1) have an invalid deck for the selected Freeform Commander Format. 

Commander: Sideboard must contain only the commander(s) and up to 1 companion
Deck: Must contain 100 cards: has 42 cards
Farewell: Too many: 2
Kairi, the Swirling Sky: Too many: 2
Lion Sash: Too many: 2
Satoru Umezawa: Too many: 2

Select a deck that is appropriate for the selected format and try again!] =>[ThreadPool(1)-1] LoadCallbackClient.processCallback 

I understand that it has a problem with the randomly created deck. How can I solve this? I can create a separate issue if you prefer.

JayDi85 commented 1 year ago

LoadTest improved in a9f1e15168a575d641cecdcdbf15f9c3224e5c9f and 81d9c099fb8ec071d09c00e76b424dfab0bdd6f9. Now it works fine.

Commander: Sideboard must contain only the commander(s) and up to 1 companion

Yes, it was wrong deck generation. It should be a rare/impossible error in auto-generated decks now. I recommend using simple deck colors like "RG" (red and green are good colors for AI decks, because blue/black cards can be too complicated to use).

Look at the start of the file; it has some default settings:

BTW there are two different load tests:

jeffwadsworth commented 2 months ago

Just placing this combat combination code here for anyone who wishes to use it. I have used this for a few years on my local copy and it works fine... but going anywhere over 4 attackers on 4 blockers (1,717 combinations) will have to wait for quantum computers to be realized in 1000 years, if ever. It would be beautiful if the code supported multiple cores/threads, as that would speed up the process. As an aside, something like Claude 3.5 can evaluate the game state pretty well without all this work, though it has insane compute to work with.

This would be called via something like: List<List<Permanent>> combinations = CombinationGenerator.generateAllCombinations(attackers, blockers);

```java
// Note: this fragment needs java.util imports (ArrayList, Arrays, Collections,
// HashMap, HashSet, List, Map, Set) and mage's Permanent class.
public static class CombinationGenerator {

    private final Map<String, List<List<Permanent>>> memo;
    private final List<Permanent> attackers;
    private final List<Permanent> blockers;
    private final List<List<Permanent>> combinations; // Store the combinations

    public CombinationGenerator(List<Permanent> attackers, List<Permanent> blockers) {
        this.memo = new HashMap<>();
        this.attackers = attackers;
        this.blockers = blockers;
        this.combinations = new ArrayList<>();  // Initialize the list for combinations
    }

    public List<List<Permanent>> generateCombinations() {
        List<List<Permanent>> blockerSubsets = generateBlockerSubsets();
        generateCombinations(attackers, blockerSubsets, 0, new ArrayList<>(), new HashSet<>());

        // Include combinations with individual attackers
        for (int i = 0; i < attackers.size(); i++) {
            List<Permanent> singleAttacker = Arrays.asList(attackers.get(i));
            generateCombinations(singleAttacker, blockerSubsets, 0, new ArrayList<>(), new HashSet<>());
        }

        return combinations;
    }

    private List<List<Permanent>> generateBlockerSubsets() {
        List<List<Permanent>> subsets = new ArrayList<>();
        generateBlockerSubsets(0, new ArrayList<>(), subsets);
        return subsets;
    }

    private void generateBlockerSubsets(int idx, List<Permanent> currentSubset, List<List<Permanent>> subsets) {
        if (idx == blockers.size()) {
            subsets.add(new ArrayList<>(currentSubset));
            return;
        }
        generateBlockerSubsets(idx + 1, currentSubset, subsets);

        currentSubset.add(blockers.get(idx));
        generateBlockerSubsets(idx + 1, currentSubset, subsets);
        currentSubset.remove(currentSubset.size() - 1);
    }

    private void generateCombinations(List<Permanent> attackers, List<List<Permanent>> blockerSubsets, int current, List<List<Permanent>> result, Set<Permanent> usedBlockers) {
        if (current == attackers.size()) {
            addCombinationToList(attackers, result);
            return;
        }

        for (List<Permanent> subset : blockerSubsets) {
            if (Collections.disjoint(subset, usedBlockers)) {
                String key = generateKey(current, subset);
                if (!memo.containsKey(key)) {
                    List<List<Permanent>> permutations = generatePermutations(new ArrayList<>(subset));
                    memo.put(key, permutations);
                }

                for (List<Permanent> permutation : memo.get(key)) {
                    result.add(permutation);
                    usedBlockers.addAll(permutation);
                    generateCombinations(attackers, blockerSubsets, current + 1, result, usedBlockers);
                    usedBlockers.removeAll(permutation);
                    result.remove(result.size() - 1);
                }
            }
        }
    }

    private List<List<Permanent>> generatePermutations(List<Permanent> subset) {
        List<List<Permanent>> permutations = new ArrayList<>();
        generatePermutations(subset, 0, permutations);
        return permutations;
    }

    private void generatePermutations(List<Permanent> arr, int index, List<List<Permanent>> result) {
        if (index >= arr.size() - 1) {
            result.add(new ArrayList<>(arr));
            return;
        }

        for (int i = index; i < arr.size(); i++) {
            Collections.swap(arr, index, i);
            generatePermutations(arr, index + 1, result);
            Collections.swap(arr, index, i);
        }
    }

    private String generateKey(int current, List<Permanent> subset) {
        return current + "-" + subset.toString();
    }

    private void addCombinationToList(List<Permanent> attackers, List<List<Permanent>> result) {
        List<Permanent> combination = new ArrayList<>();
        for (int i = 0; i < attackers.size(); i++) {
            combination.add(attackers.get(i));
            combination.addAll(result.get(i));
        }
        combinations.add(combination);
    }

    public static List<List<Permanent>> generateAllCombinations(List<Permanent> attackers, List<Permanent> blockers) {
        CombinationGenerator generator = new CombinationGenerator(attackers, blockers);
        return generator.generateCombinations();
    }
}
```
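To see why this search blows up so quickly, here is a tiny self-contained counter (not mage code). It counts only the simplest case -- each blocker independently blocks one attacker or nobody -- and even that undercount grows exponentially; the generator above additionally enumerates the damage-order permutations within each block, so the real search space is larger still.

```java
// Simplified lower bound on the block-assignment search space:
// each blocker has (attackers + 1) choices (one attacker, or no block),
// so the total is (attackers + 1) ^ blockers.
class BlockAssignmentCount {
    static long count(int attackers, int blockers) {
        long total = 1;
        for (int i = 0; i < blockers; i++) {
            total *= (attackers + 1);
        }
        return total;
    }
}
```

For 4 attackers and 4 blockers this already gives 5^4 = 625 basic assignments before ordering; doubling both sides multiplies the count by tens of thousands, which is why parallelizing or pruning (rather than faster hardware) is the realistic way forward.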