google-deepmind / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
Apache License 2.0
4.23k stars 932 forks source link

multiplayer games with hidden actions #126

Closed peter-de-boer closed 4 years ago

peter-de-boer commented 4 years ago

I am interested in applying, and contributing to, this framework and have some questions and remarks!

My background: I am a data scientist, app developer and a boardgame enthusiast. As a hobby project, last year I have implemented a web-app, an online version of a multiplayer strategic boardgame, and now I am think about developing an AI for it, so players can choose to play against AI bots instead of against other players. I might implement more games in the future as well. I guess this framework is a great starting point for me.

Some information on the game. The game is named Ponzi Scheme (more info on: [https://www.boardgamegeek.com/boardgame/180899/ponzi-scheme]()). Some key characteristics: economic trading game, hidden actions, a few chance events, three or more players.

The implementation of the game is in Python (on the the server side). I think it won't be too difficult to adapt the implementation to the openspiel API. Most important change will be handling the chance events, i.e. adding a chance player instead of handling chance events as game state changes as a result of player actions.

First question: is there an interest that I add this game to this github? If so, I am happy to do so. I have understood the games here are implemented both in Python in C++. Is a C++ implementation mandatory? Also happy to do that, but it will take time.

I am planning to try different algorithms but maybe someone can give some advice. I am familiar with some algorithms like MCTS, minmax, DQN. But several others in the list here are rather new to me. Which algorithms are most suited for my type of game (i.e. 3+ players, hidden info).

The feature I am most concerned about is the hidden information. I see there are several games with hidden information implemented here, but as far as I can tell most of them deal with information which is hidden at the start of a playthrough (like dealing cards face down), and is possibly revealed during the playthrough. In my game, some actions are hidden to some players (players may make secret trades with each other; the amount of money involved is hidden to players not involved in the trade). So during the whole playthrough there is basically more and more information hidden. In the games listed here, I think only phantom tic-tac-toe has hidden actions, but maybe I have overlooked one or the other game with that characteristic. Anyway, which algorithms are suitable for games with hidden actions?

lanctot commented 4 years ago

Hi @peter-de-boer,

Ponzi Scheme sounds like a great game. I would love to have it in the code base!

The implementation of the game is in Python (on the the server side). I think it won't be too difficult to adapt the implementation to the openspiel API. Most important change will be handling the chance events, i.e. adding a chance player instead of handling chance events as game state changes as a result of player actions.

You can also look into the different chance modes. kSampledStochastic might be easier for a large game (basically the chance node allows arbitrary stochastic events that need not be enumerated).

First question: is there an interest that I add this game to this github?

Absolutely. Are you affiliated with the designers or publisher in any way, though? For games that are not public domain we might need permissions. (We might not, but we may have to look into it.)

If so, I am happy to do so. I have understood the games here are implemented both in Python in C++. Is a C++ implementation mandatory? Also happy to do that, but it will take time.

It is not mandatory but almost all the games are in C++. You can do it in python-- it would just mean that you would only be able to run the python algorithms on it (which is fine if you are not too concerned about efficiency).

I am planning to try different algorithms but maybe someone can give some advice. I am familiar with some algorithms like MCTS, minmax, DQN. But several others in the list here are rather new to me. Which algorithms are most suited for my type of game (i.e. 3+ players, hidden info).

I would recommend the imperfect information RL algorithms that have come out over the past few years: NFSP, PSRO, RPG, DeepCFR, or NeuRD. They are all described and cited in the OpenSpiel paper here: https://arxiv.org/abs/1908.09453

We will soon have an implememtation of Information Set MCTS (IS-MCTS) which is another good place to start. It has been used for AI in card game apps.

The feature I am most concerned about is the hidden information. I see there are several games with hidden information implemented here, but as far as I can tell most of them deal with information which is hidden at the start of a playthrough (like dealing cards face down), and is possibly revealed during the playthrough. In my game, some actions are hidden to some players (players may make secret trades with each other; the amount of money involved is hidden to players not involved in the trade). So during the whole playthrough there is basically more and more information hidden. In the games listed here, I think only phantom tic-tac-toe has hidden actions, but maybe I have overlooked one or the other game with that characteristic. Anyway, which algorithms are suitable for games with hidden actions?

There is also imperfect information Goofspiel, but you are right that we don't have many in this category, but they are fully supported. All the algorithms I suggested above will work on them. The only place that partially observable actions are separated from other kinds of hidden information is in the definition of the game's observations or information states. All the algorithms are agnostic to the source of the hidden information and operate entirely via the observations as exposed from the game. So it's not really a special case.. at least not from the algorithms' perspective.

Hope this helps.. let me know what you think!

peter-de-boer commented 4 years ago

Thanks @lanctot, it is really helpful!

Regarding chance mode, yes, kSampledStochastic is an option and more close to my current implementation. But I actually like the kExplicitStochastic mode more. It allows for replaying games just by storing the actions.

Generally, I believe the explicit mode makes more sense in search algorithms. Sampling according to probability distributions raises the question (to me at least): which probability distribution? The probability distribution might involve hidden information, so the probability distribution from the point of view of the player whose turn it is might be different from the probability distribution from the point of view of the all-knowing game State. I believe the explicit mode is more flexible in general, for example, an AI algorithm might estimate probability distributions and this seems more feasible if the sampling is not embedded in the State class.

Regarding the permissions. I have got the permission from the designer to implement the online game for my app. For implementing it here, I will ask permission again.

Board game mechanisms cannot be copyrighted (but board game names and pictures can). So, there is always a workaround by naming the game differently, give new names to game objects, etc. But it is nice to have an explicit permission anyway.

Regarding Python/C++. Good to know that everything is possible in Python. That allows me to start with Python, get everything working, and as soon as performance issues start playing a role I can switch to C++.

elkhrt commented 4 years ago

This would be a great contribution. A couple of small points, just in case:

Given Python's duck-typing, there's no need to inherit from Game and State base classes in your game implementation - you can just implement the right methods and things will work.

We could implement pybind11 trampolines as base classes so that Python game implementations can be supplied to C++ algorithms, but since we have implementations of most algorithms in both C++ and Python, there doesn't seem much point to this - where high-performance is needed, everything will be in C++.

A smaller version which still has all the key mechanics is very useful in development and testing of algorithmic ideas. This can be a custom-designed game (e.g. tiny_bridge / leduc poker), or a parameter to choose the game size (e.g. hex which works on any board size).

lanctot commented 4 years ago

Closing this for now but please feel free to reopen if you have more questions!