dickreuter / neuron_poker

Texas holdem OpenAi gym poker environment with reinforcement learning based on keras-rl. Includes virtual rendering and montecarlo for equity calculation.
MIT License
612 stars 168 forks source link
gym-environment holdem neural-network openai-gym poker pokerbot reinforcement-learning

Neuron Poker: OpenAi gym environment for texas holdem poker

This is an environment for training neural networks to play texas holdem. Please try to model your own players and create a pull request so we can collaborate and create the best possible player.

Usage:

Run:

.. figure:: doc/table.gif :alt:

Analysis of the run


At the end of an episode, the performance of the players can be observed via the summary plot.
|image0|

Packages and modules:

main.py: entry point and command line interpreter. Runs agents with the gym. The docstring at the top of the file describes the command line options. They are interpreted by docopt.

gym_env


-  ``env.py``: Texas Hold’em unlimited openai gym environment &
   ``rendering.py``: rendering graphics while playing

agents

Please add your model based agents here.

Note that the observation property is a dictionary that contains all the information about the players and table that can be used to make a decision.

Custom implementation of q learning

Custom impelemtation of reinforcement learning. This package is now in a separate repo:
www.github.com/dickreuter/tf_rl

tools

tests ^^^^^

Roadmap

Agents


- [x] Agent based on user interaction (keypress)
- [x] Random agent
- [x] Equity based strategy (i.e. call and bet above threshold)
- [x] Equity based strategy with genetic algorithm, adjusting the treshold based on winning agent.
- [x] C++ implementation of equity calculator to significantly speed up runs
- [x] Agent based on reinforcement learning with experience replay (Deep Q learning, based on keras-rl)
- [/] Custom agents (see above section for more details)

Reinforcement learning: Deep Q agent

neuron_poker.agents.agent_dqn implements a deep q agent with help of keras-rl. A number of parameters can be se:

Training can be observed via tensorboard (run tensorboard --logdir=./Graph from command line) |image2|

How to contribute

Launching from main.py


In ``main.py`` an agent is launched as follows (here adding 6 random
agents to the table). To edit what is accepted to main.py via command
line, simply add another line in the docstring at the top of main.py.

.. code:: python

    def random_action(render):
        """Create an environment with 6 random players"""
        env_name = 'neuron_poker-v0'
        stack = 500
        self.env = gym.make(env_name, num_of_players=6, initial_stacks=stack)
        for _ in range(num_of_plrs):
            player = RandomPlayer(500)
            self.env.add_player(player)

        self.env.reset()

As you can see, as a first step, the environment needs to be created. As a second step, different agents need to be
added to the table. As a third step the game is kicked off with a reset. Agents with autoplay set to True will automatically
play, by having the action method called of their class. Alternatively you can use the PlayerShell class
and the environment will require you call call the step function manually and loop over it. This may be helpful
when using other packages which are designed to interface with the gym, such as keras-rl.

Adding a new model / agent
^^^^^^^^^^^^^^^^^^^^^^^^^^

An example agent can be seen in random\_agent.py

To build a new agent, an agent needs to be created, where the follwing
function is modified. You will need to use the observation parameter,
which contains the current state of the table, the players and and the
agent itself, as a parameter to determine the best action.

.. code:: python

    def action(self, action_space, observation):  # pylint: disable=no-self-use
        """Mandatory method that calculates the move based on the observation array and the action space."""
        _ = observation  # not using the observation for random decision
        this_player_action_space = {Action.FOLD, Action.CHECK, Action.CALL, Action.RAISE_POT, Action.RAISE_HAlF_POT}
        possible_moves = this_player_action_space.intersection(set(action_space))
        action = random.choice(list(possible_moves))
        return action

Observing the state

The state is represented as a numpy array that contains the following information:

.. code:: python

class CommunityData:
    def __init__(self, num_players):
        self.current_player_position = [False] * num_players  # ix[0] = dealer
        self.stage = [False] * 4  # one hot: preflop, flop, turn, river
        self.community_pot: float: the full pot of this hand
        self.current_round_pot: float: the pot of funds added in this round
        self.active_players = [False] * num_players  # one hot encoded, 0 = dealer
        self.big_blind
        self.small_blind

class StageData:  # as a list, 8 times:
    """Preflop, flop, turn and river, 2 rounds each"""

    def __init__(self, num_players):
        self.calls = [False] * num_players  # ix[0] = dealer
        self.raises = [False] * num_players  # ix[0] = dealer
        self.min_call_at_action = [0] * num_players  # ix[0] = dealer
        self.contribution = [0] * num_players  # ix[0] = dealer
        self.stack_at_action = [0] * num_players  # ix[0] = dealer
        self.community_pot_at_action = [0] * num_players  # ix[0] = dealer

class PlayerData:
    "Player specific information"

    def __init__(self):
        self.position: one hot encoded, 0=dealer
        self.equity_to_river: montecarlo
        self.equity_to_river_2plr: montecarlo
        self.equity_to_river_3plr: montecarlo
        self.stack: current player stack

How to integrate your code on Github



It will be hard for one person alone to beat the world at poker. That's
why this repo aims to have a collaborative environment, where models can
be added and evaluated.

To contribute do the following:

- Get Pycharm and build the virtual python environment. Use can do: ``pip install -r requirements.txt``
- If you want to use the 500x faster c++ based equity calculator, also install visual studio, but this is not necessary
- Clone your fork to your local machine. You can do this directly from pycharm: VCS --> check out from version control --> git
- Add as remote the original repository where you created the fork from and call it upstream (the connection to your fork should be called origin). This can be done with vcs --> git --> remotes
- Create a new branch: click on master at the bottom right, and then click on 'new branch'
- Make your edits.
- Ensure all tests pass. Under file --> settings --> python integrated tools switch to pytest (see screenshot). |image1| You can then just right click on the tests folder and run all tests. All tests need to pass. Make sure to add your own tests by simply naming the funtion test\_... \
- Make sure all the tests are passing. Best run pytest as described above (in pycharm just right click on the tests folder and run it). If a test fails, you can debug the test, by right clicking on it and put breakpoints, or even open a console at the breakpoint: https://stackoverflow.com/questions/19329601/interactive-shell-debugging-with-pycharm
- Commit your changes (CTRL+K}
- Push your changes to your origin (your fork) (CTRL+SHIFT+K)
- To bring your branch up to date with upstream master, if it has moved on: rebase onto upstream master: click on your branch name at the bottom right of pycharm, then click on upstream/master, then rebase onto. You may need to resolve soe conflicts. Once this is done, make sure to always force-push (ctrl+shift+k), (not just push). This can be done by selecting the dropdown next to push and choose force-push (important: don't push and merge a rebased branch with your remote)
- Create a pull request on your github.com to merge your branch with the upstream master.
- When your pull request is approved, it will be merged into the upstream/master.

.. |image0| image:: doc/pots.png
.. |image1| image:: doc/pytest.png
.. |image2| image:: doc/tensorboard-example.png