wenkesj / holdem

:black_joker: OpenAI Gym No Limit Texas Hold 'em Environment for Reinforcement Learning

Environment is requesting player moves when there's only one player who is not all-in #15

Open BigBadBurrow opened 5 years ago

BigBadBurrow commented 5 years ago

I found a problem when all but one player is all-in, i.e. only one player still has chips in their stack. The environment continues to ask the player who isn't all-in for a move, so the agent can (needlessly) keep raising against players who cannot call, which creates a side pot containing only that player, as you can see here:

    players:
    0 [  ],[  ] stack: 0
    1 [2♥],[8♦] stack: 0
    2 [J♦],[7♥] stack: 28665
    3 [  ],[  ] stack: 0
    Getting move from agent for player 1 (Agent: 2)
    Player 1 Move: [0,0]
    total pot: 11335
    last action by player 1:
    _ check
    community:
    - [9♣],[4♦],[A♣],[J♥],[  ]
    players:
    0 [  ],[  ] stack: 0
    1 [2♥],[8♦] stack: 0
    2 [J♦],[7♥] stack: 28665
    3 [  ],[  ] stack: 0
    Getting move from agent for player 2 (Agent: 3)
    Minimum raise: 25  As percentage is : 0.0008721437292865864
    Player 2 Move: [2,2867]
    Player 2 ('raise', 2867)
    total pot: 14202

Here you can see player1 checked because it's all-in, but the environment continued to ask player2 for a move, who then raised against a player who was already all-in, which obviously doesn't make sense. In fact, in this case it then repeated this again when the last community card was dealt.

In env.py there's this line:

    if not self._current_player.playedthisround and len([p for p in players if not p.isallin]) >= 1:

Initially I thought the `>= 1` should be changed to `> 1`, but that would cause a problem if player1 had raised and gone all-in: with that change the environment would not ask Player2 for a move, because Player2 would then be the only player who is not all-in.
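One possible way to tell the two cases apart is to also look at whether the player still faces an unmatched bet. The sketch below is untested against env.py and is not the repository's code: the `Player` class, the `currentbet` and `folded` fields, and the `needs_to_act` function are hypothetical stand-ins. Only `isallin` and `playedthisround` appear in the line quoted above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Player:
    # Hypothetical stand-in for the environment's player object.
    stack: int
    currentbet: int = 0            # chips committed in this betting round (assumed field)
    isallin: bool = False
    folded: bool = False           # assumed field
    playedthisround: bool = False


def needs_to_act(player: Player, players: List[Player]) -> bool:
    """Return True if `player` should be asked for a move."""
    if player.folded or player.isallin:
        return False
    live = [p for p in players if not p.folded]
    to_call = max(p.currentbet for p in live)
    # Facing an unmatched bet: the player must still call or fold,
    # even if every opponent is all-in.
    if player.currentbet < to_call:
        return True
    # Nothing to call: only ask when at least one other player
    # could respond to a bet.
    can_bet = [p for p in live if not p.isallin]
    return len(can_bet) > 1 and not player.playedthisround
```

With this condition, the player in the log above would not be asked for a move (nothing to call, nobody left to respond), while a player facing an all-in raise would still be asked.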

I could of course program my agent to check if all other players are all-in and if so then disable the ability to raise, but I think the environment should handle this better. I'm just not sure how...
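The agent-side workaround mentioned above could look roughly like the following. This is a minimal sketch with assumptions: the action codes are inferred only from the log (`[0,0]` was logged as a check and `[2,2867]` as a raise), `restrict_move` and its parameters are hypothetical names, and it only covers the case in the log where there is nothing to call.

```python
from typing import List, Sequence

# Action codes inferred from the log above (assumption, not taken from the library).
CHECK = 0
RAISE = 2


def restrict_move(move: Sequence[int],
                  opponents_all_in: Sequence[bool],
                  to_call: int = 0) -> List[int]:
    """Turn a raise into a check when no opponent is able to respond.

    `opponents_all_in` holds one flag per opponent still in the hand.
    Moves are left unchanged when there is an outstanding bet to call.
    """
    action, amount = move
    if action == RAISE and to_call == 0 and all(opponents_all_in):
        return [CHECK, 0]
    return [action, amount]
```

The agent would pass each proposed move through `restrict_move` before handing it to the environment, so a raise is only sent when at least one opponent still has chips.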

VinQbator commented 5 years ago

Can you test whether you see a similar issue with the https://github.com/VinQbator/holdem fork?

VinQbator commented 5 years ago

Should be fixed in my fork now.