@ArshartCloud You are right. State features are usually very important when training agents. The wrapper here is just an example and is not necessarily the best one. You can customize the env wrapper to do better feature engineering.
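For reference, here is a minimal sketch of such a wrapper. It is only an illustration: it assumes an RLCard-style env whose state dicts carry an `obs` array and a `raw_obs` dict, and whose `reset()` / `step()` return `(state, player_id)`; key names such as `all_chips` may differ between RLCard versions, so adjust to your install.

```python
import numpy as np
import rlcard

class FeatureWrapper:
    """Post-process RLCard state dicts before they reach the agent.

    Sketch only: assumes state dicts contain an 'obs' array and a
    'raw_obs' dict, and that env.reset()/env.step() return
    (state, player_id). Adjust key names to your RLCard version.
    """

    def __init__(self, env):
        self.env = env

    def _augment(self, state):
        raw = state.get('raw_obs', {})
        # Example extra feature: total chips in the pot (sum, not max),
        # crudely normalized. Purely illustrative feature engineering.
        all_chips = raw.get('all_chips', [])
        pot = float(sum(all_chips)) if all_chips else 0.0
        state['obs'] = np.concatenate([state['obs'], [pot / 100.0]])
        return state

    def reset(self):
        state, player_id = self.env.reset()
        return self._augment(state), player_id

    def step(self, action):
        state, player_id = self.env.step(action)
        return self._augment(state), player_id

    def __getattr__(self, name):
        # Delegate everything else (num_players, num_actions, ...) to the env.
        return getattr(self.env, name)


# Usage sketch:
# env = FeatureWrapper(rlcard.make('no-limit-holdem'))
# state, player_id = env.reset()
```

Note that if your agent reads `env.state_shape` to size its network, that shape would need to be enlarged to match the extra feature.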
I have a follow-up question about the observation space of limit-texas: it looks like the observation only contains the disclosed cards, without distinguishing private cards from community cards. Isn't this representation a bit problematic when trying to infer other players' strategies?
@cuijiaxun Yeah, that is true. The state space of Texas Hold'em is not carefully designed. I expect the agent would be much stronger if the state features were tuned, like what we have done for the DouDizhu game. The state representation of AlphaHoldem could be borrowed here.
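Concretely, the AlphaHoldem-style idea is to encode the hole cards and each community-card street as separate binary planes, so the network can tell private information from public information. A rough sketch of that idea (not AlphaHoldem's exact encoding; card codes are assumed to follow RLCard's 'SA' / 'HT' style, suit first and rank second):

```python
import numpy as np

SUITS = 'CDHS'
RANKS = 'A23456789TJQK'

def card_plane(cards):
    """Encode card strings such as ['SA', 'HT'] as a 4x13 binary plane.
    Assumes RLCard-style codes: first character = suit, second = rank."""
    plane = np.zeros((4, 13), dtype=np.float32)
    for card in cards:
        plane[SUITS.index(card[0]), RANKS.index(card[1])] = 1.0
    return plane

def encode_cards(hand, public_cards):
    """Stack separate planes for the hole cards and each street so the
    network can distinguish private information from public information."""
    planes = [
        card_plane(hand),                # the player's own hole cards
        card_plane(public_cards[:3]),    # flop
        card_plane(public_cards[3:4]),   # turn
        card_plane(public_cards[4:5]),   # river
        card_plane(hand + public_cards), # everything visible to the player
    ]
    return np.stack(planes)              # shape (5, 4, 13)

# e.g. encode_cards(['SA', 'HT'], ['D2', 'C9', 'SK'])
```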
In the obs description, it says that:
However, in line 70 of nolimitholdem.py,
obs[53] = float(max(all_chips))
it uses max rather than sum, which means obs[53] is the highest number of chips any single player has put in the pot (though if the opponent's chips were lower than yours the game would have ended, so in practice this can only be the opponent's chips). Is that right?
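A toy example of the difference I mean (the extra slot at the end is purely illustrative, not something the current env provides):

```python
# Toy illustration of the two aggregations in heads-up play.
all_chips = [10, 40]          # chips each player has put into the pot so far

largest_bet = max(all_chips)  # 40 -- what obs[53] stores today
pot_size = sum(all_chips)     # 50 -- the total pot

# If both values matter, one option would be to keep obs[53] = largest_bet
# and append the pot size as an extra entry, which would mean enlarging
# the observation vector.
```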