Closed DoxakisCh closed 1 year ago
Issue #6669 implements self-play with PPO via the multi-agent API. But in PokerGame, the opponent agent's compute_action must be based on the observation produced after the RL agent's step, so I don't think the multi-agent API is the proper way to implement PokerGame's self-play.
If you have made any progress on PokerGame self-play, please feel free to reach out and discuss it with me.
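The ordering constraint above can be sketched without RLlib at all: in a turn-based game, the player to move must compute its action from the observation produced by the *previous* player's step, so a single synchronous step for both agents does not fit naturally. The sketch below is a hypothetical stand-in (a toy take-away game, a `random_policy` placeholder for compute_action), not RLlib code:

```python
import random

class TurnBasedGame:
    """Toy two-player game: players alternately remove 1-3 tokens;
    the player who takes the last token wins."""

    def __init__(self, tokens=10):
        self.tokens = tokens
        self.current_player = 0  # 0 = RL agent, 1 = opponent

    def step(self, action):
        """Apply the current player's move and hand the turn over."""
        self.tokens -= action
        winner = self.current_player if self.tokens <= 0 else None
        self.current_player = 1 - self.current_player
        return self.tokens, winner  # next observation, winner (or None)

def random_policy(obs):
    # Hypothetical stand-in for a policy's compute_action: pick a legal move.
    return random.randint(1, min(3, max(1, obs)))

def play_episode(seed=0):
    random.seed(seed)
    game = TurnBasedGame(tokens=10)
    obs = game.tokens
    while True:
        # Key point from the issue: the acting player's policy can only be
        # queried *after* the other player's step has produced this obs --
        # the two agents never act simultaneously.
        action = random_policy(obs)
        obs, winner = game.step(action)
        if winner is not None:
            return winner
```

This strictly alternating loop is what a simultaneous-step multi-agent formulation struggles to express, which is the objection raised above.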
Hello,
I want to use the AlphaZero agent of RLlib on a poker environment so that it learns to play via self-play. I understand that the current agent is designed only for single-player games. Is there any way to extend it so that it can learn via self-play on two-player adversarial games like chess and heads-up poker?