datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
http://www.rlcard.org
MIT License

cfr for doudizhu: TypeError: 'NoneType' object is not iterable #44

Closed by AIMan-Zzx 4 years ago

AIMan-Zzx commented 4 years ago

  File "/rlcard/agents/cfr_agent.py", line 72, in traverse_tree
    utility = self.traverse_tree(new_probs, player_id)
  File "/rlcard/rlcard/agents/cfr_agent.py", line 71, in traverse_tree
    self.env.step(action)
  File "/rlcard/rlcard/envs/env.py", line 62, in step
    next_state, player_id = self.game.step(self.decode_action(action))
  File "/rlcard/envs/doudizhu.py", line 94, in decode_action
    for legal_action in legal_actions:
TypeError: 'NoneType' object is not iterable
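The last frame of the traceback is the generic Python failure that occurs when a `for` loop is handed `None` instead of a list. The sketch below reproduces that error class outside RLCard. Every name in it (`get_legal_actions`, `decode_action`, `try_decode`, the `game_over` flag) is hypothetical, and returning `None` at the end of a game is only an assumed trigger; the thread does not establish the actual cause.

```python
def get_legal_actions(game_over):
    """Hypothetical helper: assumed to return None once the game is over."""
    if game_over:
        return None
    return ["pass", "3", "4"]


def decode_action(action, legal_actions):
    """Hypothetical decoder that scans the legal actions for a match."""
    for legal_action in legal_actions:  # TypeError here if legal_actions is None
        if legal_action == action:
            return legal_action
    return "pass"


def try_decode(action, game_over):
    """Decode an action and report a TypeError as text instead of crashing."""
    try:
        return decode_action(action, get_legal_actions(game_over))
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_decode("3", game_over=False))
print(try_decode("3", game_over=True))
```

If the cause were of this kind, checking whether the game is over before calling the decoder would avoid the crash, but that is a guess rather than a description of the fix that landed.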

daochenzha commented 4 years ago

@sunnyForIOS Could you provide more details so that we can reproduce the bug? For example, how many iterations did you run before the bug appeared, and did you set a random seed?

AIMan-Zzx commented 4 years ago

@daochenzha The issue disappears when I use the latest code, but another problem has appeared: how long does one epoch of CFR take on Dou Dizhu? I ran it on a server for two days without getting a result.

daochenzha commented 4 years ago

@sunnyForIOS Sounds good! It is hard to tell. Vanilla CFR may not work well for Dou Dizhu because of the extremely large search space: each iteration traverses the whole game tree. Some parallelization or abstraction techniques may need to be applied to make CFR practical. Another direction would be sampling-based methods; NFSP is a basic example. Monte Carlo Tree Search may also be useful in this case. I cannot tell which one would work. Which algorithms work best for Dou Dizhu is an open question and needs further exploration.
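To give a rough sense of why a full traversal becomes impractical as the search space grows, here is a minimal sketch that counts visited nodes in a uniform toy tree. It is not RLCard code and not a CFR implementation: `full_traversal` and `sampled_traversal` are hypothetical helpers, and the depth and branching factor are illustrative values far smaller than anything in Dou Dizhu.

```python
import random


def full_traversal(depth, branching):
    """Count nodes when every action at every node is expanded."""
    if depth == 0:
        return 1
    return 1 + sum(full_traversal(depth - 1, branching) for _ in range(branching))


def sampled_traversal(depth, branching, rng):
    """Count nodes when only one randomly chosen action is followed per node."""
    visited = 1
    while depth > 0:
        rng.randrange(branching)  # pick one action; the others are not expanded
        visited += 1
        depth -= 1
    return visited


rng = random.Random(0)
full = full_traversal(depth=6, branching=5)
sampled = sampled_traversal(depth=6, branching=5, rng=rng)
print(full, sampled)  # 19531 7
```

The full count grows exponentially with depth, while the sampled count grows linearly. Sampling trades that cost for variance, so more iterations are needed, and the sketch says nothing about which method converges well on Dou Dizhu.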