Closed Cristal-yin closed 3 years ago
Hello author, I am a graduate student in China. I am preparing to graduate and plan to improve a game-playing algorithm. Since I have only just gotten started, I have been focusing on your project. I want to ask whether this project can be used with actor-critic algorithms (such as A3C or DDPG). I plan to use your platform with another algorithm and then improve it.
@Cristal-yin Thanks for your interest. We do not plan to further optimize the speed of Dou Dizhu for now, since further optimization would likely require reimplementing it in C++.
Yes, it is possible to connect the game engine to A3C, but it may require some effort since A3C is a multi-process algorithm. What I have in mind is to use the single-agent mode in RLCard, where the interface is OpenAI-Gym-like and the other agents are rule-based models. See https://github.com/datamllab/rlcard#api-cheat-sheet
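To make the idea concrete, here is a minimal sketch of the gym-like reset/step rollout loop an A3C worker would run against such an environment. `DummyDoudizhuEnv`, its state layout, and `run_episode` are all hypothetical stand-ins, not RLCard's actual API; in RLCard's single-agent mode the rule-based opponents would live inside the environment, as mocked here.

```python
import random

class DummyDoudizhuEnv:
    """Toy stand-in for a single-agent-mode card environment.

    The other players are assumed to be rule-based models hidden
    inside the env, so the learner sees a plain reset/step interface.
    """
    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        # A state dict with an observation and the legal actions,
        # loosely mirroring the shape of card-game states.
        return {'obs': [0] * 6, 'legal_actions': [0, 1, 2]}

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        # Toy payoff: reward only at the end, only for action 0.
        reward = 1.0 if (done and action == 0) else 0.0
        state = {'obs': [self.t] * 6, 'legal_actions': [0, 1, 2]}
        return state, reward, done

def run_episode(env, policy):
    """Standard gym-style rollout loop an A3C worker would execute."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total

if __name__ == '__main__':
    env = DummyDoudizhuEnv()
    ret = run_episode(env, lambda s: random.choice(s['legal_actions']))
    print(ret)
```

Because each worker only touches its own env instance, this loop parallelizes naturally across A3C's processes.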
Dou Dizhu is a challenging game, and it may be difficult to train RL from scratch. To make training feasible, I would recommend first generating training data using our rule model in https://github.com/datamllab/rlcard/blob/master/rlcard/models/doudizhu_rule_models.py
Then use supervised learning (SL) to train the agent. After the SL stage, continue training with RL. This should be easier than pure RL.
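The pipeline above can be sketched end to end on a toy problem: log (state, action) pairs from a rule policy, behavior-clone them, then refine the cloned policy with a simple reward-driven update. Every name here (`rule_policy`, `behavior_clone`, `rl_finetune`) is illustrative; the real pipeline would use the Dou Dizhu rule model and a neural policy instead of this tabular one.

```python
import random
from collections import Counter, defaultdict

def rule_policy(state):
    # Stand-in for the rule model: a fixed deterministic heuristic.
    return state % 3

def generate_sl_data(n_samples=200):
    """Step 1: roll out the rule model to log (state, action) pairs."""
    states = (random.randrange(10) for _ in range(n_samples))
    return [(s, rule_policy(s)) for s in states]

def behavior_clone(data):
    """Step 2 (SL): imitate the rule model by taking the most
    frequent logged action for each state."""
    counts = defaultdict(Counter)
    for state, action in data:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

def rl_finetune(policy, reward_fn, n_steps=2000, epsilon=0.1):
    """Step 3 (RL): epsilon-greedy improvement on top of the cloned
    policy, keeping whichever action scores a higher reward."""
    policy = dict(policy)
    for _ in range(n_steps):
        state = random.randrange(10)
        greedy = policy.get(state, 0)
        action = random.randrange(3) if random.random() < epsilon else greedy
        if reward_fn(state, action) > reward_fn(state, greedy):
            policy[state] = action  # exploration found a better action
    return policy
```

The point of the SL stage is exactly what it is in the real pipeline: the cloned policy starts near the rule model's behavior, so the RL stage explores from a sensible policy rather than from random play.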
Also, we may need a better neural architecture, such as a CNN or LSTM (currently it is just an MLP).
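To illustrate the difference: an MLP sees one flattened state vector, while an LSTM can consume the sequence of past moves. Below is a pure-NumPy single LSTM cell run over a move history; the weights are random and all shapes and names are illustrative, not the architecture RLCard would actually use.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b          # shape: (4 * hidden,)
    hid = h.shape[0]
    i = sigmoid(z[:hid])           # input gate
    f = sigmoid(z[hid:2 * hid])    # forget gate
    g = np.tanh(z[2 * hid:3 * hid])  # candidate cell state
    o = sigmoid(z[3 * hid:])       # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def encode_history(moves, hidden=8):
    """Encode a sequence of one-hot moves into a fixed-size vector,
    which a policy/value head could then consume."""
    rng = np.random.default_rng(0)
    in_dim = moves.shape[1]
    W = rng.standard_normal((4 * hidden, in_dim)) * 0.1
    U = rng.standard_normal((4 * hidden, hidden)) * 0.1
    b = np.zeros(4 * hidden)
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in moves:                # one step per past move
        h, c = lstm_step(x, h, c, W, U, b)
    return h
```

The final hidden vector summarizes the whole play history, which is exactly the information a flat MLP over the current state throws away.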
Thank you very much for your answer😊
A strong Dou Dizhu agent is supported at https://github.com/datamllab/rlcard/tree/master/rlcard/agents/dmc_agent
It is also supported at https://github.com/kwai/DouZero
Any news on Dou Dizhu optimizations?