zzz259758 closed this 3 years ago
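@zzz259758 DQN is an off-policy algorithm and has relatively high memory requirements, mainly because of its replay memory. You could try some on-policy methods instead, such as PPO or A2C.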
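To make the memory point concrete, here is a minimal sketch of a capped replay memory together with a rough size estimate. The names `BoundedReplayBuffer` and `estimate_buffer_bytes` are hypothetical and are not taken from the RLCard code, and the numbers in the usage note are purely illustrative, not the actual doudizhu settings.

```python
import random
from collections import deque


def estimate_buffer_bytes(capacity, state_size, bytes_per_value=4):
    """Rough lower bound on replay memory size in bytes.

    Assumes each transition stores a state and a next state of
    `state_size` values each; actions, rewards, and Python object
    overhead are ignored, so the real footprint is larger.
    """
    per_transition = 2 * state_size * bytes_per_value
    return capacity * per_transition


class BoundedReplayBuffer:
    """Replay memory that drops the oldest transition once it is full."""

    def __init__(self, capacity):
        # deque(maxlen=...) evicts from the left when a new item is appended
        self.memory = deque(maxlen=capacity)

    def save(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Copy to a list so sampling does not index into the deque repeatedly
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)
```

For example, `estimate_buffer_bytes(20000, 1000)` gives 160,000,000 bytes (about 160 MB) for a single agent under these assumptions. With several agents, each keeping its own buffers, the total grows accordingly, so lowering the capacity is one thing to try when RAM is tight.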
Maybe you can try the PyTorch implementation? I know the TensorFlow NFSP implementation would eat up 500 GB of RAM for my game. I'm not sure whether that was due to a programming error on my part or to the agent code. With PyTorch it barely uses 3 GB after 50,000 episodes.
After training with doudizhu_nfsp for a while, usually about 5 minutes, the following error appears:

INFO - Agent nfsp1_dqn, step 5000, rl-loss: 0.2763792872428894
INFO - Copied model parameters to target network.
INFO - Agent nfsp2_dqn, step 5000, rl-loss: 0.2676895558834076
INFO - Copied model parameters to target network.
INFO - Agent nfsp0_dqn, step 17416, rl-loss: 0.05982016772031784
Process finished with exit code -1073740791 (0xC0000409)

I suspect it is a memory problem. I have 12 GB of RAM, and it is almost full during training.
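One way to check whether memory really grows steadily before the crash is to log it at intervals during training. The following is only a sketch using the standard library's `tracemalloc`. The helper names and the `train_step` callback are hypothetical stand-ins for the real training loop, and `tracemalloc` only counts allocations made through Python's allocator, so memory held natively by TensorFlow may not show up here.

```python
import tracemalloc


def format_memory_report(step, current_bytes, peak_bytes):
    """Format traced memory as a short log line in MiB."""
    mib = 1024 * 1024
    return "step %d: python-heap current=%.1f MiB peak=%.1f MiB" % (
        step, current_bytes / mib, peak_bytes / mib)


def run_with_memory_log(train_step, total_steps, log_every=1000):
    """Call train_step(step) repeatedly and record memory every log_every steps.

    train_step is a placeholder for one iteration of the real training loop.
    """
    tracemalloc.start()
    reports = []
    try:
        for step in range(1, total_steps + 1):
            train_step(step)
            if step % log_every == 0:
                current, peak = tracemalloc.get_traced_memory()
                reports.append(format_memory_report(step, current, peak))
    finally:
        # Always stop tracing, even if the training step raises
        tracemalloc.stop()
    return reports
```

If the reported "current" value keeps climbing between log lines, the growth is on the Python side (for example an unbounded list of transitions). If it stays flat while the process still fills RAM, the memory is more likely held by native code.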
What kind of hardware configuration is generally used for training?