hijkzzz / pymarl2

Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
https://iclr-blogposts.github.io/2023/blog/2023/riit/
Apache License 2.0

QMIX win rate stays at 0 on the corridor map, and an illegal-action error occurs #32

Closed: swordest closed this issue 1 year ago

swordest commented 1 year ago

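With qmix_high_sample_efficiency.yaml, test_battle_won_mean stays at 0 on the harder corridor map (the same actually happens with qmix.yaml). Could you advise where the problem might be? The full output file is attached.

There is also a more intermittent problem: in some runs, an illegal action triggers an assertion error and the program exits immediately (see the cout output below). Is there a known fix for this?

Looking forward to your reply. Thanks!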

[INFO 10:16:50] my_main t_env: 3086524 / 10050000
[INFO 10:16:50] my_main Estimated time left: 22 hours, 38 minutes, 44 seconds. Time passed: 6 hours, 22 minutes, 13 seconds
[INFO 10:16:58] my_main Recent Stats | t_env: 3086524 | Episode: 81500
battle_won_mean: 0.0000 ep_length_mean: 42.6525 epsilon: 0.0500 grad_norm: 0.2738
loss_td: 0.0163 q_taken_mean: 0.6618 return_mean: 9.7562 return_std: 1.3360
target_mean: 0.6574 td_error_abs: 0.0163 test_battle_won_mean: 0.0000 test_ep_length_mean: 43.0625
test_return_mean: 10.0525 test_return_std: 1.6540
[INFO 10:17:52] my_main Updated target network
Process Process-2:
Traceback (most recent call last):
  File "/home/ts1-guest/anaconda3/envs/pymarl/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/ts1-guest/anaconda3/envs/pymarl/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ts1-guest/RL_study/pymarl2/src/runners/parallel_runner.py", line 233, in env_worker
    reward, terminated, env_info = env.step(actions)
  File "/home/ts1-guest/RL_study/pymarl2/src/envs/starcraft/StarCraft2Env.py", line 406, in step
    sc_action = self.get_agent_action(a_id, action)
  File "/home/ts1-guest/RL_study/pymarl2/src/envs/starcraft/StarCraft2Env.py", line 477, in get_agent_action
    assert avail_actions[action] == 1, \
AssertionError: Agent 1 cannot perform action 14
CloseHandler: 127.0.0.1:54100 disconnected

cout.txt
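For readers following the traceback: the assertion in `get_agent_action` fires when the action index sent to the environment is not marked as available for that agent. The sketch below is not pymarl2's controller code; it only illustrates, with a hypothetical `select_action` helper, how an epsilon-greedy choice can be restricted to the available-action mask so that an index like 14 is never emitted when its mask entry is 0.

```python
import random


def select_action(q_values, avail_actions, epsilon, rng=random):
    """Epsilon-greedy choice restricted to the available-action mask.

    Hypothetical helper for illustration only; not part of pymarl2 or smac.
    q_values and avail_actions are per-agent lists of equal length, and
    avail_actions[a] == 1 means action a is currently legal.
    """
    legal = [a for a, ok in enumerate(avail_actions) if ok == 1]
    if not legal:
        raise ValueError("no available action for this agent")
    if rng.random() < epsilon:
        # Explore: sample uniformly among legal actions only.
        return rng.choice(legal)
    # Exploit: best Q-value among legal actions only.
    return max(legal, key=lambda a: q_values[a])


q = [0.1, 9.0, 0.3]
mask = [1, 0, 1]  # action 1 has the highest Q-value but is unavailable
greedy_action = select_action(q, mask, epsilon=0.0)
```

If the mask used at selection time and the mask checked inside `env.step` come from different environment versions or different time steps, a selection like this can still fail the assertion, which is one reason to check the environment version.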

hijkzzz commented 1 year ago

Here is my analysis.

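One possibility is that you are running SC2.4.6. As I recall, on that version the number of exploration steps should be raised to 500k. My own tests used SC2.4.10 + PyTorch 1.12.1.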
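As a rough illustration of what 500k exploration steps means, here is a minimal sketch of a linear epsilon schedule that decays over the first 500,000 environment steps and then stays flat. The function name `linear_epsilon` and the start value of 1.0 are assumptions for illustration; the final value 0.05 matches the `epsilon: 0.0500` shown in the log above. In pymarl-style configs the corresponding key is, as far as I know, `epsilon_anneal_time`, but verify that against the yaml you are running.

```python
def linear_epsilon(t_env, anneal_time=500_000, start=1.0, finish=0.05):
    """Linearly decay epsilon from start to finish over anneal_time steps.

    Hypothetical helper; start=1.0 is an assumed value, finish=0.05
    matches the epsilon reported in the training log.
    """
    if t_env >= anneal_time:
        # After the anneal period epsilon stays at its final value.
        return finish
    frac = t_env / anneal_time
    return start + frac * (finish - start)


halfway = linear_epsilon(250_000)    # still exploring heavily mid-anneal
late = linear_epsilon(3_086_524)     # the t_env from the log, fully annealed
```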

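Another possible issue: I noticed that the smac framework has been updated frequently of late: https://github.com/oxwhirl/smac/commits/master

My starcraftenv.py has not been kept in sync with those updates, so check this first. To roll back:

pip uninstall smac
pip install git+https://github.com/oxwhirl/smac.git@456d133f40030e60f27bc7a85d2c5bdf96f6ad56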
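To confirm which smac build is actually installed after a rollback, something like the following sketch may help. It uses only the standard library's `importlib.metadata` and reads the PEP 610 `direct_url.json` record that pip writes for installs from a VCS URL. The helper name `describe_install` is hypothetical, and the distribution name `"smac"` is an assumption taken from the pip commands; if pip did not record a direct URL, the commit comes back as `None`.

```python
import json
from importlib import metadata


def describe_install(dist_name):
    """Return (version, commit_id) for an installed distribution, or None.

    Hypothetical helper. commit_id is taken from pip's PEP 610
    direct_url.json record and is None when no such record exists.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    commit_id = None
    raw = dist.read_text("direct_url.json")  # None if the file is absent
    if raw:
        commit_id = json.loads(raw).get("vcs_info", {}).get("commit_id")
    return dist.version, commit_id


# "smac" is the assumed distribution name; prints None if it is not installed.
print(describe_install("smac"))
```

If the printed commit does not match the hash used in the install command, the environment is probably still picking up a different smac build.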

hijkzzz commented 1 year ago

I have rerun this map and can reproduce the reported results, so the problem lies in the smac version or the SC version. Please check those.


swordest commented 1 year ago

Thanks for the detailed reply. I am using SC2.4.10 + PyTorch 1.12.1. Over the next couple of days I will roll back the smac version and see how it performs.