Replicable-MARL / MARLlib

One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)
https://marllib.readthedocs.io
MIT License

There is a bug in def central_value_function(self, state, opponent_actions=None) in cc_mlp.py that needs to be fixed. #200

Open Maxwell-R opened 10 months ago

Maxwell-R commented 10 months ago

If you use Gym's MultiDiscrete as the action space and run the MAPPO algorithm (any algorithm that uses cc_mlp.py is affected), the following error occurs:

_File "D:\anaconda3\envs\ray\lib\site-packages\marllib\marl\models\zoo\mlp\cc_mlp.py", line 125, in central_value_function x = torch.cat([x.reshape(B, -1)] + opponent_actionsls, 1) RuntimeError: Tensors must have same number of dimensions: got 2 and 3

This is because def central_value_function(self, state, opponent_actions=None) in cc_mlp.py does not handle the MultiDiscrete case, so the tensors in opponent_actions_ls end up with a different number of dimensions than [x.reshape(B, -1)] and the concatenation fails. Please add the computation of opponent_actions_ls for the MultiDiscrete case.
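
The snippet below is a minimal, self-contained sketch of the mismatch and one possible workaround: flattening each opponent-action tensor to 2-D before the torch.cat call. The tensor shapes and the flattening step are illustrative assumptions for this report, not the project's actual fix:

```python
import torch

# Sketch of the dimension mismatch (shapes are assumptions for illustration):
# with a MultiDiscrete action space, an opponent-action tensor can arrive as
# 3-D (batch, n_agents, n_action_dims), while x.reshape(B, -1) is 2-D, so
# torch.cat along dim 1 raises the RuntimeError from the traceback above.
B = 4                                  # batch size
x = torch.randn(B, 64)                 # encoded central-state features
opponent_actions_ls = [
    torch.randn(B, 3, 5),              # MultiDiscrete opponent: 3-D tensor
    torch.randn(B, 5),                 # Discrete opponent: already 2-D
]

# Possible workaround (an assumption, not the maintainers' fix): flatten every
# opponent-action tensor to (batch, -1) so all operands are 2-D.
opponent_actions_ls = [a.reshape(B, -1) for a in opponent_actions_ls]
out = torch.cat([x.reshape(B, -1)] + opponent_actions_ls, 1)
print(out.shape)  # torch.Size([4, 84]) -> 64 + 3*5 + 5
```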


Theohhhu commented 9 months ago

Thank you for bringing this to our attention. We are currently investigating the issue.