Closed — kaiyama12345679 closed this issue 11 months ago
Hello. When I implemented the code, I followed the original HATRPO/HAPPO repo [1] and the A2PO repo [2] in using a globally observable setting, so that the results are comparable. However, the code can be modified to accommodate partially observable settings.
[1] https://github.com/cyanrain7/TRPO-in-MARL/blob/e412f13da689f7b51750caf09b1a3567970550ad/envs/ma_mujoco/multiagent_mujoco/mujoco_multi.py#L147
[2] https://github.com/xihuai18/A2PO-ICLR2023/blob/db87ed05554a23ef3e27289b7981e0a2eb838bed/onpolicy/envs/ma_mujoco/multiagent_mujoco/mujoco_multi.py#L187 (their self.agent_obsk is None)
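To make the distinction concrete, here is a minimal sketch of the two observation settings. The method names `get_obs_agent` and the `agent_obsk` idea mirror the multiagent_mujoco wrapper referenced above, but the toy environment and the `collect_obs` helper are hypothetical stand-ins, not the actual HARL code:

```python
# Hypothetical sketch contrasting the globally observable setting
# (every agent receives the full state, as in the HATRPO/HAPPO and
# A2PO repos linked above) with a partially observable one where
# each agent only sees its own slice of the state.

class ToyMultiAgentEnv:
    """Minimal stand-in for a multi-agent MuJoCo-style environment."""

    def __init__(self, n_agents=2, per_agent_dim=3):
        self.n_agents = n_agents
        self.per_agent_dim = per_agent_dim
        # Flat global state: per_agent_dim entries per agent.
        self._state = [float(i) for i in range(n_agents * per_agent_dim)]

    def get_state(self):
        # Globally observable: the full state, identical for all agents.
        return list(self._state)

    def get_obs_agent(self, agent_id):
        # Partially observable: only this agent's slice of the state
        # (the real wrapper restricts visibility via agent_obsk
        # neighborhoods rather than a simple slice).
        lo = agent_id * self.per_agent_dim
        return self._state[lo:lo + self.per_agent_dim]


def collect_obs(env, partial):
    """Build the per-agent observation list for one step."""
    if partial:
        return [env.get_obs_agent(i) for i in range(env.n_agents)]
    # Global setting: every agent gets the same full state.
    return [env.get_state() for _ in range(env.n_agents)]
```

Switching a training loop from the global to the partial setting then amounts to swapping which of these two branches produces each agent's observation.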
Thanks for your reply. I've got it.
@Ivan-Zhong Do you mind sharing code with partial observation? Also, are all environments implemented with global observability? Thanks.
Hi, I have a question about MAMuJoCo. I would like to know why you use "state" instead of get_obs_agent(agent_id) as each agent's observation.
Thanks.
https://github.com/PKU-MARL/HARL/blob/7eda8202a0a4ffb6be15014a9f88ea4afc345b66/harl/envs/mamujoco/multiagent_mujoco/mujoco_multi.py#L209C45-L209C45