HumanCompatibleAI / adversarial-policies

Find best-response to a fixed policy in multi-agent RL
MIT License

How to modify the win condition? #60

Open AndssY opened 1 year ago

AndssY commented 1 year ago

Thanks for your nice work! I am trying to reproduce this work by writing it myself, but I have some questions about the win condition of Sumo Humans. I noticed the win condition in the paper is modified: "A player wins by remaining standing after their opponent has fallen." So, how can I modify the win condition? Can you please give more detailed instructions? When should I modify the win condition? Does ZooON vs. ZooON use Bansal's win condition or the modified version? And the same question for Rand/Zero vs. ZooON.

AdamGleave commented 1 year ago

The fork https://github.com/HumanCompatibleAI/multiagent-competition already contains the modified win condition. From memory I think we just removed the requirement that the agents touch each other for it to be a win (i.e. if an agent falls over, the other agent has won even if it didn't tackle the other agent).
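
In pseudocode, the difference is roughly the following (illustrative only, not the exact code in the fork):

```python
# Illustrative contrast of the two win conditions (not the fork's exact code).

def bansal_win(i_fallen: bool, opp_fallen: bool, touched: bool) -> bool:
    # Original Bansal et al. condition: the opponent must fall
    # *and* the agents must have made contact.
    return opp_fallen and not i_fallen and touched

def modified_win(i_fallen: bool, opp_fallen: bool) -> bool:
    # Modified condition: remaining standing after the opponent
    # has fallen is a win, contact or not.
    return opp_fallen and not i_fallen
```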

AndssY commented 1 year ago

> From memory I think we just removed the requirement that the agents touch each other for it to be a win

Thanks a lot!

But please forgive me, my coding ability is not that good. In my understanding, the function `def goal_rewards(self, infos=None, agent_dones=None)` in the file `multiagent-competition/gym_compete/new_envs/sumo.py` determines the win condition of the Sumo environment.

Specifically, that is in lines 82~105. But it (both on the fork and after checking out 3a3f9dc) seems to be the same as the unmodified version: the agent won't win if it didn't touch the other agent, so I am confused. [screenshot of the relevant code]

I wonder if some detail has been forgotten? In any case, I will try to train an adversarial policy by removing the touch requirement.
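
Concretely, I am thinking of a change along these lines (a minimal sketch of the kind of logic inside `goal_rewards`; the names `fallen`, `touched`, and `WIN_REWARD` are my own placeholders, not the actual identifiers in `sumo.py`):

```python
# Sketch of the win-condition logic with the touch requirement removed.
# All names below are placeholders, not the real identifiers in sumo.py.

WIN_REWARD = 1000.0  # placeholder value


def goal_rewards_modified(fallen, touched, num_agents=2):
    """Win condition without the touch requirement.

    fallen[i] is True if agent i has fallen; `touched` says whether
    the agents made contact (unused after the change).
    """
    rewards = [0.0] * num_agents
    for i in range(num_agents):
        opponent = 1 - i  # two-player Sumo
        # The original Bansal et al. condition would additionally
        # require `touched` here; dropping that conjunct is the change.
        if fallen[opponent] and not fallen[i]:
            rewards[i] += WIN_REWARD
            rewards[opponent] -= WIN_REWARD
    return rewards


# e.g. agent 0 stays standing, agent 1 falls, no contact occurred:
print(goal_rewards_modified(fallen=[False, True], touched=False))
# -> [1000.0, -1000.0] under the modified condition
```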