mit-acl / gym-collision-avoidance

Agents' abnormal behaviors #19

Closed by jx2518 11 months ago

jx2518 commented 11 months ago

Hi! While exploring the project, I followed the instructions in the documentation, starting from gym_collision_avoidance/experiments/src/example.py, and edited it to change some of the agents' behaviors. I then noticed some abnormal agent behaviors. I wonder whether they come from flaws in the policies or from mistakes in my code?

First, I ran one agent with the CADRL policy (in orange) and another with the noncoop policy (as a reference obstacle). When I increase the range of the destination to (25, 25), the agent doesn't move.

image

But with the same setup, switching to the GA3C_CADRL policy works.

image

However, after increasing the range of the destination to 35, the GA3C_CADRL policy fails as well.

image

Could you please take a look at this? Thank you very much!

mfe7 commented 11 months ago

Thanks for sharing this behavior and the screenshots! I am not too surprised by it, since the subgoals are much further away than what was normally seen during training. Also, the CADRL policy was trained in "smaller" scenarios compared to GA3C-CADRL, so that makes sense to me as well. In practice, we treat these policies as local planners, so we usually have a global planner that takes goals at a range of ~25 m and regularly (~10 Hz) provides a local subgoal ~5-10 m in front of the vehicle that should lead toward the longer goal. Alternatively, one could re-train the policies in larger environments, but that's not something we tried much.
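As a rough illustration of the subgoal idea described above (this is not code from this repo), the sketch below clips a distant goal to a nearby point on the straight line toward it. The function name `local_subgoal` and the `lookahead` parameter are hypothetical, and the straight-line assumption ignores static obstacles, which a real global planner would route around.

```python
import math


def local_subgoal(position, global_goal, lookahead=7.5):
    """Return a point at most `lookahead` meters from `position`,
    on the straight line toward `global_goal`.

    Hypothetical helper for illustration only; a real global planner
    would follow a planned path rather than a straight line.
    """
    dx = global_goal[0] - position[0]
    dy = global_goal[1] - position[1]
    dist = math.hypot(dx, dy)
    if dist <= lookahead:
        # The goal is already within the range the policy can handle.
        return (global_goal[0], global_goal[1])
    scale = lookahead / dist
    return (position[0] + scale * dx, position[1] + scale * dy)


# Example: the far goal from the report, replaced by a nearby subgoal.
subgoal = local_subgoal((0.0, 0.0), (25.0, 25.0), lookahead=7.5)
print(subgoal)
```

In a simulation loop, one would recompute the subgoal from the agent's current position at each planning step (for example at ~10 Hz) and pass it to the policy in place of the far goal. How the goal is actually set on an agent depends on the environment's API and is not shown here.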

jx2518 commented 11 months ago

Thank you so much for the reply and explanations; that makes a lot of sense to me! Would you be able to point me to the part of the code in the repo where I can learn more about the local planner and/or global planner?

mfe7 commented 11 months ago

There isn’t really any code related to that distinction here in this repo. The cadrl_ros repo is an example of a local planner that wraps around the ga3c-cadrl policy. Bruno Brito’s 2021 RAL paper talks about one way of handling the interface between global and local planners, for example.

mfe7 commented 11 months ago

https://arxiv.org/abs/2102.13073

jx2518 commented 11 months ago

Gotcha! Thank you so much for your help! I actually do have one more unrelated question. But to keep a cleaner format, I will open another request. I hope my continuous questions don't bother you too much!