Farama-Foundation / HighwayEnv

A minimalist environment for decision-making in autonomous driving
https://highway-env.farama.org/
MIT License

On the use of continuous control methods at intersections #511

Open yshichseu opened 11 months ago

yshichseu commented 11 months ago

Dear author, thank you for your previous guidance, which has helped me solve many problems. In my current research on intersection motion planning, I would like to apply algorithms such as DDPG to autonomous vehicle control at intersections. However, when I switched to continuous kinematic control and enabled longitudinal control, the vehicle drove straight off the road boundary during operation, so I obtained no useful results. Can this problem be solved by changing the reward function or increasing the number of training iterations, or what settings are needed to keep the vehicle within the given lane as much as possible? Thank you very much!

yshichseu commented 11 months ago

Hello dear author! I have run a large amount of training (10k) with the DDPG algorithm, hoping the vehicle would learn not to cross the lane line (the episode terminates when it reaches a lane boundary), but the results have barely changed. I built my own environment on top of the intersection class, enabling only continuous control with longitudinal control alone. I suspect there is a problem with my environment design, but as a beginner I don't know how to modify it properly. I hope to receive your guidance, thank you!
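For reference, a setup like the one described (continuous control on an intersection-style env, longitudinal only) can be sketched as a config dict. This is a minimal sketch following highway-env's standard `"action"` config schema; the exact keys accepted by a custom subclass may differ, and the values here are illustrative:

```python
# Sketch of an env config enabling continuous, longitudinal-only control
# on an intersection-style environment. Assumes highway-env's documented
# "action" config schema; adapt to your own environment subclass.
intersection_config = {
    "action": {
        "type": "ContinuousAction",  # continuous kinematic control
        "longitudinal": True,        # acceleration commanded by the agent
        "lateral": False,            # steering not part of the action space
    },
    "offroad_terminal": True,        # terminate when leaving the road
}

# Usage (assuming highway-env and gymnasium are installed):
# import gymnasium as gym
# env = gym.make("intersection-v0", config=intersection_config)
```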

huang6668 commented 11 months ago

I asked the same question some time ago #258

yshichseu commented 11 months ago

Thank you for your reply. I have read your content in this question. Have you resolved it in the end? I really hope to introduce continuous control into the intersection environment. Can you provide me with further guidance? Thank you!


huang6668 commented 11 months ago

Sorry, I didn't continue researching this afterwards. However, I found that if I increase the penalty for deviating from the road, after training for a while the agent will keep going straight along the road. But when it reaches an intersection, the agent doesn't know which way to turn. Maybe what the author mentioned, that the "reward has a term for getting closer to the goal location" and "Also don't forget that the observation should also include information about the path / destination", is needed in this case.
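The shaping idea above (penalize road deviation, reward progress toward the goal) can be sketched as a reward function. This is a hypothetical function, not highway-env's built-in reward; `lateral_offset`, `goal_distance`, the target speed, and the weights are all illustrative:

```python
import math

def shaped_reward(speed, lateral_offset, goal_distance, on_road,
                  v_target=8.0, w_speed=0.4, w_offset=0.3, w_goal=0.3):
    """Hypothetical shaped reward: speed tracking, a lane-deviation
    penalty, and a term for getting closer to the goal location."""
    if not on_road:
        return -1.0  # hard penalty for leaving the road
    speed_term = math.exp(-abs(speed - v_target))   # near the target speed
    offset_term = math.exp(-abs(lateral_offset))    # near the lane center
    goal_term = 1.0 / (1.0 + goal_distance)         # closer to goal is better
    return w_speed * speed_term + w_offset * offset_term + w_goal * goal_term
```

Each term is bounded in [0, 1], so the weights directly control the trade-off; the off-road case returns a fixed large negative value so that leaving the road always dominates any in-lane reward.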

eleurent commented 11 months ago

Hey, yes, for lane tracking and continuous control to work you should ensure two things:

A. The agent's observation contains information about what the desired trajectory is, and where the vehicle is currently located (plus its speed) with respect to this desired trajectory.

B. The env's reward function penalizes the agent for deviating too much from the desired trajectory.

Both of these conditions are satisfied, e.g., in the racetrack-env, which is an example of a lane-following continuous control environment.
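Condition A above (observing where the vehicle is relative to the desired trajectory) can be sketched as a small Frenet-style feature extractor. This is an illustrative helper, not part of highway-env's API; `pos`, `heading`, and the waypoint `path` are hypothetical inputs:

```python
import math

def trajectory_features(pos, heading, path):
    """Project the vehicle onto a piecewise-linear desired trajectory and
    return (progress s along the path, signed lateral offset d, heading
    error). `path` is a list of (x, y) waypoints; all names illustrative."""
    best = (float("inf"), 0.0, 0.0, 0.0)  # (dist^2, s, d, segment heading)
    s_along = 0.0
    for (x0, y0), (x1, y1) in zip(path[:-1], path[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        # projection parameter onto the segment, clamped to [0, 1]
        t = max(0.0, min(1.0, ((pos[0]-x0)*dx + (pos[1]-y0)*dy) / length**2))
        px, py = x0 + t*dx, y0 + t*dy
        d2 = (pos[0]-px)**2 + (pos[1]-py)**2
        if d2 < best[0]:
            # signed offset: positive when left of the path direction
            d = ((pos[0]-x0)*(-dy) + (pos[1]-y0)*dx) / length
            best = (d2, s_along + t*length, d, math.atan2(dy, dx))
        s_along += length
    _, s, d, path_heading = best
    # wrap the heading error into (-pi, pi]
    heading_error = math.atan2(math.sin(heading - path_heading),
                               math.cos(heading - path_heading))
    return s, d, heading_error
```

Feeding `(s, d, heading_error)` plus speed into the observation gives the policy exactly the trajectory-relative information described in point A, and the same `d` can drive the deviation penalty of point B.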