Why trained policy is not as good as yours

Acmece / rl-collision-avoidance

Implementation of the paper "Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning"

https://arxiv.org/abs/1709.10082

326 stars, 92 forks

Why trained policy is not as good as yours #18

Open gwhan98 opened 3 years ago

gwhan98 commented 3 years ago

Hi, I followed all your steps and trained the policy from scratch for stage 1.

I have not been able to get a policy as good as yours (it still crashes every time), even after training for 12 hours.

May I ask whether you did anything special to train the policy? I have tried many times but cannot get a good one, and the results when training from scratch are very poor.

Acmece commented 3 years ago

It's hard to say. You could try training for longer and see whether the performance improves. For reference, I used three machines for distributed training.
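
One way to judge whether longer training is actually helping is to smooth the per-episode rewards and check whether the trend is still rising. The following is a minimal sketch, assuming the rewards are available as a plain Python list; the `moving_average` helper and the sample numbers are hypothetical and are not part of this repository.

```python
from collections import deque


def moving_average(values, window=100):
    """Return the running mean over the last `window` values."""
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        # Drop the oldest value from the sum before the deque evicts it.
        if len(buf) == window:
            total -= buf[0]
        buf.append(v)
        total += v
        out.append(total / len(buf))
    return out


# Hypothetical per-episode rewards, for illustration only.
rewards = [-10.0, -8.0, -9.0, -4.0, -2.0, 1.0, 3.0, 5.0]
smoothed = moving_average(rewards, window=4)

# A rising smoothed curve suggests the policy is still improving.
still_improving = smoothed[-1] > smoothed[3]
print(still_improving)
```

If the smoothed curve has been flat for a long stretch, more training time alone may not help, and it may be worth comparing the setup (for example, the number of machines used) against the one described above.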