TempleRAIL / drl_vo_nav

[T-RO 2023] DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
https://doi.org/10.1109/TRO.2023.3257549
GNU General Public License v3.0

Training problems #25

Closed: xcslab closed this issue 1 month ago

xcslab commented 2 months ago

Hello, I really appreciate the work you have done! I recently started training your DRL-VO directly and ran into some problems. The training loss curve and the related curves oscillate heavily. In the early stage of training, the agent moves towards the target point, but after 2-3 days of training it only rotates in place and no longer moves forward. Could you explain the training setup? How should I set the parameters to achieve the same results as your trained model?

zzuxzt commented 2 months ago

Thanks for your interest in our work. DRL training is always tricky, especially for robot navigation in a complex dynamic environment. Because DRL is sensitive to random seed initialization and to the dynamics of the training process, and because we are also using different hardware, it is difficult to figure out why your training does not work well. The rotate-in-place behavior does happen sometimes, as does a sudden breakdown after a long period of good training. This is normal in DRL training. Sometimes the policy recovers from such a crash on its own with longer training, and sometimes it does not. You could try retraining with a different random seed, lowering the learning rate, or increasing the weight of the entropy loss appropriately. In addition, you can select the best model saved before the sudden breakdown and use it for inference.

Frankly speaking, I also always struggle to train a good DRL model. DRL training is an art. All in all, I am not 100% confident that you can train a model as good as mine; I can only guarantee that you can train a model that works. By the way, the current training environment uses ground-truth pedestrian data, which is a bit different from the environment I originally trained on. This may have some impact, but I do not think it affects the results much. Good luck with your training.
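
For what it's worth, the "pick the best model before the breakdown" suggestion can be sketched in a few lines of plain Python. This is only a minimal illustration, not code from this repository: the `Checkpoint` class, the `best_checkpoint` and `collapse_step` helpers, the `drop_fraction` threshold, and the reward numbers are all hypothetical, and it assumes you periodically save the model and log a mean evaluation reward for each saved checkpoint.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Checkpoint:
    step: int           # training step at which the model was saved
    mean_reward: float  # mean evaluation reward logged for that checkpoint


def best_checkpoint(history: List[Checkpoint]) -> Checkpoint:
    """Return the checkpoint with the highest mean evaluation reward."""
    if not history:
        raise ValueError("history is empty")
    return max(history, key=lambda c: c.mean_reward)


def collapse_step(history: List[Checkpoint],
                  drop_fraction: float = 0.5) -> Optional[int]:
    """Return the step of the first checkpoint whose reward falls below
    drop_fraction times the best reward seen so far, or None if no such
    drop occurs. The 0.5 default is an arbitrary illustrative threshold."""
    best: Optional[float] = None
    for c in history:
        if best is not None and best > 0 and c.mean_reward < drop_fraction * best:
            return c.step
        if best is None or c.mean_reward > best:
            best = c.mean_reward
    return None


if __name__ == "__main__":
    # Invented numbers, purely to show the shape of a run that collapses.
    history = [
        Checkpoint(10_000, 1.0),
        Checkpoint(20_000, 4.0),
        Checkpoint(30_000, 6.5),
        Checkpoint(40_000, 1.2),
        Checkpoint(50_000, 0.8),
    ]
    print("best checkpoint step:", best_checkpoint(history).step)
    print("collapse detected at step:", collapse_step(history))
```

The idea is simply to keep every periodic checkpoint rather than only the latest one, so that a late collapse (such as the rotate-in-place behavior) does not cost you the earlier good policy.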

xcslab commented 1 month ago

Thank you very much for your reply. I will continue to study this work of yours.