Acmece / rl-collision-avoidance

Implementation of the paper "Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning"
https://arxiv.org/abs/1709.10082

Output from training #24

Open juliogodoy opened 2 years ago

juliogodoy commented 2 years ago

Hello, thank you very much for providing this code. A student and I have been following the training example for Stage1, but when one of the environments reaches the maximum number of episodes, the code appears to enter an infinite loop and the other environments do not seem to continue their iterations. Is this supposed to occur? If not, what is the expected output after the number of episodes is completed?

Thanks,

Julio Godoy

Mealoore commented 2 years ago

I think the cause may be process synchronization. Some processes have finished while others are still running, so the running ones wait for the processes that have already finished.
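
To illustrate the kind of stall described here, below is a minimal sketch using only the Python standard library. It is not the repository's code: it assumes each worker has to meet all the others at a synchronization point once per episode, and it uses a `threading.Barrier` with a timeout as a stand-in for whatever collective call the training loop actually makes. The names `run_workers` and `episodes_per_worker` are hypothetical.

```python
import threading


def run_workers(episodes_per_worker, timeout=1.0):
    """Run one thread per worker; each syncs at a barrier once per episode.

    Returns a list with "finished" or "stalled" for each worker.
    The barrier is only a stand-in (an assumption) for a blocking
    collective call shared by all training processes.
    """
    n = len(episodes_per_worker)
    barrier = threading.Barrier(n)
    status = [None] * n

    def worker(rank, episodes):
        try:
            for _ in range(episodes):
                # Every worker must arrive here before any can continue.
                barrier.wait(timeout=timeout)
            status[rank] = "finished"
        except threading.BrokenBarrierError:
            # Raised when a peer never arrives and the wait times out.
            # Without a timeout, this worker would block forever.
            status[rank] = "stalled"

    threads = [
        threading.Thread(target=worker, args=(rank, episodes))
        for rank, episodes in enumerate(episodes_per_worker)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return status


if __name__ == "__main__":
    print(run_workers([3, 3, 3]))  # all workers run the same number of episodes
    print(run_workers([2, 3, 3]))  # worker 0 leaves the loop one episode early
```

In the second call, worker 0 exits after two episodes, so the remaining workers never see a full group at the third synchronization point. The timeout is there only so the sketch terminates; with a plain blocking wait the remaining workers would hang, which would look like the infinite loop reported above.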