reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

the convergence of network #38

Closed cx-cheng closed 1 year ago

cx-cheng commented 1 year ago

Thank you for your open-source code; it has been helpful to me. I tried to reproduce your experiment, but I couldn't get good navigation performance. After many episodes of training, the mobile robot was still unable to avoid obstacles. Could you tell me the final navigation accuracy of your experiment?

reiniscimurs commented 1 year ago

Hi,

See a bit of explanation here: https://github.com/reiniscimurs/DRL-robot-navigation/issues/19

A fully trained network should be able to execute 97~99% of episodes without collisions, but it does not reach 100% collision-free behavior.

cx-cheng commented 1 year ago

Thank you for your reply. How many rounds did you train for? I tried using laser scan data as input instead of Velodyne, but the results were poor. I am confused about this; could you give me some suggestions?

reiniscimurs commented 1 year ago

Usually I would train for 100 epochs, which generally converges well. However, you can see proper behavior emerge around 20 to 40 epochs.

Laser scan inputs should work, and there is no large difference in performance, as long as you bin the data correctly. You would have to explain the process you went through to set up training with the laser scan for me to give any suggestions. But as said, generally there is no difference between using flattened Velodyne scans or laser scans directly.
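For reference, the binning step can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes a 1D array of laser ranges and reduces it to a fixed number of sectors by taking the minimum range per sector, similar to how the repo flattens the Velodyne point cloud into a 20-dimensional state input. The function name and parameters are hypothetical.

```python
import numpy as np

def bin_scan(ranges, num_bins=20, max_range=10.0):
    """Reduce a raw laser scan to num_bins sector minimums.

    Hypothetical helper: takes the closest reading in each angular
    sector so the network sees a fixed-size distance vector,
    regardless of the sensor's native beam count.
    """
    ranges = np.asarray(ranges, dtype=float)
    # Treat NaN/inf readings as "nothing detected" at max range
    ranges = np.nan_to_num(ranges, nan=max_range, posinf=max_range)
    ranges = np.clip(ranges, 0.0, max_range)
    # Split the scan into contiguous sectors and keep the nearest
    # obstacle distance in each one
    sectors = np.array_split(ranges, num_bins)
    return np.array([s.min() for s in sectors])

# Example: a 720-beam scan reduced to a 20-dimensional state input
scan = np.full(720, 10.0)
scan[100] = 1.5  # one close obstacle in the third sector
state = bin_scan(scan)
print(state.shape)  # (20,)
```

The key point is that the network's input size must stay constant, so whatever sensor you use, the scan should be mapped to the same fixed-length vector before being fed to the TD3 state.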