reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in the ROS Gazebo simulator. Using a Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.

Robot keeps circling around #19

Closed Trevor233 closed 2 years ago

Trevor233 commented 2 years ago

Hi Reinis, it's me again. When I run your program, I find that after several training episodes the robot just stays put and keeps circling around:

https://user-images.githubusercontent.com/104433600/172398376-785704f1-d439-463d-a2b1-bb4216a8ecf3.mp4

And the rewards just remain the same. [screenshot: 2022-06-07 21-37-42]

Could you tell me where the problem is, thanks.

reiniscimurs commented 2 years ago

Hi, the links to video and image seem to be from your local directory so I cannot access them.

After how many episodes do you see this behavior? Also, what is the terminal output?

Trevor233 commented 2 years ago

[screenshot: 2022-06-07 21-57-43] The rewards just stay the same and hardly change at all.

reiniscimurs commented 2 years ago

If nothing has changed in the code, you can try changing the random seed for the initialization, or simply run the training again and see if that helps. In this example the training does not seem to be learning anything significant, so you might just want to restart it.
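As a tiny standalone demonstration of why the seed matters for the initialization (this snippet is illustrative only and not from the repo), the same layer starts from different weights under different seeds, so each run begins its descent from a different point in parameter space:

```python
import torch
import torch.nn as nn

# Same architecture, different seeds: the initial weights differ,
# so training starts from a different point in parameter space.
for seed in (0, 1):
    torch.manual_seed(seed)
    layer = nn.Linear(4, 2)
    print(f"seed={seed}: first weights {layer.weight.flatten()[:3].tolist()}")
```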

Trevor233 commented 2 years ago

Hi, I restarted the program several times and nothing changed. The agent doesn't seem to go to the goal, and after several collisions it just stays put.

reiniscimurs commented 2 years ago

Did you try changing the seed value? https://github.com/reiniscimurs/DRL-robot-navigation/blob/be811a4050dfcb5c800a0a0e4000be81d48cfbc5/TD3/velodyne_td3.py#L183
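For reference, changing it looks roughly like this (a minimal sketch; the variable name `seed` and the exact seeding calls are assumptions based on the usual PyTorch pattern, so check the linked line for the repo's actual code):

```python
import random

import numpy as np
import torch

seed = 42  # try a few different values if training gets stuck

# Seed every RNG the training touches: network weight initialization,
# exploration noise, and any random goal/start sampling.
torch.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)  # only needed if Python's `random` module is used anywhere
```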

Trevor233 commented 2 years ago

Hi, thanks for your suggestion. I changed the seed value and the agent can finally learn from the training. But sometimes a situation like this can still happen. It appears that the agent doesn't know to go to the goal. Would you mind telling me how this happens? [screenshot: 2022-06-10 09-20-24]

reiniscimurs commented 2 years ago

In any optimization problem, the starting point of the optimization algorithm and the step size are very important factors in its success. Something similar is most likely happening here: the random initialization of the network and its weights lands in a local optimum, and with the set learning rate and the information at hand, learning cannot escape it. So you either need to find a seed value that initializes the network in a way that works out well, or change the learning rate, use some sort of bootstrapping, change the reward function, or try some other method. Essentially, perform hyperparameter tuning.
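As an illustration of that kind of tuning (a hypothetical sketch: `run_training` is a stand-in for the training loop in velodyne_td3.py, not a function from the repo), a small sweep over seeds and learning rates could look like this:

```python
import itertools
import random

import numpy as np
import torch

def run_training(seed: int, lr: float) -> float:
    # Placeholder for the real training loop: here it only seeds the RNGs
    # and returns a dummy score. In practice it would build the networks,
    # pass `lr` to the optimizers, train, and return e.g. the average
    # evaluation reward.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    return float(np.random.rand())

# Small sweep: try a few seeds and learning rates, keep the best run.
best = None
for seed, lr in itertools.product([0, 7, 42], [1e-3, 3e-4]):
    score = run_training(seed, lr)
    if best is None or score > best[0]:
        best = (score, seed, lr)
print(f"best score {best[0]:.3f} with seed={best[1]}, lr={best[2]}")
```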

For me, though, the implementation in this repo has worked fairly well and usually trains fine, but that could just be my setup. If you have changed something in the code, that is something to look at as well, to see whether it influences the training.

reiniscimurs commented 2 years ago

Closing due to inactivity

vamsi8106 commented 11 months ago


Please, may I know which seed value should be set? I have tried many values, but the robot is not converging.