reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in the ROS Gazebo simulator. Using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.

How to solve: The program does not converge and the collision rate is always 1 #68

Closed DanDan996 closed 11 months ago

DanDan996 commented 1 year ago

(train_file screenshots attached) After looking through many existing issues and your answers, I changed the seed value in the code (0, 1, 2, 3, etc.), but I invariably end up with training results similar to this:

Average Reward over 10 Evaluation Episodes, Epoch 48: -106.461269, 1.000000

At this point the robot just drives in circles (screenshot attached); even when its starting position changes, it keeps performing circular motions. This does not happen at the very beginning: in the first few epochs the robot's policy looks at least reasonable, although there are many collisions, but the later epochs show these degenerate results. I don't know how else to modify the seed value. Are there any other places I might have missed?
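For reference, here is a minimal sketch of how I am fixing all of the usual RNG sources, assuming a PyTorch training script like the repo's train_velodyne_td3.py; the helper name is hypothetical and the exact variable names in the script may differ:

```python
import random

import numpy as np
import torch

# Hypothetical helper: fix every common RNG source before training starts.
def set_global_seed(seed: int) -> None:
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy (exploration noise, sampling)
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # PyTorch GPU RNGs, if CUDA is used

set_global_seed(0)  # the values I tried: 0, 1, 2, 3, etc.
```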

reiniscimurs commented 1 year ago

Hi,

It is very hard to say what the issue is without more information. Please provide the full terminal log output as well as your system setup, and perhaps an RViz screenshot.

Additionally, take a look at the Gazebo simulation to see whether there are any issues there.
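If it helps, a minimal sanity check (a sketch, not code from this repo) is to confirm that sensor data is actually flowing from the simulation; the /velodyne_points topic name is an assumption, so verify it first with `rostopic list`:

```python
import rospy
from sensor_msgs.msg import PointCloud2

# Print a line for each incoming point cloud; silence here would point to a
# simulation/sensor problem rather than a learning problem.
def callback(msg: PointCloud2) -> None:
    rospy.loginfo("Received point cloud with width %d", msg.width)

rospy.init_node("scan_check")
rospy.Subscriber("/velodyne_points", PointCloud2, callback)
rospy.spin()
```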

DanDan996 commented 1 year ago

Glad to see your reply. I will run the program again and post the new results along with my setup details. This will take some time, so please wait for my reply.
