reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

robot just circling around #13

Closed samwooseTW closed 2 years ago

samwooseTW commented 2 years ago

Hi Mr. Reinis Cimurs.

I have already emailed the same question to the email address I found on the other issue post.

I have been following and enjoying your research.

It is amazing to see that you have trained a robot to explore an unknown environment and create a map.

I want to train my robot to have this ability so that my robot can explore unknown environments without collision.

I believe a good way to start is to try to reproduce the training results you obtained in the https://github.com/reiniscimurs/DRL-robot-navigation repository.

However, I have trained the robot for almost 2 days, 63 evaluations have been done, and the robot still shows weird behavior: it circles around without hitting the wall or obstacles, as if avoiding the worst-case scenario, a collision.

I have run your program in the following environment: Ubuntu 18.04, ROS Melodic, Python 3, CUDA 10.2, GeForce graphics card.

There are a few suspicious things I want to point out.

  1. I am getting a tf error message in RViz, as follows:

"for frame [r1/front_laser]: No transform to fixed frame [base_link] tf error: lookup would require extrapolation into the future. Requested time ~ but the latest data is at time ~ when looking up transform from frame [r1/front_laser] to frame [base_link]"

=> This error message keeps appearing and disappearing.

  2. I don't see the Gazebo environment when I run velodyne_td3.py. => I am not sure if this is normal. I am guessing Gazebo is not shown because the GUI option is set to false.

So my questions are:

Q1. Do I need to fine-tune anything to train the robot successfully? I.e., I want the robot to show the behavior your robot does in your repository.

Q2. How long does it take to get the expected behavior (i.e., no collisions and successful navigation to the given goal)?

Q3. Are there any tricks and tips I should know before running your TD3 program?

Thank you for any help in advance.

Best,

*tf error https://drive.google.com/file/d/1QK8YxdSSVKZgA7mnzIbk1M4pWwS_Kpsz/view?usp=sharing

*circle behavior https://drive.google.com/file/d/1zO5x73s4CcbEEpgZ69MibqZK3I0h_aKW/view?usp=sharing

reiniscimurs commented 2 years ago

After working through this issue, it was found that the TF error for the simulated laser sensor appears because the update rate of the laser sensor data is too high. In that case, a solution is to lower the update rate to a smaller value in the laser xacro file: https://github.com/reiniscimurs/DRL-robot-navigation/blob/be811a4050dfcb5c800a0a0e4000be81d48cfbc5/catkin_ws/src/multi_robot_scenario/xacro/laser/hokuyo.xacro#L64
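For reference, the parameter in question is the sensor's `update_rate`. The fragment below is only a sketch of what the relevant part of a Gazebo laser xacro block typically looks like (the surrounding tags and the value 10 are illustrative, not copied from the repository):

```xml
<sensor type="ray" name="head_hokuyo_sensor">
  <!-- Lowering this value reduces how often the simulated laser
       publishes scans, which can resolve the tf extrapolation error. -->
  <update_rate>10</update_rate>
</sensor>
```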

KhuongDiep911 commented 1 year ago

Hello Reinis Cimurs,

I checked the update_rate parameter and saw that it is already set to 100, which is lower than the value mentioned in your previous comment (1000). I have trained the robot for 127 epochs, but it seems the robot is still circling around. Is there another way to fix it?

Thank you in advance.

reiniscimurs commented 1 year ago

The Hokuyo laser is no longer used for training in this repository. It is only there for visualization in RViz, so its rate does not influence anything anymore. Now we only use the Velodyne Puck for state information in training.

The success of training is somewhat random. Try restarting the training or using a different seed value for the random seeds. Usually you can see whether the network is learning within the first 20 to 40 epochs. If it isn't, restart the training again.
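To illustrate why a different seed can help: the seed determines, among other things, the exploration-noise sequence the agent experiences, and a different sequence can be enough to nudge training out of a degenerate policy such as circling. A minimal NumPy sketch (the function name and values are illustrative, not taken from the repository's code):

```python
import numpy as np

def exploration_noise(seed, steps=5, sigma=0.1):
    """Gaussian action noise of the kind TD3-style exploration adds.

    A different seed yields a different noise sequence, so restarting
    training with a new seed explores the environment differently.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, sigma, size=steps)

noise_a = exploration_noise(seed=0)
noise_b = exploration_noise(seed=42)
```

The same seed always reproduces the same sequence, which is also why rerunning with an unchanged seed tends to reproduce the same failure.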

KhuongDiep911 commented 1 year ago

Thanks for your reply!

I have changed the seed number and also tried adjusting the learning rate of the model. I will let you know when I get a result.

KhuongDiep911 commented 1 year ago

Hi Reinis Cimurs,

I decreased the learning rate to 0.00001 and trained for 89 epochs, and the robot has stopped circling around. I don't know what will happen next, but for now it works.
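Lowering the learning rate by an order of magnitude is a common remedy when training oscillates instead of converging. A toy gradient-descent sketch (illustrative only, not the repository's optimizer) shows the underlying effect on the simple loss f(x) = x², whose gradient is 2x:

```python
def gradient_descent(lr, x0=1.0, steps=100):
    # Minimize f(x) = x^2 with plain gradient descent, f'(x) = 2x.
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

# Each step multiplies x by (1 - 2*lr):
#   lr = 1.1  -> factor -1.2, |x| grows every step (divergence).
#   lr = 0.01 -> factor 0.98, x shrinks toward the minimum at 0.
```

The analogy is loose, since deep RL training is far noisier than this quadratic, but the same step-size intuition applies: a smaller learning rate trades speed for stability.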

In addition, I sent an email to your GitHub email address (Reinis.Cimurs@de.Bosch.com) to ask about your paper and for some advice. I hope it finds you well. Please let me know if there are any problems.

Thank you very much for your help!