reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in the ROS Gazebo simulator. Using a Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License
487 stars 97 forks

change gazebo environment and training #62

Closed hzxBuaa closed 1 year ago

hzxBuaa commented 1 year ago

Hello, dear Reinis Cimurs,

First of all, thank you for publishing such great work. I want to use this model as a base for our own custom robot.

I simplified the Gazebo environment by modifying TD3.world, keeping only the surrounding walls and four cardboard boxes. But training does not converge, and the robot just circles in place. Could you help me understand why?

Thank you so much!

reiniscimurs commented 1 year ago

Hi,

I cannot guess at the cause without more information. Please describe exactly what was changed in the code and how you are calling the scripts. Are you also using a different robot model for training?

hzxBuaa commented 1 year ago

Hi,

I only deleted items from the “TD3.world” file.

I removed obstacles such as tables and stools, keeping only the surrounding walls and four random obstacles (four cardboard boxes).

Nothing else was changed, including the robot model.

reiniscimurs commented 1 year ago

Can you provide the full terminal output?

hzxBuaa commented 1 year ago

yes

$ python train_velodyne_td3.py
Roscore launched!
... logging to /home/hzx/.ros/log/2559db34-0f4a-11ee-bf50-bb6f994d1be6/roslaunch-hzx-System-Product-Name-1942727.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

Unable to register with master node [http://localhost:11311]: master may not be running yet. Will keep trying.
started roslaunch server http://localhost:40167/
ros_comm version 1.16.0

SUMMARY

PARAMETERS

NODES

auto-starting new master
process[master]: started with pid [1942737]
ROS_MASTER_URI=http://localhost:11311/

setting /run_id to 2559db34-0f4a-11ee-bf50-bb6f994d1be6
process[rosout-1]: started with pid [1942747]
started core service [/rosout]
Gazebo launched!
... logging to /home/hzx/.ros/log/2559db34-0f4a-11ee-bf50-bb6f994d1be6/roslaunch-hzx-System-Product-Name-1942756.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://localhost:35697/

SUMMARY

PARAMETERS

NODES
  /
    gazebo (gazebo_ros/gzserver)
    gazebo_gui (gazebo_ros/gzclient)
    joint_state_publisher (joint_state_publisher/joint_state_publisher)
    robot_state_publisher (robot_state_publisher/robot_state_publisher)
    rviz (rviz/rviz)
    urdf_spawner (gazebo_ros/spawn_model)

ROS_MASTER_URI=http://localhost:11311/

process[gazebo-1]: started with pid [1942774]
process[gazebo_gui-2]: started with pid [1942778]
process[urdf_spawner-3]: started with pid [1942784]
process[robot_state_publisher-4]: started with pid [1942785]
process[joint_state_publisher-5]: started with pid [1942786]
process[rviz-6]: started with pid [1942787]
[INFO] [1687252163.146121, 0.000000]: Loading model XML from ros parameter robot_description
[ INFO] [1687252163.149604285]: Finished loading Gazebo ROS API Plugin.
[ INFO] [1687252163.150072344]: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting...
[INFO] [1687252163.151152, 0.000000]: Waiting for service /gazebo/spawn_urdf_model
[ INFO] [1687252163.184770754]: Finished loading Gazebo ROS API Plugin.
[ INFO] [1687252163.185148864]: waitForService: Service [/gazebo_gui/set_physics_properties] has not been advertised, waiting...
[ INFO] [1687252163.856449919]: waitForService: Service [/gazebo/set_physics_properties] is now available.
[ INFO] [1687252163.865734237]: Physics dynamic reconfigure ready.
[INFO] [1687252164.054472, 0.000000]: Calling service /gazebo/spawn_urdf_model
[ INFO] [1687252164.827059887, 0.201000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1687252164.827955047, 0.201000000]: Camera Plugin (ns = /) , set to ""
[ INFO] [1687252164.844081643, 0.201000000]: Camera Plugin: The 'robotNamespace' param was empty
[ INFO] [1687252164.844848811, 0.201000000]: Camera Plugin (ns = r1) , set to ""
[ INFO] [1687252164.877592203, 0.201000000]: Laser Plugin: The 'robotNamespace' param was empty
[ INFO] [1687252164.877626964, 0.201000000]: Starting Laser Plugin (ns = r1)
[ INFO] [1687252164.877974550, 0.201000000]: Laser Plugin (ns = r1) , set to ""
[ INFO] [1687252165.479247355, 0.201000000]: Velodyne laser plugin missing , defaults to no clipping
[INFO] [1687252165.479398, 0.201000]: Spawn status: SpawnModel: Successfully spawned entity
[ INFO] [1687252165.480236128, 0.201000000]: Velodyne laser plugin ready, 16 lasers
[ INFO] [1687252165.583246512, 0.201000000]: Starting plugin DiffDrive(ns = r1/)
[ INFO] [1687252165.583296540, 0.201000000]: DiffDrive(ns = r1/): = Debug
[ INFO] [1687252165.583432442, 0.201000000]: DiffDrive(ns = r1/): =
[DEBUG] [1687252165.583453790, 0.201000000]: DiffDrive(ns = r1/): = cmd_vel
[DEBUG] [1687252165.583460632, 0.201000000]: DiffDrive(ns = r1/): = odom
[DEBUG] [1687252165.583468916, 0.201000000]: DiffDrive(ns = r1/): = odom
[DEBUG] [1687252165.583474517, 0.201000000]: DiffDrive(ns = r1/): = base_link
[DEBUG] [1687252165.583492367, 0.201000000]: DiffDrive(ns = r1/): = false
[ WARN] [1687252165.583501721, 0.201000000]: DiffDrive(ns = r1/): missing default is true
[DEBUG] [1687252165.583511232, 0.201000000]: DiffDrive(ns = r1/): = true
[DEBUG] [1687252165.583536444, 0.201000000]: DiffDrive(ns = r1/): = 0.29999999999999999
[DEBUG] [1687252165.583544311, 0.201000000]: DiffDrive(ns = r1/): = 0.17999999999999999
[DEBUG] [1687252165.583552498, 0.201000000]: DiffDrive(ns = r1/): = 1.8
[DEBUG] [1687252165.583559582, 0.201000000]: DiffDrive(ns = r1/): = 20
[DEBUG] [1687252165.583568714, 0.201000000]: DiffDrive(ns = r1/): = 50
[DEBUG] [1687252165.583591853, 0.201000000]: DiffDrive(ns = r1/): = world := 1
[DEBUG] [1687252165.583606844, 0.201000000]: DiffDrive(ns = r1/): = left_hub_joint
[DEBUG] [1687252165.583614564, 0.201000000]: DiffDrive(ns = r1/): = right_hub_joint
[ WARN] [1687252165.583628249, 0.201000000]: GazeboRosDiffDrive Plugin (ns = ) missing , defaults to 1
[ INFO] [1687252165.583819721, 0.201000000]: DiffDrive(ns = r1/): Advertise joint_states
[ INFO] [1687252165.584040658, 0.201000000]: DiffDrive(ns = r1/): Try to subscribe to cmd_vel
[ INFO] [1687252165.584594780, 0.201000000]: DiffDrive(ns = r1/): Subscribe to cmd_vel
[ INFO] [1687252165.584713240, 0.201000000]: DiffDrive(ns = r1/): Advertise odom on odom
[ INFO] [1687252165.587231422, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: chassis_swivel_joint
[ INFO] [1687252165.587241526, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: swivel_wheel_joint
[ INFO] [1687252165.587250037, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: left_hub_joint
[ INFO] [1687252165.587259284, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: right_hub_joint
[ INFO] [1687252165.587267221, 0.201000000]: Starting GazeboRosJointStatePublisher Plugin (ns = r1/)!, parent name: r1
[DEBUG] [1687252165.592487958, 0.206000000]: Trying to publish message of type [sensor_msgs/CameraInfo/c9a58c1b0b154e0e6da7578cb991d214] on a publisher with type [sensor_msgs/CameraInfo/c9a58c1b0b154e0e6da7578cb991d214]
[DEBUG] [1687252165.597151890, 0.211000000]: Trying to publish message of type [sensor_msgs/LaserScan/90c7ef2dc6895d81024acba2ac42f369] on a publisher with type [sensor_msgs/LaserScan/90c7ef2dc6895d81024acba2ac42f369]
[DEBUG] [1687252165.611553998, 0.222000000]: Trying to publish message of type [nav_msgs/Odometry/cd5e73d190d741a2f92e81eda573aca7] on a publisher with type [nav_msgs/Odometry/cd5e73d190d741a2f92e81eda573aca7]
[DEBUG] [1687252165.611583253, 0.222000000]: Trying to publish message of type [sensor_msgs/JointState/3066dcd76a6cfaef579bd0f34173e9fd] on a publisher with type [sensor_msgs/JointState/3066dcd76a6cfaef579bd0f34173e9fd]
[urdf_spawner-3] process has finished cleanly
log file: /home/hzx/.ros/log/2559db34-0f4a-11ee-bf50-bb6f994d1be6/urdf_spawner-3*.log
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 1: -253.386186, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 2: -256.596358, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 3: -251.217168, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 4: -253.167011, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 5: -254.447124, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 6: -252.074390, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 7: -250.687688, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 8: -255.268713, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 9: -251.536725, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 10: -252.257719, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 11: -250.500000, 0.000000
..............................................

hzxBuaa commented 1 year ago

Average Reward over 10 Evaluation Episodes, Epoch 12: -255.689875, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 13: -256.942230, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 14: -258.425072, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 15: -260.090454, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 16: -256.700221, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 17: -252.317310, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 18: -254.078768, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 19: -252.697328, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 20: -254.145063, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 21: -259.087948, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 22: -253.587934, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 23: -252.160117, 0.000000
..............................................
Validating
..............................................
Average Reward over 10 Evaluation Episodes, Epoch 24: -252.280001, 0.000000
..............................................

reiniscimurs commented 1 year ago

I do not see any errors in the output log, so the map should be loading fine. Do you see your changes in the Gazebo simulator, and does the world look the way you expect?
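
For anyone who wants to check whether the evaluation reward in a log like the one above is moving at all, a small parser can help. This is only a sketch: the function names `parse_eval_log` and `reward_span` are made up for illustration, the pattern is based solely on the "Average Reward over ... Evaluation Episodes" lines pasted in this thread, and the meaning of the second number on each line is an assumption (it may be the average collision rate).

```python
import re

# Matches lines such as
# "Average Reward over 10 Evaluation Episodes, Epoch 3: -251.217168, 0.000000"
EVAL_RE = re.compile(
    r"Average Reward over (\d+) Evaluation Episodes, "
    r"Epoch (\d+): (-?\d+\.\d+), (-?\d+\.\d+)"
)


def parse_eval_log(text):
    """Return a list of (epoch, avg_reward, second_value) tuples.

    second_value is the trailing number on each line; its meaning is an
    assumption here (possibly the average collision rate).
    """
    return [
        (int(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in EVAL_RE.finditer(text)
    ]


def reward_span(rows):
    """Return (lowest, highest) average reward across the parsed epochs."""
    rewards = [reward for _, reward, _ in rows]
    return min(rewards), max(rewards)


# Three lines copied from the log in this thread.
sample = (
    "Average Reward over 10 Evaluation Episodes, Epoch 1: -253.386186, 0.000000 "
    "Average Reward over 10 Evaluation Episodes, Epoch 11: -250.500000, 0.000000 "
    "Average Reward over 10 Evaluation Episodes, Epoch 15: -260.090454, 0.000000"
)
rows = parse_eval_log(sample)
low, high = reward_span(rows)
print(f"{len(rows)} epochs parsed, reward between {low:.2f} and {high:.2f}")
```

Run over the full log, this shows the reward staying between roughly -260 and -250 for all 24 epochs with no upward trend, which matches the report that training is not converging.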

One thing to try is changing the random seed and training the network from a different random initialization. The seed is set here: https://github.com/reiniscimurs/DRL-robot-navigation/blob/943186fb7f1890700ce215951e92d5cb92031d14/TD3/train_velodyne_td3.py#L221
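
As a rough illustration of what changing the seed does, the sketch below seeds the random number generators a training script would typically rely on. It is not the code from the linked file: `set_seed` is a hypothetical helper, and seeding NumPy and PyTorch is an assumption about which libraries the script uses, so both are treated as optional here.

```python
import random


def set_seed(seed):
    """Seed Python's RNG, plus NumPy and PyTorch if they are installed.

    Hypothetical helper for illustration; which libraries actually need
    seeding in the training script is an assumption.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


# The same seed reproduces the same random draws; a different seed does not.
set_seed(1)
first = random.random()
set_seed(1)
repeat = random.random()
set_seed(2)
other = random.random()
print(first == repeat, first == other)
```

A different seed gives different initial network weights and a different exploration sequence, so a run that got stuck in a poor policy (such as spinning in place) may behave differently on the next attempt.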

hzxBuaa commented 1 year ago

ok, thank you very much!