reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

Problems with visualization after switching robot #51

Closed PinkViolet closed 1 year ago

PinkViolet commented 1 year ago

Hi Reiniscimurs,

Thanks for your great project again. After I successfully trained the model, I started switching to a different robot in the simulation. Some problems occurred during the modification. Please help me, and thanks in advance.

Here are the problems I ran into with my setup:

  1. After switching the robot, my robot frame stays fixed at the center while other frames such as /scan and /velodyne_points move around it. I tried switching the topic set under Global Options - Fixed Frame in RViz, but none of them behave as expected. How can I set up a fixed global coordinate frame to solve this problem? (screenshot attached)

  2. I wish to enable the Gazebo GUI. How should I do it? I tried changing this line to true, but the GUI does not show up.

  3. The new robot I'm using is bigger than the pioneer3dx, so it initially collided with objects. I then changed COLLISION_DIST to 0.9. It appears that the robot now resets or teleports more often, while it is still able to collide with objects. Do you have any suggestions for this problem?

Here are some materials that might help your analysis:

reiniscimurs commented 1 year ago

Hi,

Thanks for the extensive information with the rqt tree and graph. It is appreciated.

  1. What does the robot's differential drive plugin call look like? (In this repo it is located here: https://github.com/reiniscimurs/DRL-robot-navigation/blob/943186fb7f1890700ce215951e92d5cb92031d14/catkin_ws/src/multi_robot_scenario/xacro/p3dx/pioneer3dx_plugins.xacro#L18.) You might have an argument <publishOdomTF>true</publishOdomTF> that you would want to try turning off or on.
  2. That would not be the right argument to set for the GUI. Take a look at the two methods here: https://medium.com/@reinis_86651/deep-reinforcement-learning-in-mobile-robot-navigation-tutorial-part5-some-extra-stuff-b744852345ac
  3. It might be that with the increased collision distance, a collision is detected as soon as the robot is spawned next to an obstacle. In this repo, I created a "dead zone" around the static obstacles manually, and it is static. You might want to increase this "dead zone" (or better yet, refactor this part of the code into something more sensible) so that the robot spawns at least 0.9 m away from any obstacle; a rough sketch of the idea follows this list. The code in question: https://github.com/reiniscimurs/DRL-robot-navigation/blob/943186fb7f1890700ce215951e92d5cb92031d14/TD3/velodyne_env.py#L26
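
The idea is roughly the following sketch (this is not the exact code in velodyne_env.py; the obstacle rectangles and the margin value are illustrative):

```python
import random

# Illustrative "dead zone" check: axis-aligned rectangles around the static
# obstacles, padded by a margin so the robot never spawns closer than the
# collision threshold. Rectangle coordinates here are made up for the example.
OBSTACLE_BOXES = [
    (-3.8, -2.3, 2.3, 4.2),   # (x_min, x_max, y_min, y_max)
    (1.0, 2.5, -4.5, -3.0),
]
SPAWN_MARGIN = 0.9  # should be at least COLLISION_DIST

def position_is_free(x, y, margin=SPAWN_MARGIN):
    """Return True if (x, y) is at least `margin` away from every obstacle box."""
    for x_min, x_max, y_min, y_max in OBSTACLE_BOXES:
        if (x_min - margin) < x < (x_max + margin) and (y_min - margin) < y < (y_max + margin):
            return False
    return True

def sample_spawn_position(bound=4.5):
    """Rejection-sample a spawn point inside the arena until it is collision-free."""
    while True:
        x = random.uniform(-bound, bound)
        y = random.uniform(-bound, bound)
        if position_is_free(x, y):
            return x, y
```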
PinkViolet commented 1 year ago

Hi Reiniscimurs,

Thanks for your explanation.

1) What does the robot differential drive plugin call look like? A1: The team that created the robot model I'm using customized the drive plugin. If you are looking for /odom, they publish the robot state on /chs_odom instead. In this case, should I create a static_tf_tree for /chs_odom so that I can fix the RViz visualization?

Replies to 2) and 3): Thanks for the resources. I managed to resolve these problems. It turned out that the Velodyne plugin cannot detect objects that are close to the robot in my scenario, which is why the robot tended to collide with them. I solved this by using a LaserScan instead, roughly as in the sketch below.
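
Here is a minimal sketch of what I do now (the topic name and threshold are placeholders for my setup, not code from this repo):

```python
import rospy
import numpy as np
from sensor_msgs.msg import LaserScan

COLLISION_DIST = 0.9  # same threshold as before, now checked against the 2D scan

class ScanCollisionChecker:
    """Keeps the latest minimum laser range and flags a collision when it
    drops below COLLISION_DIST."""

    def __init__(self, topic="/scan"):
        self.min_range = np.inf
        rospy.Subscriber(topic, LaserScan, self.scan_callback, queue_size=1)

    def scan_callback(self, msg):
        ranges = np.array(msg.ranges)
        ranges = ranges[np.isfinite(ranges)]  # drop inf/NaN returns
        self.min_range = ranges.min() if ranges.size else np.inf

    def collided(self):
        return self.min_range < COLLISION_DIST

if __name__ == "__main__":
    rospy.init_node("scan_collision_checker")
    checker = ScanCollisionChecker()
    rospy.spin()
```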

Additional questions:

  1. How do I increase the training speed, other than increasing ? Will increasing the batch_size of the ReplayBuffer work?

Thanks in advance.

reiniscimurs commented 1 year ago

I mostly meant that the robot uses some driver plugin, and it is worth checking whether it publishes the odom TF or something similar. It might also be that some frame names do not match in the TF tree if the published names are different.
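
If it is only the frame names that do not line up, one option is to publish a static transform between the frame RViz expects and the frame your driver actually uses. A minimal sketch with tf2_ros (the frame names here are assumptions based on your /chs_odom topic; use whatever rqt_tf_tree actually shows):

```python
import rospy
import tf2_ros
from geometry_msgs.msg import TransformStamped

# Minimal static-transform bridge: publishes an identity transform from the
# frame the rest of the TF tree expects ("odom") to the frame published by the
# custom driver ("chs_odom"). Frame names are assumptions for this example.
rospy.init_node("odom_frame_bridge")

broadcaster = tf2_ros.StaticTransformBroadcaster()

t = TransformStamped()
t.header.stamp = rospy.Time.now()
t.header.frame_id = "odom"       # expected parent frame
t.child_frame_id = "chs_odom"    # frame from the custom driver
t.transform.rotation.w = 1.0     # identity rotation, zero translation

broadcaster.sendTransform(t)
rospy.spin()
```

The same thing can also be done with the static_transform_publisher node from tf2_ros, if you prefer a launch-file solution.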

As for training speed: sure, if your resources allow it, larger batch sizes might help. Other options are to use a different DRL algorithm such as PPO, to filter the replay buffer, or to use some sort of curriculum training to speed up the process.
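
For instance, sampling larger batches from the replay buffer is only a one-line change if your training loop looks roughly like the sketch below (the identifiers are generic, not the exact names used in this repo's training script):

```python
# Generic TD3-style update loop; `agent` and `replay_buffer` are placeholders.
BATCH_SIZE = 256          # e.g. increased from a smaller default
UPDATES_PER_EPISODE = 100

def train_after_episode(agent, replay_buffer):
    for _ in range(UPDATES_PER_EPISODE):
        # A larger batch gives lower-variance gradient estimates per update,
        # at the cost of more GPU memory and compute per step.
        batch = replay_buffer.sample(BATCH_SIZE)
        agent.update(batch)
```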