reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

Simulation problem #1

Closed: hjj-666 closed this issue 2 years ago

hjj-666 commented 2 years ago

When I run the python velodyne_td3.py command, the following error appears

hjj-666 commented 2 years ago

[libprotobuf FATAL google/protobuf/stubs/common.cc:61] This program requires version 3.14.0 of the Protocol Buffer runtime library, but the installed version is 2.6.1. Please update your library. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "google/protobuf/any.pb.cc".)
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): This program requires version 3.14.0 of the Protocol Buffer runtime library, but the installed version is 2.6.1. Please update your library. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "google/protobuf/any.pb.cc".)
Aborted (core dumped)
[gazebo-1] process has died [pid 31116, exit code 134, cmd /home/user/gym-gazebo/gym_gazebo/envs/installation/catkin_ws/src/gazebo_ros_pkgs/gazebo_ros/scripts/gzserver -e ode TD3-2.world name:=gazebo log:=/home/user/.ros/log/62fe2f2e-3663-11ec-9092-e0e1a951fbad/gazebo-1.log].
log file: /home/user/.ros/log/62fe2f2e-3663-11ec-9092-e0e1a951fbad/gazebo-1.log
[r1/urdf_spawner-2] process has finished cleanly
log file: /home/user/.ros/log/62fe2f2e-3663-11ec-9092-e0e1a951fbad/r1-urdf_spawner-2.log
[gazebo-1] restarting process
process[gazebo-1]: started with pid [31658]
[ INFO] [1635256108.664795970]: Finished loading Gazebo ROS API Plugin.
[ INFO] [1635256108.665881041]: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting...
[ INFO] [1635256109.568563987]: waitForService: Service [/gazebo/set_physics_properties] is now available.
[ INFO] [1635256109.612805600, 1210.735000000]: Physics dynamic reconfigure ready.
[ERROR] [1635256109.823214551, 1210.939000000]: Updating ModelState: model [r1] does not exist

hjj-666 commented 2 years ago

Have you ever encountered a similar error, and do you know how to solve it?

reiniscimurs commented 2 years ago

It seems to me that you have an older version of protobuf. Try updating or re-installing it. I haven't encountered that issue before, but as I said, my system is already set up to support this implementation, so I don't know which dependencies are missing on a fresh install. I will try to test it out properly when I have more time and see if I encounter this issue.

hjj-666 commented 2 years ago

Thank you very much. I will try again and see if I can solve this problem.

hjj-666 commented 2 years ago

Now velodyne_td3.py runs without reporting any errors, but when I try to use gzclient to view the Gazebo environment, an error is reported. Do you have a solution for this?

reiniscimurs commented 2 years ago

What version of Gazebo and ROS are you using? Is the simulated camera image showing up in RViz? Check if it works without the simulated RGB camera: go to multi_robot_scenario/xacro/p3dx/pioneer3dx.xacro and comment out these lines for the camera plugin:

<xacro:include filename="$(find multi_robot_scenario)/xacro/camera/camera.xacro"/>
<xacro:cam name="front_camera" parent="chassis" xyz="0.025 0 0.32" rpy="0 0 0" />

<xacro:include filename="$(find multi_robot_scenario)/xacro/camera/cameraD.xacro"/>
<xacro:camD name="cameraD" parent="chassis" xyz="0.025 0 0.32" rpy="0 0 0" />

Additionally, you can start the GUI client by going into multi_robot_scenario/launch/empty_world.launch and changing the "gui" argument to "true".

hjj-666 commented 2 years ago

I am using Gazebo 7 and ROS Kinetic, and the camera image can be displayed in RViz. I have now changed the "gui" argument to "true" and found that the Gazebo environment displays normally. Thank you very much for your help.

hjj-666 commented 2 years ago

Does it really take you 8 hours to train for 800 episodes? The server I am using is configured similarly to the one mentioned in your paper: an NVIDIA GTX 1080 Ti graphics card, 32 GB of RAM, and an Intel Core i7-6850K CPU. But I found that after training for about 3 hours I had only completed 10 episodes.

reiniscimurs commented 2 years ago

By 10 episodes do you mean 10 validation runs? If so, those would be epochs rather than episodes. One episode is counted from spawning the robot until it reaches the goal or crashes into an obstacle.
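To illustrate that definition, here is a minimal sketch of an episode loop. It is not the repository's velodyne_td3.py, and the learned TD3 policy is replaced by a random action, but it uses the same GazeboEnv interface and action mapping that appear later in this thread:

```python
# Sketch only: how episodes are counted (spawn -> goal or collision), with the
# learned policy replaced by a random action for brevity.
import numpy as np
from velodyne_env import GazeboEnv

env = GazeboEnv('multi_robot_scenario.launch', 1, 1, 1)

episodes = 0
while episodes < 800:                                   # roughly the episode count used for the paper
    env.reset()                                         # a new episode starts when the robot is (re)spawned
    done = False
    while not done:
        action = np.random.uniform(-1, 1, size=2)       # placeholder for the TD3 policy output
        a_in = [(action[0] + 1) / 2, action[1]]         # same action mapping as in the snippet below
        next_state, reward, done, target = env.step(a_in)
    episodes += 1                                       # done => goal reached or collision, one episode finished
```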

hjj-666 commented 2 years ago

Thank you for your reply. Then I must have misunderstood what an episode means in your program.

hjj-666 commented 2 years ago

May I ask: after training up to around episode 240, which is close to the episode 240 shown in your README, my collision rate is still about 0.3 rather than close to 0 as in your picture. Is this normal?

reiniscimurs commented 2 years ago

There is randomness in the training process, so it might not always obtain the same result after the same number of runs. You might want to either restart the training process or change the random seed used for initialization, and see if that helps.
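For reference, here is a minimal sketch of what changing the random seed usually involves in a PyTorch setup like this one (the variable name and value are placeholders, not the repository's exact code):

```python
# Sketch only: fix the seeds of the random number generators that affect initialization.
import random

import numpy as np
import torch

seed = 42  # try a different value to change the random initialization
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)  # only matters when training on a GPU
```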

hjj-666 commented 2 years ago

Thank you, I will try modifying the random seed later. In addition, I have another question: how many episodes did you run before using the training results on the actual mobile robot?

reiniscimurs commented 2 years ago

For the implementation in the paper, I ran the training for about 800 episodes. Validation is just testing how well the network performs at any given time; in this implementation it is only information for the programmer and has no effect on the training itself. So when I ran the training for the method in the paper, there were no validation sets, just pure training for about 800 episodes. In practice, however, there is no set number of episodes that you have to run before you can transfer the network to a real robot, as the convergence rate and how well the network performs depend on multiple factors: random initialization, learning rate, reward function, number of parameters, replay buffer length, and others. So I would suggest just training the network, looking at the data (have the Q-values converged?) and at how well the robot performs visually, and deciding when to stop the training based on that. I have run networks with 400, 500, 800, 1600, and various other numbers of training episodes on a real robot, so there is no set number that you have to reach.

hjj-666 commented 2 years ago

Thank you for your reply. Next, I will try training for a different number of steps to suit my own robot.

AgentEXPL commented 2 years ago

For the implementation in the paper, I ran the training for about 800 episodes. [...] So I would suggest just training the network, looking at the data (have the Q-values converged?) and at how well the robot performs visually, and deciding when to stop the training based on that.

Hi, are there any good ways to check whether the Q-values, which here relate to continuous action values, have converged? Through the average Q-value, or some other metric?

reiniscimurs commented 2 years ago

Generally speaking, the loss should be a good indicator that the Q-values have converged, meaning that the values we get from the critic network are no longer any different from those of the target network, which suggests they do not change anymore. Another way is to look at the maximum Q-value experienced during any validation run. Once the max Q stops growing, we have an indication that the Q-values have converged, as the same or similarly high-valued state (the terminal state of arriving at the goal) is consistently evaluated with the same value.

I probably will not have the time to implement and test this through tensorboard any time soon, but contributions are always welcome.
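For anyone who wants to attempt that, here is a minimal hedged sketch (placeholder values and assumed tag names, not code from this repository) of logging those signals with TensorBoard so Q-value convergence can be inspected over training:

```python
# Sketch only: log critic loss and the average/max Q seen during validation,
# then inspect the curves with: tensorboard --logdir runs
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/td3_navigation")

for epoch in range(100):                         # stands in for the real training loop
    critic_loss = 1.0 / (epoch + 1)              # placeholder: would come from the critic update
    avg_q = 50.0 * (1 - 1.0 / (epoch + 1))       # placeholder: mean Q over a validation run
    max_q = 100.0 * (1 - 1.0 / (epoch + 1))      # placeholder: max Q over a validation run
    writer.add_scalar("train/critic_loss", critic_loss, epoch)
    writer.add_scalar("eval/avg_q", avg_q, epoch)
    writer.add_scalar("eval/max_q", max_q, epoch)

writer.close()
```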

AgentEXPL commented 2 years ago

Thanks a lot for your excellent suggestions. Next, I will try to implement them with TensorBoard. If it works, I will come back to share the code.

nfhe commented 2 years ago

Hi, thank you for your code. I have a problem: do you use Anaconda to create a new virtual environment? I used Anaconda to create a new Python 3.6 virtual environment and a Python IDE, but I can't find rospkg in this environment. Have you ever encountered a similar error, and how can it be solved? Thank you.

nfhe commented 2 years ago

[libprotobuf FATAL google/protobuf/stubs/common.cc:61] This program requires version 3.14.0 of the Protocol Buffer runtime library, but the installed version is 2.6.1. [...]

Hi, can I add you on WeChat?

reiniscimurs commented 2 years ago

Do you use Anaconda to create a new virtual environment? [...] I can't find rospkg in this environment.

Check if your ROS is installed properly, initialized, and sourced in the new anaconda virtual environment. If you can't run roscore in your environment, then it is most likely an issue with your installation.
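A quick way to check this from inside the activated conda environment is sketched below (it assumes the workspace containing the multi_robot_scenario package has been built and sourced):

```python
# Run inside the activated conda environment. If the import fails, install the
# ROS Python helpers there (pip install rospkg catkin_pkg) and make sure the ROS
# and catkin workspace setup.bash files are sourced before starting Python.
import rospkg

rospack = rospkg.RosPack()
print(rospack.get_path('multi_robot_scenario'))  # prints the package path when ROS is visible
```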

nfhe commented 2 years ago

Thank you for your reply.

threetwoone-w commented 2 years ago

When I run the python velodyne_td3.py command, the following error appears: QObject::connect: Cannot queue arguments of type ‘QVector’ (Make sure ‘QVector’ is registered using qRegisterMetaType().)

reiniscimurs commented 2 years ago

That is not an error that I have encountered.

There is not enough information to go on. Please provide your setup and how you are trying to run the program. Ideally you should open a new issue for such a case.

DevAB-Git commented 2 years ago

Hi, thank you for sharing your code. I'm new to ROS/Gazebo; I have installed ROS Noetic and played with some tutorials. We are aiming to develop path-planning algorithms inspired by hemispheric lateralization, and I found your work interesting. Currently, I'm trying to replicate the path planning using your code. To keep things simple, I have just loaded the Gazebo environment. I observe that the robot does not move in RViz, although it moves in Gazebo. The robot model, camera, and laser scan all show a status error, e.g. "No transform from [r1/base_link] to [base_link]". Thanks in advance.

My file velodyne_ab.py looks like this:

#############################################
import os
import time

import numpy as np

from velodyne_env import GazeboEnv

# variables initialization (placeholder values here; the original values were omitted)
expl_noise = 1.0   # std of the exploration noise
action_dim = 2     # [linear velocity, angular velocity]
max_action = 1.0

env = GazeboEnv('multi_robot_scenario.launch', 1, 1, 1)

for i in range(10000):
    print("AB Step")
    if i % 100 == 0 and i > 0:
        env.reset()

    action = (1 + np.random.normal(0, expl_noise, size=action_dim)).clip(-max_action, max_action)
    a_in = [(action[0] + 1) / 2, action[1]]
    next_state, reward, done, target = env.step(a_in)