reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

Problem with training #97

Closed (Ethan0207 closed 5 months ago)

Ethan0207 commented 9 months ago

Hello. I have been trying to solve my problem for several days, but I still get errors. They are similar to the ones in the issue "Problem with the starting training" raised by another user below. When I try to run the training without exporting these commands:

export ROS_HOSTNAME=localhost
export ROS_MASTER_URI=http://localhost:11311/
export ROS_PORT_SIM=11311
export GAZEBO_RESOURCE_PATH=~/DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/launch
source ~/.bashrc
cd ~/DRL-robot-navigation/catkin_ws
source devel_isolated/setup.bash

my terminal shows roscore at the top and nothing happens.

(screenshot: Ubuntu12 28 1)

If I do export them, my terminal still shows roscore at the top and nothing happens.

(screenshot: Ubuntu12 28 2)

So I am stuck at this point and cannot train the agent. I would be very glad if you could help me with this problem. Thanks in advance.

Additionally, this is my .bashrc file:

(screenshot: Ubuntu12 28 3)

reiniscimurs commented 9 months ago

Hi,

Is this the first time starting the training? You might have missing gazebo models that the simulator is trying to load. What could be happening is that gazebo tries to load the world file and the models in it, but you do not have all the models available locally, so gazebo will download them. This takes some time and there are no indicators for this.

You could try to open the world file directly in gazebo. Alternatively, just start training and wait for some 20 to 30 minutes (it will download the models in the background), then see if the execution has started.
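One way to check the missing-models theory without waiting is to compare the models a world file references against what is already on disk. The following is only a sketch: it assumes the world file references models as `model://<name>`, that Gazebo classic caches downloaded models under `~/.gazebo/models`, and that extra model directories are listed in `GAZEBO_MODEL_PATH`. The helper names `missing_models` and `default_model_dirs` are made up for this illustration and are not part of the repository.

```python
import os
import re
from pathlib import Path


def default_model_dirs():
    """Directories where Gazebo classic is assumed to look for models."""
    # Assumption: downloaded models are cached in ~/.gazebo/models.
    dirs = [Path.home() / ".gazebo" / "models"]
    # Assumption: additional model directories come from GAZEBO_MODEL_PATH.
    extra = os.environ.get("GAZEBO_MODEL_PATH", "")
    dirs += [Path(p) for p in extra.split(os.pathsep) if p]
    return dirs


def missing_models(world_file, model_dirs):
    """Return the sorted model names referenced in world_file as
    model://<name> that have no directory in any of model_dirs."""
    text = Path(world_file).read_text()
    # Capture only the model name, ignoring any sub-path such as /meshes/x.dae.
    names = sorted(set(re.findall(r"model://([^/<\s\"']+)", text)))
    return [
        name
        for name in names
        if not any((Path(d) / name).is_dir() for d in model_dirs)
    ]
```

Calling `missing_models(<path to your world file>, default_model_dirs())` returns the names that would still have to be downloaded. An empty list suggests that missing models are probably not what is blocking the start.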

If this does not help, let me know.

Ethan0207 commented 9 months ago

Hi, thank you for your reply. Today I started the training and waited for 20 to 30 minutes, but it did not start. Then I tried to open the world file directly in Gazebo.

(screenshots: 12 29 2, 12 29 3, 12 29 1)

Additionally, this is my TD_world.launch file:

(screenshot: 12 29 4)

But it still did not start.

(screenshot: 12 29 6)

I also found another problem. When I run "roslaunch multi_robot_scenario pioneer3dx.gazebo.launch", it just stops here:

(screenshot: 12 29 5)

I have to run "conda deactivate" first, and then the p3dx appears. Do you think that is a contributing factor?

oycool commented 9 months ago

The previously running node was not completely terminated. To kill the training process:

killall -9 rosout roslaunch rosmaster gzserver nodelet robot_state_publisher gzclient python python3
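Before starting a new run, it can help to confirm that nothing from the previous one is still alive. The sketch below assumes a Linux system where each process exposes its name in `/proc/<pid>/comm`, truncated by the kernel to 15 characters. The list of names is taken from the `killall` command above, minus `python` and `python3`, since the checking script itself runs under Python. The function names are hypothetical.

```python
from pathlib import Path

# Process names from the killall command above (python/python3 omitted,
# because this checker is itself a Python process).
ROS_PROCESS_NAMES = [
    "rosout", "roslaunch", "rosmaster", "gzserver",
    "nodelet", "robot_state_publisher", "gzclient",
]


def running_process_names(proc_root="/proc"):
    """Collect the names of all running processes from <proc_root>/<pid>/comm."""
    names = set()
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            names.add(comm.read_text().strip())
        except OSError:
            continue  # the process exited while we were scanning
    return names


def leftover_processes(wanted=ROS_PROCESS_NAMES, running=None):
    """Return the entries of `wanted` that appear to be still running."""
    if running is None:
        running = running_process_names()
    # The kernel truncates comm to 15 characters, so compare truncated names.
    return [name for name in wanted if name[:15] in running]
```

If `leftover_processes()` returns a non-empty list, the previous run was probably not fully terminated and the `killall` command above is worth running before the next attempt.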

reiniscimurs commented 9 months ago

Hi,

As @cd310105974 pointed out, the error message in one of the images is most likely there because you did not properly kill your previous run.

For the main issue, though, I would suggest checking whether the ROS locations are sourced properly inside your conda env as well, especially if you need to use "localhost". I can see that your Gazebo simulator is not starting when it is called from the training script, and the line subprocess.Popen(["roslaunch", "-p", port, fullpath]) does not execute for you.
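A quick way to test this is to run a small check with the same Python interpreter (for example, inside the conda env) that runs the training script. The sketch below is an illustration, not part of the repository: `check_ros_environment` is a hypothetical helper, and the three checks (`roslaunch` on `PATH`, `ROS_MASTER_URI` set, `rospy` importable) are an assumed minimum rather than a complete diagnosis.

```python
import importlib.util
import os
import shutil


def check_ros_environment(env=None, which=shutil.which,
                          find_spec=importlib.util.find_spec):
    """Return a list of human-readable problems found in the current
    environment; an empty list means none of the checks failed."""
    env = os.environ if env is None else env
    problems = []
    # subprocess.Popen(["roslaunch", ...]) needs the executable on PATH.
    if which("roslaunch") is None:
        problems.append("roslaunch is not on PATH")
    # ROS nodes need to know where the master is.
    if not env.get("ROS_MASTER_URI"):
        problems.append("ROS_MASTER_URI is not set")
    # The training script needs ROS Python packages in this interpreter.
    if find_spec("rospy") is None:
        problems.append("rospy is not importable by this Python interpreter")
    return problems


if __name__ == "__main__":
    for problem in check_ros_environment():
        print("Problem:", problem)
```

If the checks pass in a plain shell but fail once the conda env is activated, that points at the virtual environment rather than at the repository itself.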

reiniscimurs commented 9 months ago

See also the solution here: https://github.com/reiniscimurs/DRL-robot-navigation/issues/83

Ethan0207 commented 9 months ago

Hi, thank you for your reply. First, I killed my previous run, but it still failed. Then I deleted the conda-related commands from my .bashrc file and redid the steps. ROS still does not start, which is strange.

(screenshot: 1 2-1)

reiniscimurs commented 9 months ago

I'd suggest trying to set up the training without a virtual env to see whether this is specifically a virtual env issue. There seems to be some issue with setting up and running the repo in a virtual env. I won't be able to set that up and test it anytime soon, though.

Ethan0207 commented 9 months ago

Thank you for your reply. I have solved the problem: I uninstalled Anaconda, then installed PyTorch and TensorBoard, and now it works. I guess the ROS and Anaconda versions clashed.

(screenshot: 1 3)