reiniscimurs / GDAE

A goal-driven autonomous exploration through deep reinforcement learning (ICRA 2022) system that combines reactive and planned robot navigation in unknown environments

Problem with simulation #10

Open 525753936 opened 1 year ago

525753936 commented 1 year ago

Hi, thanks for sharing your marvelous work! I have run into some issues with the simulation using GDAE, because I don't have a physical robot.

I have trained a model with your other work, DRL-robot-navigation, changed the corresponding code in https://github.com/reiniscimurs/GDAE/blob/fc793eda8de23bed98ba3acd32908843c535510f/Code/GDAM.py, and modified the topics in GDAM_env.py so they align with my own robot's topics, as shown.

Meanwhile, slam_toolbox is used for map construction, but it seems that no goal is published and the robot does not move at all. I checked self.nodes and found it was not empty.

Here are my RViz screenshots and rqt_graph. I also tried to run rostopic echo /global_goal_publisher, but received no response. Could you give me some advice on this problem?

reiniscimurs commented 1 year ago

Hi,

Have you tried visualizing all the nodes within rviz? All node topics are published, so you should be able to visualize them as markers in rviz. Then you can see whether any nodes are being recorded and processed.

What are your log outputs? If you had no goal nodes, or no nodes to select goals from, the program would encounter an indexing error.
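For a quick scripted check without rviz, something along these lines should work — a minimal sketch, assuming the marker topic name vis_mark_array_node that is quoted from GDAM_env.py later in this thread:

```python
# Sketch only (not repo code): wait for a single node marker and report
# whether anything is being published. Raises rospy.ROSException if no
# message arrives within the timeout.
import rospy
from visualization_msgs.msg import Marker

rospy.init_node("marker_probe")
msg = rospy.wait_for_message("vis_mark_array_node", Marker, timeout=10.0)
print("received a node marker with %d points" % len(msg.points))
```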

525753936 commented 1 year ago

Thanks for your patience and time. In terminal 1, I run roslaunch slam_toolbox online_sync.launch; here is the output.

In terminal 2, I run python GDAM.py, and here is the modified code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
from GDAM_env import ImplementEnv
from GDAM_args import d_args
import time


class Actor(nn.Module):
    def __init__(self):
        super(Actor, self).__init__()
        self.layer_1 = nn.Linear(24, 800)
        self.layer_2 = nn.Linear(800, 600)
        self.layer_3 = nn.Linear(600, 2)
        self.tanh = nn.Tanh()

    def forward(self, s):
        s = F.relu(self.layer_1(s))
        s = F.relu(self.layer_2(s))
        a = self.tanh(self.layer_3(s))
        return a


env = ImplementEnv(d_args)

# Set the parameters for the implementation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # cuda or cpu

# Load the model checkpoint
file_name = "/home/zhengchen/DRL-robot-navigation/TD3/pytorch_models/TD3_velodyne_actor.pth"
actor = Actor()
try:
    actor.load_state_dict(torch.load(file_name))
except:
    raise ValueError("Could not load the stored model parameters")

print("running here")  # debug print (repeated several times in the original)

while True:
    action = torch.tensor([0.0, 0.0])
    s2, toGoal = env.step(action)

    s = np.append(s2, toGoal)
    s = np.append(s, action.numpy())
    print("running here 222")  # debug print (repeated several times in the original)

    while True:
        a = actor(torch.tensor(s, dtype=torch.float).unsqueeze(0))
        aIn = a.clone()
        aIn[0, 0] = (aIn[0, 0] + 1) / 4
        print(aIn)  # debug print (repeated several times in the original)
        s2, toGoal = env.step(aIn[0])
        s = np.append(s2, a[0].detach().numpy())
        s = np.append(s, toGoal)
```

Here is the output.

I don't see any markers in RViz like in the project https://github.com/reiniscimurs/DRL-robot-navigation, as shown in the screenshot.

I checked that I set the goal as x = 4, y = -1.75 in GDAM_args.py, and I used the same Gazebo control library as in the DRL-robot-navigation project.

I am confused that there is no output like "running here" or "running here 222" in terminal 1, but after I stop the terminal with Ctrl+C, some output appears, as shown in the screenshot.

It seems that the number of lidar readings from the ROS topic /r1/front_laser/scan is 360. The slices laser_in[381:420], laser_in[421:460], and beyond are therefore empty in GDAM_env.py, line 211. I am sure that I didn't change the code of multi_robot_scenario.launch in the DRL-robot-navigation project. Could you give me some suggestions for running it correctly?

reiniscimurs commented 1 year ago

Thanks for the images and code snippets.

I am a bit confused about at what point the ValueError: min()... appears. Does it appear when you execute the program, or only after killing it with Ctrl+C?

In any case, it looks like the program can load the weights and get into the step method of the environment. I assume it could be a sensor or compatibility issue. Could you provide the full step method code in your implementation? In GDAM we expect two RpLidars as inputs, with 1440 laser readings each. From these 1440 values we then take the 720 values that form the 180-degree field of view in front of the laser. I assume your setup looks different, so there could be an issue with how the laser data is obtained, as this code would need to be adapted. It could just be that the program is stuck in one of those while loops. What lidar are you using, and how many sensor values does it have?
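As an illustration of that slicing (a sketch only; the actual indexing lives in GDAM_env.py, linked below), taking the frontal 720 readings of a 1440-sample, 360-degree scan might look like this, assuming index 0 sits at the rear of the sweep:

```python
import numpy as np

def frontal_180(laser_in, total_samples=1440):
    # Sketch only (the real slicing is in GDAM_env.py): keep the middle
    # half of a full 360-degree scan, i.e. the 720 readings covering the
    # 180 degrees in front of the sensor. Where "front" sits depends on
    # the sensor's zero-angle convention, so the offsets are assumptions.
    quarter = total_samples // 4
    return np.asarray(laser_in[quarter:total_samples - quarter])
```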

525753936 commented 1 year ago

Thanks for your time and patience. The ValueError: min()... only appears after I press Ctrl+C. The lidar sensor I use is the same as the one defined in the DRL-robot-navigation project; I didn't change anything at all, and I simply subscribe to the same topic r1/front_laser/scan for both laser_in inputs. It is true that the lidar data only contains 380 values, so the code min(laser_in[381:420]), min(laser_in[421:460]), ... in line 201 cannot execute correctly. Does the lidar you use have a 360-degree field of view whose values you then reduce? I think the lidar used in the DRL-robot-navigation project only contains 360 values, which may be the cause of this stuck state.
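One way to confirm the actual sample count before adapting the slice indices (a sketch, not repo code; the topic name is the one used above):

```python
# Quick sanity check: print how many range values the simulated lidar
# actually publishes, so the slice indices in GDAM_env.py can be
# adapted to match.
import rospy
from sensor_msgs.msg import LaserScan

rospy.init_node("scan_probe")
scan = rospy.wait_for_message("/r1/front_laser/scan", LaserScan)
print("samples per scan:", len(scan.ranges))
```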

reiniscimurs commented 1 year ago

The lidars for GDAM are real, physical RpLidar sensors. This means you automatically get a 360-degree range with 1440 samples and have to limit the range by using only the frontal 180 degrees. This is done by collecting only the front readings in the step method: https://github.com/reiniscimurs/GDAE/blob/main/Code/GDAM_env.py#L184-L196

In DRL-robot-navigation we use simulated lidars. There you can constrain the field of view in the sensor setup: https://github.com/reiniscimurs/DRL-robot-navigation/blob/main/catkin_ws/src/multi_robot_scenario/xacro/laser/hokuyo.xacro#L42-L43. Additionally, you can set the number of samples you get out of the simulated sensor. In order to use the simulated sensor in GDAM, you will have to update the sensor code so that it makes sense for your implementation. You can use a single lidar sensor and do not require both, as the current GDAM code does, but you will have to make sure that the gaps are sliced properly.
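If rewriting every slice index is inconvenient, one possible workaround is to interpolate the shorter simulated scan up to the 1440 samples the existing slicing expects — a sketch assuming uniformly spaced beams:

```python
import numpy as np

def upsample_scan(ranges, target=1440):
    # Sketch only: linearly interpolate a shorter scan (e.g. 360 samples
    # from the simulated hokuyo) up to the sample count GDAM's slice
    # indices assume. Alternatively, adapt the indices in GDAM_env.py to
    # the sensor's native resolution.
    x_old = np.linspace(0.0, 1.0, num=len(ranges))
    x_new = np.linspace(0.0, 1.0, num=target)
    return np.interp(x_new, x_old, np.asarray(ranges, dtype=float))
```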

525753936 commented 1 year ago

Hi, thanks for your suggestion. I modified the lidar's sample count for the implementation, and there is no warning or error for that, but the robot still does not move. I checked whether the goal was published with rostopic echo /move_base/TrajectoryPlannerROS/global_plan, and the output was "WARNING: no messages received and simulated time is active. Is /clock being published?" It seems no goal is generated, which I believe is the reason for the stuck state. In GDAM_env.py:

    self.vel_pub = rospy.Publisher('/cmd_vel', Twist, queue_size=1)
    self.vel_pub_rob = rospy.Publisher('/r1/cmd_vel', Twist, queue_size=1)
    topic_nodes_pub = 'vis_mark_array_node'
    self.nodes_pub = rospy.Publisher(topic_nodes_pub, Marker, queue_size=3)
    topic_nodes_pub_closed = 'vis_mark_array_node_closed'
    self.nodes_pub_closed = rospy.Publisher(topic_nodes_pub_closed, Marker, queue_size=3)
    topic_map_nodes = 'vis_mark_array_map_nodes'
    self.map_nodes_viz = rospy.Publisher(topic_map_nodes, Marker, queue_size=3)
    topic = 'vis_mark_array'
    self.publisher = rospy.Publisher(topic, MarkerArray, queue_size=3)
    self.global_goal_publisher = rospy.Publisher('global_goal_publisher', MarkerArray, queue_size=1)
    self.navLaser = rospy.Subscriber('/r1/front_laser/scan', LaserScan, self.Laser_callback, queue_size=1)
    self.navLaserTop = rospy.Subscriber('/r1/front_laser/scan', LaserScan, self.Laser_callback_Top, queue_size=1)
    self.navOdom = rospy.Subscriber('/r1/odom', Odometry, self.Odom_callback, queue_size=1)
    self.path = rospy.Subscriber('/move_base/TrajectoryPlannerROS/global_plan', Path, self.path_callback, queue_size=1)
    self.client = actionlib.SimpleActionClient('move_base', MoveBaseAction)
    self.client.wait_for_server()
    self.listener = tranf.TransformListener()

I don't understand the meaning of the subscriber self.path and the client self.client. Are they served by another ROS package? Also, I cannot echo any data from the topic of self.global_goal_publisher, but I can echo the goal point published by the DRL-robot-navigation project (https://github.com/reiniscimurs/DRL-robot-navigation/blob/main/TD3/velodyne_env.py):

    self.publisher = rospy.Publisher("goal_point", MarkerArray, queue_size=3)
    self.publisher2 = rospy.Publisher("linear_velocity", MarkerArray, queue_size=1)
    self.publisher3 = rospy.Publisher("angular_velocity", MarkerArray, queue_size=1)

Can you give me some advice on this problem? Thanks.

525753936 commented 1 year ago

Sorry for the messy code. Here is an image of the publishers and subscribers.

reiniscimurs commented 1 year ago

Ok, that is actually a good catch. Yes, move_base is a path planner that you need to launch beforehand (https://wiki.ros.org/move_base), and it is part of the navigation stack with its own setup: https://wiki.ros.org/navigation/Tutorials/RobotSetup

Essentially, there is a callback in self.path that receives the generated path from the move_base plugin. But in order to get it, you need a client, self.client, that communicates with this plugin. This client has to establish communication with the move_base server via self.client.wait_for_server(). What could be happening is that you do not have a server running, so the client waits perpetually to establish communication.

There are 2 things you could try:

  1. Set a timeout for self.client.wait_for_server(), with something similar to self.client.wait_for_server(timeout=Duration(5)). I do not recall the actual Python code for this, but see the C++ implementation as guidance here: https://wiki.ros.org/actionlib_tutorials/Tutorials/SimpleActionClient, and the sketch after this list. This should let you see whether the code is stuck waiting for the move_base server.
  2. Remove the move_base and path planner code. It is not strictly necessary, as it only yields a couple more points of interest along the path the robot has already explored; it is not critical to getting the code to run. You can do this by removing any use of self.path and self.client and the related functionality.
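For reference, the rospy timeout variant looks roughly like this sketch: actionlib's SimpleActionClient.wait_for_server returns False on timeout rather than blocking forever, and the snippet assumes a rospy node is already initialized (as it is in GDAM_env.py). The warning text is illustrative:

```python
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction

client = actionlib.SimpleActionClient('move_base', MoveBaseAction)
# Returns False if the server did not come up within 5 seconds,
# instead of blocking forever like the no-argument call.
if not client.wait_for_server(rospy.Duration(5.0)):
    rospy.logwarn("move_base action server not reachable; is move_base launched?")
```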
525753936 commented 1 year ago

Thank you so much for your advice! It finally works! I added a move_base plugin to my workspace with the supporting .yaml files, and now the simulation runs perfectly. One remaining issue is that the trained model does not seem to be good; the robot simply rotates. I will try training it again and see how it goes. Thank you again for sharing your amazing work and for your help.

reiniscimurs commented 1 year ago

That is great!

I still do not see any other possible nodes, though. These should be added based on laser scans and shown as blue dots in rviz. Regarding the network performance, I would still first suggest checking whether the laser input is actually what you expect. Without changing how the data is read and processed in the step method, I would not expect the network to work either.

zahra370 commented 1 year ago

> Hi, thanks for sharing your marvelous work! I got some issues with the simulation using GDAE, because I don't have a physical robot. […]

At this point I'm getting lots of bugs, and the link above isn't working. Could you tell me which part of the code needs to be modified? I've trained the model, but how do I use it with GDAE?

sattlelite commented 1 month ago

I have been trying for a long time to get this work running in my computer simulation environment (Ubuntu 20.04, ROS Noetic). I installed the required packages following your README file. In the test, I used the following commands to set up the simulated interface:

    export ROS_HOSTNAME=localhost
    export ROS_MASTER_URI=http://localhost:11311
    export ROS_PORT_SIM=11311
    export GAZEBO_RESOURCE_PATH=~/GDAM_ws/src/multi_robot_scenario/launch

but the robot remained stationary. Then I used my own map and found that the robot still did not move.

I manually killed the gmapping node and started the slam_toolbox node; below is the rqt_graph. Can you tell me where the problem might be?

reiniscimurs commented 1 month ago

My guess is that if there is no movement and no error messages, it is stuck at the point self.client.wait_for_server(). You could try removing all calls to self.client, as that is not required to actually run the program.

sattlelite commented 4 weeks ago

Hello Mr. Reiniscimurs, I have tried your suggestion and reviewed the other posts under this topic. I commented out the wait_for_server call in the GDAM_env file.

Then I executed the following command to start the program. Through rqt, I discovered that the gym node had disappeared, so I suspect the problem lies in the configuration of move_base. I then uncommented the line, stopped the current run, restarted, and checked rosnode info. According to the DRL-robot-navigation repo you provided, the model I trained achieves 91 successful navigations per 100 resets. If there are no issues with the model and the other ROS packages are configured correctly, then the map reconstructed by SLAM should appear even if the robot is prevented from moving. I will replace slam_gmapping directly with slam_toolbox.

I don't know whether the problem is the same as what I described above. I need your advice, please. Thank you so much.

reiniscimurs commented 4 weeks ago

The first thing you probably need to solve is the error when loading the model, as it appears in your logs. If there is an error, the model will not load and no motion will be carried out. Also, are you trying to load the model from the DRL repo directly into this repo? Note that this will not work directly, as the DRL repo has a different input space and is written in PyTorch, while this repo uses TensorFlow.