TempleRAIL / drl_vo_nav

[T-RO 2023] DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
https://doi.org/10.1109/TRO.2023.3257549
GNU General Public License v3.0

Testing Trained Model on Real Robot #19

Closed FurkanEdizkan closed 2 months ago

FurkanEdizkan commented 4 months ago

Hi,

I have used your project to train a model. Now I want to run it on an actual robot, outside of simulation. However, when I checked the code there were no ROS topic connections to the ZED camera topics or other related sensor inputs. All related data comes directly from the Gazebo environment; no direct sensory information is fed into the model during training or the navigation demo in simulation.

How can I make the model run on a real-world robot outside of the simulation?

(great paper with really nice open code by the way)


Edit:

I have read the closed issues, and from https://github.com/TempleRAIL/drl_vo_nav/issues/11 I understand that some of the code essential for running on hardware is not open-sourced because of a commercial license. 😅

Do you have any suggestions or go-to starting points if we still want to use this project to train a model and deploy it on hardware?

zzuxzt commented 4 months ago

Thanks for your interest in our work. Sorry for the inconvenience with the hardware deployment. If you want to run it on a real robot, a simple way is to use the ZED2 camera rather than the ZED1 camera, since the ZED2 provides pedestrian tracking information directly. In that case, you do not need to use YOLO & MHT to obtain it. Another temporary approach is to use only the lidar information and ignore moving pedestrians (i.e., set the pedestrian kinematic maps to 0); a relevant deployment example can be found in our BARN Challenge repository.
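For reference, a minimal sketch of what "set the pedestrian maps to 0" could look like when assembling the observation for the policy. The shapes, names, and concatenation order here are assumptions based on the paper's 80x80 map description, not the repository's exact layout:

```python
import numpy as np

# Illustrative sketch: zero out the pedestrian kinematic maps so the trained
# policy effectively relies only on the lidar historical maps and the goal.
# Shapes and names are assumptions, not the repository's exact code.
PED_MAP_SHAPE = (2, 80, 80)  # assumed (vx, vy) channels on an 80x80 grid

def make_observation(lidar_maps: np.ndarray, goal: np.ndarray) -> np.ndarray:
    ped_maps = np.zeros(PED_MAP_SHAPE, dtype=np.float32)  # ignore pedestrians
    return np.concatenate([
        ped_maps.ravel(),
        lidar_maps.ravel().astype(np.float32),
        goal.astype(np.float32),
    ])
```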

By the way, I plan to open source the ready-to-run code for hardware deployment using the ZED2 camera and 2D lidar sensors soon.

FurkanEdizkan commented 4 months ago

Thanks for getting back to me so quickly. So, it looks like I might have to dive into creating a pedestrian tracking module using YOLO and MHT. :sweat_smile: My robot doesn't have a ZED camera; I am using an Intel RealSense D435i depth camera.

I'm thinking about adjusting the code to make it compatible with YOLO and MHT. If I can pull it off, it would solve my problem.


> By the way, I plan to open source the ready-to-run code for hardware deployment using the ZED2 camera and 2D lidar sensors soon.

This would really help me, since I would then know where to feed in the output of the pedestrian tracking and how to map the pedestrian locations.


I have one more question: since we are training with ground-truth data from Gazebo, do we need to retrain the model once we feed real sensor data into the navigation stack? The model was trained with near-perfect data, and real-world sensor data will not be as clean.

I have watched the test videos; I am just wondering whether a model trained on direct Gazebo data can still work as expected when deployed on a real-world robot.

zzuxzt commented 4 months ago

Yeah, but it is not limited to YOLO & MHT. You can use any detection & tracking module to provide pedestrian kinematics; the D435 camera also works. The DRL-VO policy requires pedestrian kinematics as input, but it does not matter how you obtain them (although different sources may result in small differences).

For this question, you just need to encode the pedestrian information into pedestrian kinematic maps as I did in this repository or in the BARN Challenge repository. The pipeline is the same as using the ground truth pedestrian information.
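To make the encoding step concrete, here is a rough sketch of how tracked pedestrians could be rasterized into kinematic maps, assuming each track gives a relative position and velocity in the robot frame. The grid size, resolution, and axis conventions below are illustrative assumptions, not the repository's exact implementation:

```python
import numpy as np

# Illustrative rasterization of tracked pedestrians into two kinematic maps
# (one per velocity component). Grid size, resolution, and axis conventions
# are assumptions for the sketch.
GRID = 80    # cells per side
RES = 0.25   # metres per cell -> a 20 m x 20 m window around the robot

def encode_pedestrians(tracks):
    """tracks: iterable of (x, y, vx, vy) in the robot frame [m, m/s]."""
    vx_map = np.zeros((GRID, GRID), dtype=np.float32)
    vy_map = np.zeros((GRID, GRID), dtype=np.float32)
    for x, y, vx, vy in tracks:
        col = int(GRID / 2 + x / RES)  # assumed forward axis -> columns
        row = int(GRID / 2 - y / RES)  # assumed left axis -> rows
        if 0 <= row < GRID and 0 <= col < GRID:
            vx_map[row, col] = vx
            vy_map[row, col] = vy
    return np.stack([vx_map, vy_map])
```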

The sim-to-real gap is always an important issue for robot navigation, and there are many influencing factors. As I said in the paper, several design choices in our DRL-VO policy help bridge the sim-to-real gap: a 3D Gazebo simulator with an imperfect processing pipeline, a preprocessed data representation, and a well-designed reward function. At least in my paper's setting, the DRL-VO policy trained in the simulator can be deployed directly to the real world without any fine-tuning/retraining.

However, every learning-based control strategy has its own range of generalization, and the final policy performance also depends on the training setup and the specific hardware deployment. Many researchers fine-tune in the real world as well. So, depending on your situation, I would suggest testing your trained policy yourself. If you are not satisfied with its performance, you can improve it by fine-tuning or retraining with real-world data.

FurkanEdizkan commented 4 months ago

I understand, thank you for explaining! 😄

I will attempt to train a model with my robot in simulation using Gazebo data, and then create a solution for pedestrian tracking on a real-world robot.

FurkanEdizkan commented 3 months ago

I have been experimenting with DRL-VO; the solution presented in the paper is really great, but I have been limited by hardware. On your robot I suppose the pedestrian locations are provided by the ZED camera, which outputs pedestrian location info directly, so no extra computation is required from the robot computer. However, I am using an Intel camera :worried:, so I need to run Darknet or another YOLO variant, detect pedestrian bounding boxes, segment the person within each detected box, get the coordinates of each detected pedestrian, and apply transformations to get the pedestrian's location relative to the map. It all sounds great, but my robot computer will be burning by then :laughing:.
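(For anyone building that pipeline, the final transform step is fairly standard ROS. Below is a hedged sketch, assuming the detector already yields a 3D point in the camera's optical frame and that the tf tree connects it to the map frame; the frame names and function are placeholders for a specific setup, not code from this repository.)

```python
#!/usr/bin/env python
import rospy
import tf2_ros
import tf2_geometry_msgs  # registers geometry_msgs types with tf2
from geometry_msgs.msg import PointStamped

# Illustrative only: transform a pedestrian detection from the camera optical
# frame into the map frame. Frame names are placeholders for a specific setup,
# and the tf buffer needs a moment after startup to fill with transforms.
rospy.init_node("ped_transform_example")
tf_buffer = tf2_ros.Buffer()
tf_listener = tf2_ros.TransformListener(tf_buffer)

def camera_point_to_map(x, y, z, camera_frame="camera_color_optical_frame"):
    pt = PointStamped()
    pt.header.frame_id = camera_frame
    pt.header.stamp = rospy.Time(0)  # use the latest available transform
    pt.point.x, pt.point.y, pt.point.z = x, y, z
    return tf_buffer.transform(pt, "map", rospy.Duration(0.2))
```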

I want to train the model using only lidar data, so it can still navigate in a world with a high pedestrian presence.

What should I change in order to train the model with lidar data only?

zzuxzt commented 3 months ago

In fact, the pedestrian information in my experiments was computed through the YOLO+MHT framework, but such a complex procedure can be replaced with a ZED2 camera.

If you want to train the model using only the lidar historical maps, you can 1) directly set all the values in the pedestrian kinematic maps to 0, or 2) remove the pedestrian kinematic maps and modify the number of DRL-VO network input channels. For 1), you can refer to my deployment code for the ICRA 2022 BARN Challenge. For 2), you can refer to the lidar baseline without pedestrian kinematic maps in my IROS paper.
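For option 2), a minimal illustration of what "modify the number of input channels" means, assuming the feature extractor starts with a 2D convolution over the stacked maps. The channel counts, kernel size, and layer here are placeholders, not the repository's actual network:

```python
import torch
import torch.nn as nn

# Illustrative only: if the two pedestrian kinematic map channels are dropped,
# the first convolution must accept correspondingly fewer input channels.
IN_CHANNELS_FULL = 3        # assumed: 2 pedestrian kinematic maps + 1 lidar map
IN_CHANNELS_LIDAR_ONLY = 1  # lidar historical map only

conv1 = nn.Conv2d(IN_CHANNELS_LIDAR_ONLY, 64, kernel_size=3, stride=1, padding=1)
x = torch.zeros(1, IN_CHANNELS_LIDAR_ONLY, 80, 80)  # dummy lidar map batch
print(conv1(x).shape)  # torch.Size([1, 64, 80, 80])
```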

FurkanEdizkan commented 3 months ago

So you used the robot computer to get the pedestrian information, but didn't it put a huge load on the robot PC? I was worried about running DRL-VO alongside other code and ROS navigation on the robot PC. Because of pedestrian safety, there must of course be extra safety features against collisions and other dangerous situations, and if the robot's maximum speed increases we need to detect and react accordingly. Image frames drop under high load, and if we detect a person in front, for example, the robot might react too late. That is why I am trying to decrease the processing load on my robot PC.

This makes me wonder: lidar data arrives essentially in real time, but image data processed with YOLO and other algorithms comes in at 15-30 frames per second. I suppose that when a detected person is moving, their real position and mapped position differ, even though we train to avoid the heading direction of a moving pedestrian. How impactful is using RGB-D data if we don't have a strong computer?

My robot computer: NVIDIA Jetson AGX Xavier.

zzuxzt commented 3 months ago

The Jetson AGX Xavier is very powerful as an embedded computer. I also used it on my robot, and the camera processing pipeline also runs at around 15~30 frames/s. Only in super crowded scenes does it bring a huge load, but for me that is acceptable since I am not using it for commercial purposes. If you need more computing resources for other applications, you can improve the hardware setup (computer or camera) or just use lidar data without the camera. I completely understand your concerns. Balancing performance and resources has always been a tricky challenge for us.

FurkanEdizkan commented 1 month ago

Just wanted to say thanks; I managed to make it work and forgot to thank you for answering my questions. I have been training models for a while, and the results are really good so far. I will keep you informed about the progress.

Su-study commented 1 month ago

Congratulations on your success! I'm also very interested in how to use the camera; can you share your solution? Looking forward to your reply!

> Just wanted to say thanks; I managed to make it work and forgot to thank you for answering my questions. I have been training models for a while, and the results are really good so far. I will keep you informed about the progress.