MIT-SPARK / Kimera-Semantics

Real-Time 3D Semantic Reconstruction from 2D data
BSD 2-Clause "Simplified" License

Issues while adding support for Intel RealSense T265 #50

Open bhchiang opened 3 years ago

bhchiang commented 3 years ago

I'm picking up ROS to write a launch file that runs live metric reconstruction (no semantic annotation yet) with the Intel RealSense T265 (tracking is provided by two grayscale fisheye cameras). I'm running into a few issues:

  1. I saw from previous issues that the camera pose is provided via the sensor_frame argument, according to the Voxblox node documentation. I recorded my current TF graph (published by the realsense-ros node); see the image below.

Unfortunately, no frame called world is being published, although that is the default value of world_frame. What should I set world_frame and sensor_frame to?

image

  2. Since the T265 doesn't provide depth information, I'm setting run_stereo_dense to true to obtain depth predictions via stereo_image_proc. Unfortunately, the node is throwing errors since it requires RGB stereo input, while the T265 only provides grayscale pairs. Do you know of any alternatives to this package?

Thanks!

FPSychotic commented 3 years ago

If you have enough computing power, maybe you can get depth from the T265 by treating it as a stereo camera, following the RTAB-Map method in the link below; a depth message will be published. http://official-rtab-map-forum.67519.x6.nabble.com/Comparison-between-realsense-D435-vs-T265-vs-T265-D435-dual-setup-td6456.html

I wonder whether your errors with stereo_image_proc happen because you didn't calibrate the camera. If you use the T265 as a stereo camera and compute depth from the left and right streams, you must calibrate it. RTAB-Map gives you an easy way to do that, but there are a few other reliable ways. The T265 streams are RGB (RGB8), I think, so stereo_image_proc should work once the calibration is in place. Stereo depth from a T265 on an SBC or embedded computer will be hard work, but you are probably using a desktop.
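To illustrate why the calibration matters for stereo depth, here is a minimal sketch of the standard pinhole relation between disparity and depth for a rectified pair, Z = f * B / d. The function name and the numbers in the usage note are hypothetical and are not taken from this thread or from a real T265 calibration; the focal length and baseline would have to come from your own calibration.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal pixel shift between left and right images
    focal_px:     focal length in pixels (from calibration)
    baseline_m:   distance between the two cameras in metres (from calibration)
    """
    if disparity_px <= 0:
        # Zero or negative disparity means no valid match (point at infinity).
        return float("inf")
    return focal_px * baseline_m / disparity_px
```

With illustrative values such as `disparity_to_depth(10.0, 285.0, 0.064)`, the result is about 1.82 m. An error in the focal length or baseline scales every depth value by the same factor, which is why an uncalibrated pair gives unusable depth.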

The ideal would be a D435 if you need an easy and performance-friendly option. I'm not an expert, so please take my comment as just one more avenue to explore. I'm also wondering what you mean by "live metric reconstruction"; could you link an example, video, or paper for me?

Ah, the world frame is basically tied to a GPS frame. It would be better to use the map frame, which suits indoor mapping better, and that is probably where Kimera is used most. RTAB-Map will give you a map frame at the top of the frame tree; you can republish it under the name world with an identity transform (TF 0,0,0 0,0,0).
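The republishing step could be sketched as follows for ROS 1, assuming rospy and tf2_ros are installed. The function names, the node name, and the default frame names are illustrative assumptions; check your own TF tree before choosing which frame to attach world to.

```python
import math


def euler_to_quaternion(roll, pitch, yaw):
    """Convert roll/pitch/yaw in radians to a quaternion (x, y, z, w)."""
    cr, sr = math.cos(roll / 2), math.sin(roll / 2)
    cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
    cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
    return (
        sr * cp * cy - cr * sp * sy,
        cr * sp * cy + sr * cp * sy,
        cr * cp * sy - sr * sp * cy,
        cr * cp * cy + sr * sp * sy,
    )


def publish_world_alias(parent_frame="world", child_frame="map"):
    """Publish a static identity transform parent_frame -> child_frame.

    Frame names are assumptions; ROS imports are kept local so the
    helper above can be used without a ROS installation.
    """
    import rospy
    import tf2_ros
    from geometry_msgs.msg import TransformStamped

    rospy.init_node("world_alias_publisher")  # hypothetical node name

    msg = TransformStamped()
    msg.header.stamp = rospy.Time.now()
    msg.header.frame_id = parent_frame
    msg.child_frame_id = child_frame
    # Translation fields default to 0, 0, 0; only the rotation is set here.
    qx, qy, qz, qw = euler_to_quaternion(0.0, 0.0, 0.0)
    msg.transform.rotation.x = qx
    msg.transform.rotation.y = qy
    msg.transform.rotation.z = qz
    msg.transform.rotation.w = qw

    broadcaster = tf2_ros.StaticTransformBroadcaster()
    broadcaster.sendTransform(msg)
    rospy.spin()  # keep the node alive so the latched transform stays available
```

Calling `publish_world_alias()` from a node script adds world as the parent of map with zero translation and zero rotation, so the two frames coincide. The "0,0,0 0,0,0" in the comment corresponds to x, y, z and roll, pitch, yaw, which map to the quaternion (0, 0, 0, 1).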