NVIDIA-ISAAC-ROS / isaac_ros_visual_slam

Visual SLAM/odometry package based on NVIDIA-accelerated cuVSLAM
https://developer.nvidia.com/isaac-ros-gems
Apache License 2.0

Support for RGB-D inputs instead of stereo pair #65

grahamfletcher-ms closed 1 year ago

grahamfletcher-ms commented 1 year ago

Overview

It would be very helpful for our project if Elbrus could support RGB-D inputs in addition to stereo pairs. We would very much like to leverage this package for GPU-accelerated VO, but we depend on the IR emitter for our RealSense D455, rendering the IR stereo pair unusable. A slightly less desirable but simpler alternative would be for this package to support monocular VO/SLAM, which appears to be supported by Elbrus.

Use case details

Our solution leverages the RealSense D455 as our primary vision sensor. The RealSense ROS node publishes an RGB image, depth image, and point cloud. All of these topics are utilized by various other portions of our solution based on specific needs. Additionally, and critically, we need to enable the IR projector on the RealSense to improve fidelity of the disparity map for our application.

Our custom SLAM solution leverages factor graphs with GTSAM using a variety of standard and custom factor types, including wheel odometry, AprilTags, and floor lines. We do not currently support operation in non-structured environments, but this is a gap that Elbrus should hopefully be able to help fill. The VO output could trivially be added to the factor graph along with other existing factor types, improving overall robustness and expanding supported environments. We have verified that this indeed works in practice when disabling the IR emitter and passing in the IR stereo pair to this package's node.
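As a rough illustration of how VO output could feed a factor graph, the sketch below turns two consecutive VO poses into the relative-pose measurement that a between-style odometry factor would constrain. It is planar (x, y, theta) for brevity, whereas the VO output here is 3D, and the names `Pose2`, `wrap_angle`, and `relative_pose` are hypothetical. The GTSAM wiring itself is not shown.

```python
import math
from typing import Tuple

# Hypothetical planar pose: (x, y, theta) in a fixed odometry frame.
Pose2 = Tuple[float, float, float]


def wrap_angle(theta: float) -> float:
    """Wrap an angle to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def relative_pose(a: Pose2, b: Pose2) -> Pose2:
    """Return pose b expressed in the frame of pose a (inverse(a) composed with b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    c, s = math.cos(a[2]), math.sin(a[2])
    # Rotate the odometry-frame displacement into a's local frame.
    return (c * dx + s * dy, -s * dx + c * dy, wrap_angle(b[2] - a[2]))


# Two consecutive VO poses (made-up values): facing +y, moved 1 m forward.
prev_vo = (1.0, 2.0, math.pi / 2)
curr_vo = (1.0, 3.0, math.pi / 2)
odom_measurement = relative_pose(prev_vo, curr_vo)
```

Using the relative motion between frames, rather than the absolute VO pose, keeps accumulated VO drift from being treated as a global position measurement.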

Feature request

With the IR emitter enabled, the projected dot array shows up very clearly in both the left and right IR images. Since the projection moves with the camera, it can give the appearance of no motion between consecutive frames, rendering VO useless. (This is presumably why the option is turned off in the sample launch file.) There are several other potential solutions worth considering.

RGB-D inputs

Since we are currently unable to turn off the IR emitter, it would be preferable to have an alternate interface for Elbrus that would allow for providing an RGB image along with a depth map. In cases where an existing depth map is not available, Elbrus clearly has performance advantages for operating directly on distorted images with sparse feature points. However, if a depth map is already available, it would seem advantageous to use it. I don't know the implementation details for Elbrus, but I know this should be possible in a number of ways with varying levels of integration.

Mono VO/SLAM

As a follow-up, when looking into the interfaces for the Elbrus library, I noted the following for ELBRUS_Track():

 * images - is a pointer to single image in case of mono or
 * array of two images in case of stereo.

In the shorter term, would it be possible to add an additional option to run this node in monocular mode with a single image topic? I've only seen reference to benchmarks performed with stereo cameras, so I'm not sure of the performance implications and extent of support for running mono.


Thank you for the support!

gordongrigor commented 1 year ago

VSLAM is designed to run on a left and right stereo image pair. A VSLAM function that relies on a depth image and an RGB image is a completely separate algorithm, which we are not developing.

Fortunately, there is a quick and easy solution to this for your application with the D455.

Assuming you are running at 30 fps on the D455, you can configure the imager to run at 60 fps and alternate frames with the IR emitter enabled and disabled (the setting should be RS2_OPTION_EMITTER_ON_OFF, but consult the D455 documentation). Frames captured with the emitter on are processed by your existing perception functions, and frames captured with the emitter off are fed into VSLAM.

You will need to write a splitter ROS node that receives all frames from the D455 and publishes emitter-on and emitter-off frames on separate image topics. Alternatively, you could modify the VSLAM node to discard emitter-on frames on the subscribed image topic from the D455, and vice versa for the other perception graphs that need the emitter enabled.
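For illustration, the routing logic of such a splitter might look like the sketch below. It is plain Python, not a ROS node, and `Frame`, `emitter_on`, and `split_by_emitter` are hypothetical names. In a real node, the emitter state would have to come from the camera's per-frame metadata, which is assumed here to be available as a boolean.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for an image message plus its emitter metadata."""
    seq: int          # frame counter from the camera
    emitter_on: bool  # True if the IR projector was firing for this frame


def split_by_emitter(frames: Iterable[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Route frames into (emitter-on, emitter-off) streams."""
    with_dots: List[Frame] = []     # for depth and other existing perception
    without_dots: List[Frame] = []  # for visual SLAM
    for frame in frames:
        (with_dots if frame.emitter_on else without_dots).append(frame)
    return with_dots, without_dots


# Simulate one second of 60 fps capture with the emitter alternating per frame.
capture = [Frame(seq=i, emitter_on=(i % 2 == 0)) for i in range(60)]
depth_stream, vslam_stream = split_by_emitter(capture)
print(len(depth_stream), len(vslam_stream))
```

With strict alternation at 60 fps, each output stream ends up at the original 30 fps.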

Happy Holidays

hemalshahNV commented 1 year ago

Take a look at the Isaac ROS Nvblox + Isaac ROS Visual SLAM tutorial on RealSense for how to use a splitter ROS node and time-synchronize the emitter so that both depth and raw IR images are available.

grahamfletcher-ms commented 1 year ago

Thank you for the quick response and helpful leads!

I was not aware of this configuration option for the RealSense, but I agree that toggling the emitter on/off in conjunction with the splitter makes total sense in this situation. Thanks for directing me to the sample—we'll give this a try some time after the new year.

Regarding my other question on monocular vSLAM: Is that something that's currently supported by Elbrus? It appears so based on the rig configuration and various related comments. Just would like to keep this in mind for potential future use.

Thanks again, and happy holidays!

gordongrigor commented 1 year ago

The VSLAM package does not support a mono camera; it is designed for stereo cameras.