angli66 / simsense

A Real-Time Depth Sensor Simulator with GPU Acceleration
MIT License
19 stars 3 forks source link

This library has been integrated into SAPIEN. SAPIEN is a state-of-the-art interactive simulation environment. Powered with SimSense and high-speed ray tracing engine, SAPIEN can simulate realistic depth in real time. Check https://github.com/haosulab/simsense for more details.

SimSense: A Real-Time Depth Sensor Simulator

This is a GPU-accelerated depth sensor simulator for python. The goal of this project is to provide a real-time, adjustable, and off-the-shelf module that can simulate different kinds of real-world depth sensors. Building a reliable and fast depth sensor simulator is vital to closing the sim-to-real gap.

Based on stereo matching, SimSense encapsulates most of the common algorithms in real depth sensors. All you need is passing in left and right images used for stereo matching, and SimSense will directly output a computed depth map that almost looks like generated by a real depth sensor. Check test/test1.py and test/test2.py for examples of usage.

Depth by RealSense D435 (Invalid Marked as Black) Depth by SimSense (Invalid Marked as Dark Blue)

Performance

Experiment settings:

64 Disparity Level 128 Disparity Level 256 Disparity Level
w/o Block Cost 231.1 FPS 148.7 FPS 84.4 FPS
w/ Block Cost 195.7 FPS 124.1 FPS 69.4 FPS

Requirements

CUDA Toolkit needs to be installed to compile .cu source files. Check https://developer.nvidia.com/cuda-downloads for instructions.

Install

Run

git clone git@github.com:angli66/simsense.git
cd simsense
git submodule update --init

to clone the repository and get the 3rd party library.

Before installing, check your NVIDIA GPU's compute capability at https://developer.nvidia.com/cuda-gpus. The default settings supports compute capability of 6.0, 6.1, 7.0, 7.5, 8.0 and 8.6. If yours is not listed above, add the value into line 19 of CMakeLists.txt.

Then, run

pip install .

to build and install the package.

Pipeline

pipeline

Infrared Noise Simulation

This part is optional. Applying infrared noise simulation to infrared input can sometimes enhance the sim-to-real performance. The noise model is derived from Simulating Kinect Infrared and Depth Images by Landau et al.

Rectification

For image pair that is not rectified, rectification needs to be performed before stereo matching to align the image planes of the two images.

Center-Symmetric Census Transform

CSCT is a common pre-processing step performed in many real depth sensors before stereo matching. It transforms the pixels to contain information from their neighbors.

Stereo Matching

The algorithm used here is semi-global block matching, which is a combination of semi-global matching and block matching algorithms. Semi-global matching is a robust matching algorithm that can produce smooth disparity maps. The semi-global block matching algorithm implemented in this module differs from the original H. Hirschmuller algorithm as follows:

Uniqueness Test

Uniqueness test filters out matches by checking its "uniqueness". For a pixel in the left image and its best match in the right image, uniqueness test will find the cost of second best match (excluding the left and right adjacent pixels of the best match), and compare it to the cost of the best one. If the best match's cost does not win the second best match's cost by a certain margin, the match will be marked as invalid.

Left-Right Consistency Check

Left-Right consistency check is another common post-processing step for computing disparity. It filters out matches that are inconsistent between the computed left disparity map and right disparity map.

Median Filter

A median filter is usually applied to smooth the computed disparity map.

Disparity to Depth Conversion

Optional depth registration can be performed at this step, which will reproject the computed depth map from the left camera's frame to a specified RGB camera's frame. This allows easy integration with RGB data to get RGBD image. If RGB camera's frame is not specified, the output depth map will be in left camera's frame with the same size of the input images by default.

Known Issues

Acknowledgement

Some acceleration techniques of the library are inspired by Embedded real-time stereo estimation via Semi-Global Matching on the GPU by D. Hernandez-Juarez et al.