Author: Zhixiang Min
Contact: zmin1@stevens.edu
VOLDOR-SLAM is a real-time dense-indirect SLAM system that takes dense optical flow as input and supports monocular, stereo and RGB-D video sequences. The system logic is written in native Python, which makes it easy to modify and get started with.
VOLDOR-SLAM: For the times when feature-based or direct methods are not good enough
Zhixiang Min, Enrique Dunn
ICRA 2021 [paper] [video]

VOLDOR: Visual Odometry from Log-logistic Dense Optical flow Residuals
Zhixiang Min, Yiding Yang, Enrique Dunn
CVPR 2020 [paper] [video]
Our system is built with CUDA, Cython and Python. We support the following configurations:

- Windows 10 + Visual Studio 2017 / Ubuntu 18.04
- CUDA >= 9.0
- Python 3.6.X
- OpenCV 3.4.X
- Ceres 2.0 (Optional for mapping)
- pyDBoW3 (Optional for loop closure)
```
cd slam_py/install
pip install -r .\requirements.txt
```

Copy all the DLL dependencies mentioned below to the `demo` folder. You may also need to install the VC++ runtime library.

**OpenCV (Required)** Copy `opencv_worldxxx.dll` from the opencv workspace to the `demo` folder.

**PyOpenGL (Optional for viewer)** Requires a GLUT `dll` file to be configured in `PATH`. If you meet `OpenGL.error.NullFunctionError` while launching the viewer, try installing PyOpenGL from the wheels at PyOpenGL-wheels.

**Ceres (Optional for mapping)** Copy `ceres.dll`, `glog.dll`, `libblas.dll`, `libgcc_s_sjlj-1.dll`, `libgfortran-3.dll`, `liblapack.dll` and `libquadmath-0.dll` to the `demo` folder. They are all Ceres dependencies; you can find them under the Ceres and SuiteSparse workspaces.

**pyDBoW3 (Optional for loop closure. Prerequisite for building the full SLAM)** Build pyDBoW3; you will get `pyDBoW3.pyd` when you succeed. Copy it to the `demo` folder. `ORBvoc.bin` is already included in the `demo` folder.

**Build VO Only:**
Edit `slam_py/setup_win_vo.py`. You will need to provide/change `opencv_include_dir`, `opencv_lib_dir` and `opencv_lib_name` according to your opencv installation.

```
cd slam_py/install
python setup_win_vo.py build_ext -i
```

`gpu-kernels.dll` and `pyvoldor_vo.xxx.pyd` will appear in the `install` folder. Copy them to the `demo` folder.

**Build Full SLAM Pipeline:**
Edit `slam_py/setup_win_full.py`. You will need to provide/change `opencv_include_dir`, `opencv_lib_dir`, `opencv_lib_name`, `ceres_include_dirs` and `ceres_lib_dirs` according to your opencv and ceres installations.

```
cd slam_py/install
python setup_win_full.py build_ext -i
```

`gpu-kernels.dll` and `pyvoldor_full.xxx.pyd` will appear in the `install` folder. Copy them to the `demo` folder.

On Ubuntu, set up the dependencies as follows.

**OpenCV (Required)**
`sudo apt install libopencv-dev`. Be aware of its version: we only tested under 3.4.X, though we found most versions work fine. Alternatively, compile from source and `sudo make install`.

**PyOpenGL (Optional for viewer)** `sudo apt install python-opengl`

**Ceres (Optional for mapping)** Build from source and `sudo make install`.

**pyDBoW3 (Optional for loop closure. Prerequisite for building the full SLAM)** Build pyDBoW3; you will get `pyDBoW3.so` when you succeed. Copy it to the `demo` folder. `ORBvoc.bin` is already included in the `demo` folder.

**Build VO Only:**
```
cd slam_py/install
python setup_linux_vo.py build_ext -i
```

`libgpu-kernels.so` and `pyvoldor_vo.xxx.so` will appear in the `install` folder. Copy them to the `demo` folder.

**Build Full SLAM Pipeline:**

```
cd slam_py/install
python setup_linux_full.py build_ext -i
```

`libgpu-kernels.so` and `pyvoldor_full.xxx.so` will appear in the `install` folder. Copy them to the `demo` folder.

Our method takes three inputs. You may download the demo data to run and play with.
Optical flows are REQUIRED for visual odometry. We recommend MaskFlowNet or PWC-Net as the optical flow estimator.
Supported format: `.flo`
Disparity maps are an OPTIONAL input, only present when stereo pairs or depth sensors are available. For stereo pairs, we recommend using the same optical flow estimator running from the left to the right image. For depth sensors, you need to set a virtual baseline value to convert the depth to disparity as input.
Supported format: `.flo`, `.png` (x256 gray-16bit)
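The depth-to-disparity conversion for depth sensors can be sketched as below. The helper name is illustrative; `virtual_bf` stands for the virtual baseline (m) times the focal length (px), and the x256 scaling follows the gray-16bit `.png` format above:

```python
import numpy as np

def depth_to_disparity_png16(depth_m, virtual_bf):
    """Convert a metric depth map (meters) to the x256 gray-16bit disparity encoding.

    virtual_bf: virtual baseline (m) * focal length (px).
    Invalid (non-positive) depths are encoded as 0.
    """
    depth_m = np.asarray(depth_m, dtype=np.float64)
    disp = np.zeros_like(depth_m)
    valid = depth_m > 0
    disp[valid] = virtual_bf / depth_m[valid]  # disparity = bf / depth
    # Scale by 256 and store as 16-bit gray, clipping to the uint16 range.
    return np.clip(disp * 256.0, 0, 65535).astype(np.uint16)
```

The resulting array can be written with `cv2.imwrite('disp.png', encoded)`, which preserves 16-bit single-channel PNGs.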
RGB images are an OPTIONAL input for adding photo-consistency to enhance the frame alignment.
Supported format: all image formats supported by opencv
```
cd demo
python demo.py --help
python demo.py \
    --fx 320 --fy 320 --cx 320 --cy 240 --bf 80 \
    --flow_dir 'path_to_flow' \
    --img_dir 'path_to_img' \
    --disp_dir 'path_to_disp' \
    --mode stereo \
    --enable_mapping \
    --enable_loop_closure './ORBvoc.bin' \
    --save_poses './poses.txt' \
    --save_depths './depths'
```
More details can be found in the comments of the demo script. In the demo data, we also provide bash commands in `run_demo.txt` under each data instance folder.
Viewer Control Manual
> q -> Exit
> m -> Save point cloud to .ply
> r -> Reset view
> h -> Hide cameras / links
> f -> Follow the current camera
> w/s -> Increase/Decrease scene pixel size
> d/a -> Increase/Decrease scene pixel density
> x/z -> Increase/Decrease scene pixel depth range
> Mouse scroll up/down -> Zoom in/out
> Mouse drag with left down -> Scene rotation
> Mouse drag with right down -> Scene translation
Viewer Range. When running on monocular capture, if the world scale is unfortunately initialized too large, the viewer may not display many pixels; this can be adjusted using the x/z keys.
How to pick basefocal. `basefocal` refers to the baseline (meters) times the focal length (px). It decides the scene scale. If the input disparity maps come from rectified stereo, the baseline of the stereo pair should be used. In the case of a depth camera or monocular input, `basefocal` can be used as a hyper-parameter tuning the confidence in the input depth. A proper `basefocal` value should let the displayed keyframe depth map (`tmpkf_depth`) show a proper variance (neither pure white nor pure black).
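As a worked example of the rectified-stereo case, the numbers below are approximate KITTI calibration values, used here only for illustration:

```python
# basefocal = baseline (m) * focal length (px); for rectified stereo,
# use the real stereo baseline of your rig. Approximate KITTI values:
baseline_m = 0.54   # KITTI stereo baseline, roughly 0.54 m
focal_px = 707.0    # horizontal focal length of the rectified images, in px
bf = baseline_m * focal_px
print(bf)  # ~381.8, passed to the demo as --bf 381.8
```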
Input Frame Rate. Our framework favors a modest baseline between frames so that the optical flows stay informative. For high frame rate videos, consider downsampling the frame rate based on appearance change. (E.g. KITTI/TartanAir @ 10Hz, TUM-RGBD @ 3Hz.)
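The frame-rate downsampling above can be as simple as keeping every N-th frame. The helper below is an illustrative sketch, not part of the released scripts:

```python
def subsample_frames(frame_paths, src_hz, target_hz):
    """Keep roughly every (src_hz / target_hz)-th frame to reduce the frame rate."""
    step = max(1, round(src_hz / target_hz))
    return frame_paths[::step]
```

For example, a 30 Hz sequence targeted at 10 Hz keeps every 3rd frame; if the target rate is at or above the source rate, all frames are kept.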
I/O Bottleneck. If you observe choppy GPU load, or the latest map on the viewer is colored gray, the disk I/O is probably bottlenecking the performance. This affects accuracy by blocking local mapping while VO keeps running. Consider moving your data to an SSD, or restart the software a few times so that your disk caches more of the data.
Parameter Tuning. Usually the default parameters work for all. You may try loosening the loop closure thresholds to detect more loop closures. For very challenging datasets, if the full pipeline is not stable, try disabling local mapping. For tuning VO parameters, pass parameters using `slam.voldor_user_config`. All available parameters can be found in `voldor/config.h` with descriptions. For SLAM parameters, check the `__init__` method of `VOLDOR_SLAM`.
What is mono-scaled mode? Mono-scaled mode uses the disparity map only for correcting the world scale.
The software is released under the attached `LICENSE` file.
Parts of the software rely on the following open source projects:
LambdaTwist, OpenCV, pyDBoW3, DBoW3, Ceres, SuiteSparse, Eigen, svd3, PyOpenGL