
Manydepth2: Motion-Aware Self-Supervised Monocular Depth Estimation in Dynamic Scenes

[Link to paper](https://arxiv.org/pdf/2312.15268)

This is the official implementation of Manydepth2.

We introduce Manydepth2, a Motion-Guided Depth Estimation Network, to achieve precise monocular self-supervised depth estimation for both dynamic objects and static backgrounds.


Overview

Manydepth constructs the cost volume using both the reference and target frames, but it overlooks the dynamic foreground, which can lead to significant errors when handling dynamic objects:

Qualitative comparison on Cityscapes

This phenomenon arises from the fact that real optical flow consists of both static and dynamic components. To construct an accurate cost volume for depth estimation, it is essential to extract the static flow. The entire pipeline of our approach can be summarized as follows:
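For illustration, in our own notation (a sketch of the standard rigid-flow decomposition, not necessarily the exact formulation used in the paper), the observed flow at a pixel $p$ splits as

$$F_{\text{total}}(p) = F_{\text{static}}(p) + F_{\text{dynamic}}(p), \qquad F_{\text{static}}(p) = \pi\!\left(K \, T_{t \to t+1} \, D_t(p) \, K^{-1} p\right) - p,$$

where $K$ is the camera intrinsic matrix, $T_{t \to t+1}$ the relative camera pose, $D_t(p)$ the predicted depth, and $\pi(\cdot)$ the perspective projection (division by the last coordinate). $F_{\text{static}}$ is the flow induced purely by camera ego-motion over a rigid scene, while $F_{\text{dynamic}}$ is the residual caused by independently moving objects; only the static part is consistent with the depth-and-pose reprojection used to build the cost volume.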

Structure

The contributions described in our paper enable accurate depth estimation on both the KITTI and Cityscapes datasets:

Predicted Depth Maps on KITTI:

[KITTI depth predictions]

Error Map for Predicted Depth Maps on Cityscapes:

[Cityscapes error maps]

✏️ 📄 Citation

If you find our work useful or interesting, please cite our paper:

Manydepth2: Motion-Aware Self-Supervised Monocular Depth Estimation in Dynamic Scenes

Kaichen Zhou, Jia-Wang Bian, Qian Xie, Jian-Qing Zheng, Niki Trigoni, Andrew Markham 

📈 Results

At the same input resolution, Manydepth2 outperforms all previous methods across most metrics, whether or not the baselines use multiple frames at test time. See our paper for full details.

KITTI results table

💾 Dataset Preparation

For instructions on downloading the KITTI dataset, see Monodepth2.

Make sure you have first run export_gt_depth.py to extract ground truth files.

You can also download the exported ground-truth files from this link (KITTI_GT) and place them under splits/eigen/.
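For reference, the export step in Monodepth2 is typically invoked as follows (a sketch, assuming the KITTI raw data sits in kitti_data/ and that this repository keeps Monodepth2's script interface):

python export_gt_depth.py --data_path kitti_data --split eigen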

For instructions on downloading the Cityscapes dataset, see SfMLearner.

👀 Reproducing Paper Results

Prerequisite

To replicate the results from our paper, please first create and activate the provided environment:

conda env create -f environment.yaml

Once all packages have been installed successfully, please execute the following command:

conda activate manydepth2

Next, please download the pretrained optical-flow (GMFlow) weights from this link: Weights For GMFLOW, and place them under the pretrained/ folder.
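For example (a sketch; the checkpoint filename below is a placeholder, substitute whatever the download actually provides):

mkdir -p pretrained
mv /path/to/gmflow_checkpoint.pth pretrained/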

Training (W Optical Flow)

After finishing the dataset and environment preparation, you can train Manydepth2 by running:

sh train_many2.sh

To reproduce the results on Cityscapes, we froze the teacher model at the 5th epoch and set the height to 192 and width to 512.
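If you would rather pass these settings on the command line than edit the script, the flags below are a sketch that assumes Manydepth2 keeps Manydepth/Monodepth2-style option names; please verify them against train_many2.sh:

# hypothetical flags, appended to the training command in train_many2.sh
--height 192 --width 512 --freeze_teacher_epoch 5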

Training (W/O Optical Flow)

To train Manydepth2-NF, please run:

sh train_many2-NF.sh

Testing

To evaluate a model on KITTI, run:

sh eval_many2.sh

To evaluate a model trained without optical flow (Manydepth2-NF), run:

sh eval_many2-NF.sh

👀 Reproducing Baseline Results

To train the Manydepth baseline, run:

sh train_many.sh

To evaluate Manydepth on KITTI, run:

sh eval_many.sh

💾 Pretrained Weights

You can download the weights for several pretrained models here and save them using the following directory structure:

--logs
  --models_many
    --...
  --models_many2
    --...
  --models_many2-NF
    --...
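With the weights in place, each evaluation script is expected to load from the matching folder above (an assumption based on this layout; check the eval_*.sh scripts for the exact paths). A quick sanity check that the files landed where intended:

ls logs/models_many2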

🖼 Acknowledgement

Many thanks to GMFlow, SfMLearner, Monodepth2, and Manydepth.