ZYangChen / MoCha-Stereo

[CVPR2024] The official implementation of "MoCha-Stereo: Motif Channel Attention Network for Stereo Matching".
MIT License

MoCha-Stereo (Matcha Algorithm)

https://github.com/ZYangChen/MoCha-Stereo/assets/108012397/2ed414fe-d182-499b-895c-b5375ef51425

V1 Version

     

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching
Ziyang Chen†, Wei Long†, He Yao†, Yongjun Zhang✱, Bingshu Wang, Yongbin Qin, Jia Wu
CVPR 2024
Correspondence: ziyangchen2000@gmail.com; zyj6667@126.com✱

@inproceedings{chen2024mocha,
  title={MoCha-Stereo: Motif Channel Attention Network for Stereo Matching},
  author={Chen, Ziyang and Long, Wei and Yao, He and Zhang, Yongjun and Wang, Bingshu and Qin, Yongbin and Wu, Jia},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={27768--27777},
  year={2024}
}

Requirements

Python = 3.8

CUDA = 11.3

conda create -n mocha python=3.8
conda activate mocha
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113

The following libraries are also required:

tqdm
tensorboard
opt_einsum
einops
scipy
imageio
opencv-python-headless
scikit-image
timm==0.6.5
six
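
To confirm that the environment matches the list above, a small check along the following lines may help. This is only a sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes the entries above are the pip distribution names (name matching for hyphens and underscores can vary between Python versions).

```python
from importlib import metadata

# Distribution names copied from the list above.
# A value of None means "any installed version is accepted".
REQUIRED = {
    "tqdm": None,
    "tensorboard": None,
    "opt_einsum": None,
    "einops": None,
    "scipy": None,
    "imageio": None,
    "opencv-python-headless": None,
    "scikit-image": None,
    "timm": "0.6.5",
    "six": None,
}


def missing_packages(required=REQUIRED):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if wanted is not None and found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in missing_packages() or ["all listed packages found"]:
        print(line)
```

Run it inside the activated `mocha` environment; each reported line names one package that is absent or whose version differs from the pinned one.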

Dataset

To evaluate/train MoCha-Stereo, you will need to download the required datasets.

By default, stereo_datasets.py searches for the datasets in the locations shown below. If the datasets were downloaded elsewhere, you can create symbolic links to them inside the datasets folder.

├── datasets
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Monkaa
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Driving
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2015
            ├── testing
            ├── training
        ├── KITTI_2012
            ├── testing
            ├── training
    ├── Middlebury
        ├── MiddEval3
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
        ├── two_view_testing
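
The layout above can be linked and checked with a short script before training or evaluation. The following is a minimal sketch under stated assumptions: the helper names `link_dataset` and `missing_dirs` are hypothetical (they are not part of this repository), and the expected paths are simply copied from the tree above rather than read from stereo_datasets.py.

```python
from pathlib import Path

# Relative directories copied from the tree above.
EXPECTED = [
    "FlyingThings3D/frames_cleanpass",
    "FlyingThings3D/frames_finalpass",
    "FlyingThings3D/disparity",
    "Monkaa/frames_cleanpass",
    "Monkaa/frames_finalpass",
    "Monkaa/disparity",
    "Driving/frames_cleanpass",
    "Driving/frames_finalpass",
    "Driving/disparity",
    "KITTI/KITTI_2015/testing",
    "KITTI/KITTI_2015/training",
    "KITTI/KITTI_2012/testing",
    "KITTI/KITTI_2012/training",
    "Middlebury/MiddEval3",
    "ETH3D/two_view_training",
    "ETH3D/two_view_training_gt",
    "ETH3D/two_view_testing",
]


def link_dataset(source, name, root="datasets"):
    """Create <root>/<name> as a symbolic link to an existing download."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    link = root / name
    if not link.exists():
        link.symlink_to(Path(source).resolve(), target_is_directory=True)
    return link


def missing_dirs(root="datasets", expected=EXPECTED):
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    # is_dir() follows symbolic links, so linked datasets count as present.
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    for rel in missing_dirs():
        print("missing:", rel)
```

For example, `link_dataset("/data/ETH3D", "ETH3D")` would link a download stored at the (hypothetical) path `/data/ETH3D`, and `missing_dirs()` then lists whatever is still absent.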

Training

python train_stereo.py --batch_size 8 --mixed_precision

Evaluation

To evaluate a trained model on a validation set (e.g. Middlebury full resolution), run

python evaluate_stereo.py --restore_ckpt models/mocha-stereo.pth --dataset middlebury_F

Pretrained weights are available here.

FAQ

Q1: Where can I find the weights for "tf_efficientnetv2_l"?

A1: Please refer to issue #6 "Question about the tf_efficientnetv2_l checkpoint" (关于tf_efficientnetv2_l检查点的问题), #8 "Pretrained weights" (预训练权重), and #9 "code error".

Todo List

Acknowledgements