
ChronoDepth: Learning Temporally Consistent Video Depth from Video Diffusion Priors
MIT License

This repository represents the official implementation of the paper titled "Learning Temporally Consistent Video Depth from Video Diffusion Priors".


Jiahao Shao*, Yuanbo Yang*, Hongyu Zhou, Youmin Zhang, Yujun Shen, Matteo Poggi, Yiyi Liao†

πŸ“’ News

2024-06-11: Added Hugging Face Space demo - try it out with your videos for free!
2024-06-11: Added paper and inference code (this repository).

πŸ› οΈ Setup

We tested our code in the following environment: Ubuntu 20.04, Python 3.10.14, CUDA 11.3, RTX A6000.

  1. Clone this repository.
    git clone https://github.com/jhaoshao/ChronoDepth
    cd ChronoDepth
  2. Install packages
    conda create -n chronodepth python=3.10
    conda activate chronodepth
    pip install -r requirements.txt
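With the environment activated, a quick sanity check can confirm the setup before running inference. This is an optional sketch; it assumes PyTorch is among the pinned packages in requirements.txt (typical for a diffusion model, but not stated explicitly above):

```shell
# Sanity check for the chronodepth environment (run after `conda activate chronodepth`).
python --version    # expect Python 3.10.x
# The torch import assumes PyTorch is installed by requirements.txt;
# on a CUDA machine this should print "CUDA available: True".
python -c "import torch; print('CUDA available:', torch.cuda.is_available())" 2>/dev/null \
    || echo "PyTorch not importable - check that requirements installed cleanly"
```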

πŸš€ Run inference

Run the Python script run_infer.py as follows:

python run_infer.py \
    --model_base=jhshao/ChronoDepth \
    --data_dir=assets/sora_e2.mp4 \
    --output_dir=./outputs \
    --num_frames=10 \
    --denoise_steps=10 \
    --window_size=9 \
    --half_precision \
    --seed=1234

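To process several clips with the same settings, the documented command can be wrapped in a small shell loop. This is a convenience sketch; the assets/ glob is a placeholder for wherever your videos live:

```shell
# Batch inference sketch: run ChronoDepth over every .mp4 in a folder,
# writing each result to its own subdirectory under ./outputs.
for video in assets/*.mp4; do
    [ -e "$video" ] || continue              # skip if no videos match the glob
    name=$(basename "$video" .mp4)           # e.g. assets/sora_e2.mp4 -> sora_e2
    python run_infer.py \
        --model_base=jhshao/ChronoDepth \
        --data_dir="$video" \
        --output_dir="./outputs/$name" \
        --num_frames=10 \
        --denoise_steps=10 \
        --window_size=9 \
        --half_precision \
        --seed=1234
done
```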
Inference settings (descriptions inferred from the flag names and the example above; see run_infer.py for exact behavior):

--model_base: Hugging Face model ID or local path of the ChronoDepth checkpoint.
--data_dir: path to the input video.
--output_dir: directory where the predicted depth is saved.
--num_frames: number of frames processed per inference clip.
--denoise_steps: number of diffusion denoising steps.
--window_size: temporal sliding-window size; a value smaller than --num_frames overlaps consecutive clips.
--half_precision: run inference in fp16 to reduce GPU memory usage.
--seed: random seed for reproducibility.

πŸŽ“ Citation

Please cite our paper if you find this repository useful:

@misc{shao2024learning,
      title={Learning Temporally Consistent Video Depth from Video Diffusion Priors}, 
      author={Jiahao Shao and Yuanbo Yang and Hongyu Zhou and Youmin Zhang and Yujun Shen and Matteo Poggi and Yiyi Liao},
      year={2024},
      eprint={2406.01493},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}