This repository represents the official implementation of the paper titled "Learning Temporally Consistent Video Depth from Video Diffusion Priors".
Jiahao Shao*, Yuanbo Yang*, Hongyu Zhou, Youmin Zhang, Yujun Shen, Matteo Poggi, Yiyi Liao†
2024-06-11: Added an online demo; try it out with your videos for free!
2024-06-11: Added paper and inference code (this repository).
We test our code under the following environment: Ubuntu 20.04, Python 3.10.14, CUDA 11.3, RTX A6000.
git clone https://github.com/jhaoshao/ChronoDepth
cd ChronoDepth
conda create -n chronodepth python=3.10
conda activate chronodepth
pip install -r requirements.txt
Run the Python script run_infer.py as follows:
python run_infer.py \
--model_base=jhshao/ChronoDepth \
--data_dir=assets/sora_e2.mp4 \
--output_dir=./outputs \
--num_frames=10 \
--denoise_steps=10 \
--window_size=9 \
--half_precision \
--seed=1234
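
The command above processes the video as overlapping clips of --num_frames frames, advanced by --window_size frames at a time. A minimal sketch of how such clip start indices could be chosen (clip_starts is a hypothetical helper for illustration, not the repository's actual implementation):

```python
def clip_starts(total_frames, num_frames, window_size):
    """Start indices of overlapping clips covering a video.

    Hypothetical illustration of sliding-window inference, not the
    repository's code: each clip is num_frames long and the window
    advances by window_size frames per step.
    """
    starts = list(range(0, max(total_frames - num_frames, 0) + 1, window_size))
    # Append one final clip so the trailing frames are always covered.
    if starts[-1] + num_frames < total_frames:
        starts.append(total_frames - num_frames)
    return starts

# e.g. a 20-frame video with num_frames=10, window_size=9
print(clip_starts(20, 10, 9))  # → [0, 9, 10]
```

With window_size equal to num_frames, consecutive clips do not overlap, so each clip is denoised independently.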
Inference settings:

--num_frames: sets the number of frames in each video clip.
--denoise_steps: sets the number of steps for the denoising process.
--window_size: sets the size of the sliding window. When the window size equals the number of frames, each clip is inferred separately.
--half_precision: runs inference in half precision (16-bit float), which may yield suboptimal results but speeds up inference.

Please cite our paper if you find this repository useful:
@misc{shao2024learning,
title={Learning Temporally Consistent Video Depth from Video Diffusion Priors},
author={Jiahao Shao and Yuanbo Yang and Hongyu Zhou and Youmin Zhang and Yujun Shen and Matteo Poggi and Yiyi Liao},
year={2024},
eprint={2406.01493},
archivePrefix={arXiv},
primaryClass={cs.CV}
}