mulns / PerVFI

Official code base of "Perception-Oriented Video Frame Interpolation via Asymmetric Blending" (CVPR 2024), also denoted as "PerVFI".
https://openaccess.thecvf.com/content/CVPR2024/papers/Wu_Perception-Oriented_Video_Frame_Interpolation_via_Asymmetric_Blending_CVPR_2024_paper.pdf
Apache License 2.0
normalizing-flow video video-frame-interpolation

Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Guangyang Wu, Xin Tao, Changlin Li, Wenyi Wang, Xiaohong Liu, Qingqing Zheng

In CVPR 2024

This repository is the official implementation of the paper "Perception-Oriented Video Frame Interpolation via Asymmetric Blending", also denoted as "PerVFI".


We present PerVFI, a novel paradigm for perception-oriented video frame interpolation.


📢 News

2024-6-13: Paper accepted! Released the inference code (this repository).

2024-6-1: Added arXiv version.

∞ TODO

🚀 Usage

We offer several ways to interact with PerVFI:

  1. Run the demo locally (requires a GPU and Anaconda, see Installation Guide). Local development instructions with this codebase are given below.
  2. Extended demo on Google Colab (coming soon).
  3. Online interactive demo (coming soon).

🛠️ Setup

The inference code was tested on:

🪧 A Note for Windows users

We recommend running the code in WSL2:

  1. Install WSL following the installation guide.
  2. Install CUDA support for WSL following the installation guide.
  3. Find your drives under /mnt/<drive letter>/; check the WSL FAQ for more details. Navigate to the working directory of your choice.
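To illustrate the /mnt/<drive letter>/ mapping from step 3, here is a minimal Python sketch that converts a Windows drive-letter path into the location where WSL would typically mount it. The helper name `windows_to_wsl_path` is hypothetical and not part of this repository; it assumes the default WSL mount layout described above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(win_path: str) -> str:
    """Map a Windows path such as C:\\data\\clip to /mnt/c/data/clip.

    Hypothetical helper: assumes drives are mounted under /mnt/<drive letter>/.
    """
    p = PureWindowsPath(win_path)
    drive = p.drive  # e.g. "C:" for a drive-letter path
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a drive-letter path, got {win_path!r}")
    # p.parts[0] is the anchor ("C:\\"); the remaining parts are path components.
    return str(PurePosixPath("/mnt", drive[0].lower(), *p.parts[1:]))
```

For example, `windows_to_wsl_path(r"C:\Users\me\videos")` yields a path you could `cd` into from the WSL shell. The built-in `wslpath` command performs a similar conversion inside WSL.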

📦 Repository

Clone the repository (requires git):

git clone https://github.com/mulns/PerVFI.git
cd PerVFI

💻 Dependencies

We provide several ways to install the dependencies.

  1. Using Conda.

    Windows users: Install the Linux version of Conda inside WSL.

    After the installation, create the environment and install dependencies into it:

    conda env create -f environment.yaml
    conda activate pervfi
  2. Using pip: Alternatively, create a native Python virtual environment and install the dependencies into it:

    python -m venv venv/pervfi
    source venv/pervfi/bin/activate
    pip install -r requirements.txt

Keep the environment activated before running the inference script. Activate the environment again after restarting the terminal session.
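As a quick way to confirm that an environment is active before running inference, the following Python sketch checks the two standard signals: the `CONDA_DEFAULT_ENV` variable that `conda activate` exports, and the difference between `sys.prefix` and `sys.base_prefix` inside a venv. The function name `active_environment` and its optional parameters are hypothetical and exist only to make the check easy to test.

```python
import os
import sys
from typing import Mapping, Optional


def active_environment(environ: Optional[Mapping[str, str]] = None,
                       prefix: Optional[str] = None,
                       base_prefix: Optional[str] = None) -> Optional[str]:
    """Return a short label for the active environment, or None if none is detected."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    # `conda activate <name>` exports CONDA_DEFAULT_ENV=<name>.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda:{conda_env}"
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the interpreter it was created from.
    if prefix != base_prefix:
        return f"venv:{prefix}"
    return None
```

Calling `active_environment()` with no arguments inspects the current interpreter. A result of `None` suggests the environment needs to be activated again; a result of `conda:base` suggests the `pervfi` environment has not been activated yet.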

🏃 Testing on your video

📷 Prepare video sequences

Place your video frames (as images) in a directory, for example under input/in-the-wild_example, then run the inference command given below.
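Before running inference, it can help to check that the directory really contains an ordered frame sequence. The sketch below lists image files in natural order (so that frame_2 sorts before frame_10) and requires at least two frames, since interpolation needs a pair to work with. The helper names and the accepted extensions are assumptions for illustration; the order in which the inference script itself reads frames is not documented here.

```python
import re
from pathlib import Path
from typing import List

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def natural_key(name: str) -> list:
    """Split digit runs from text so that numeric parts compare as integers."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def list_frames(sequence_dir: str) -> List[Path]:
    """Return the image files in sequence_dir sorted in natural order."""
    frames = [p for p in Path(sequence_dir).iterdir()
              if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES]
    frames.sort(key=lambda p: natural_key(p.name))
    if len(frames) < 2:
        raise ValueError(f"need at least two frames in {sequence_dir!r}, found {len(frames)}")
    return frames
```

For instance, `list_frames("input/in-the-wild_example")` would return the frames of the example sequence, or raise if the folder holds fewer than two images.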

⬇ Download Checkpoints

Download the pre-trained models and place them in the checkpoints folder. This includes checkpoints for various optical flow estimators. You can choose one for simple use, or download all of them for comparison.

🚀 Run inference

The default checkpoint is trained only on the Vimeo90K dataset.

 python infer_video.py -m [OFE]+pervfi -data input -fps [OUT_FPS]

NOTE: OFE is a placeholder for the optical flow estimator name. This repo supports RAFT, GMA, and GMFlow; support for your own preferred flow estimator is a planned feature. OUT_FPS is a placeholder for the frame rate of the output video (defaults to 10); the output may also be saved as images.

The Vb checkpoint replaces the normalizing-flow generator with a multi-scale decoder, which speeds up inference at some cost in perceptual quality:

 python infer_video.py -m [OFE]+pervfi-vb -data input -fps [OUT_FPS]

You can find all results in the output directory. Enjoy!
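If you want to script runs across several flow estimators, the two commands above can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_infer_command` and `as_shell` are hypothetical helpers, and the lowercase spelling of the estimator names passed to `-m` is an assumption that should be checked against `infer_video.py`.

```python
import shlex
from typing import List

# Estimators named in the README; the exact spelling the script expects is assumed.
SUPPORTED_OFE = ("raft", "gma", "gmflow")


def build_infer_command(ofe: str, data_dir: str = "input",
                        out_fps: int = 10, fast: bool = False) -> List[str]:
    """Assemble the argument list for infer_video.py without running it."""
    if ofe.lower() not in SUPPORTED_OFE:
        raise ValueError(f"unsupported flow estimator {ofe!r}; expected one of {SUPPORTED_OFE}")
    if out_fps <= 0:
        raise ValueError("out_fps must be positive")
    # "pervfi-vb" selects the faster Vb checkpoint, "pervfi" the default one.
    variant = "pervfi-vb" if fast else "pervfi"
    return ["python", "infer_video.py", "-m", f"{ofe}+{variant}",
            "-data", data_dir, "-fps", str(out_fps)]


def as_shell(cmd: List[str]) -> str:
    """Render the argument list as a single shell-safe command line."""
    return shlex.join(cmd)
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the repository root would launch the run; looping over `SUPPORTED_OFE` gives one run per estimator for side-by-side comparison.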

🦿 Evaluation on test datasets

Evaluation will be included in VFI-Benchmark (currently under development).

🏋️ Training

Coming soon.

✏️ Contributing

Please refer to the contributing instructions.

🎓 Citation

Please cite our paper:

@InProceedings{Wu_2024_CVPR,
    author    = {Wu, Guangyang and Tao, Xin and Li, Changlin and Wang, Wenyi and Liu, Xiaohong and Zheng, Qingqing},
    title     = {Perception-Oriented Video Frame Interpolation via Asymmetric Blending},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {2753-2762}
}

🎫 License

This work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).

By downloading and using the code and model you agree to the terms in the LICENSE.
