[Paper]
This is the official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution" (arXiv). This repository contains the code, a Colab notebook, and video demos of our work.
Authors: Kelvin C.K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Nanyang Technological University
Acknowledgement: Our work is built upon MMEditing. The code will also appear in MMEditing soon. Please follow and star this repository and MMEditing!
Feel free to ask questions. I am currently working on other projects but will try my best to reply. If you are also interested in BasicVSR++, which was also accepted to CVPR 2022, please don't hesitate to star it!
The videos have been compressed, so the results shown here are inferior to the actual outputs.
https://user-images.githubusercontent.com/7676947/143370499-9fe4069b-46cc-4f12-b6ff-5595e8e5e0b8.mp4
https://user-images.githubusercontent.com/7676947/143370350-91f751f3-0f33-4ee4-9b1a-b9279bf41c18.mp4
https://user-images.githubusercontent.com/7676947/143370556-9e7019d4-e718-46af-859f-54d5576cd370.mp4
https://user-images.githubusercontent.com/7676947/143370859-e0293b97-f962-476f-acf8-14fad27cea77.mp4
Install PyTorch and torchvision following the official instructions, e.g.,
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
Install mim and mmcv-full
pip install openmim
mim install mmcv-full
Install mmedit
pip install mmedit
Download the pre-trained weights to checkpoints/. (Dropbox / Google Drive / OneDrive)
Run the following command:
python inference_realbasicvsr.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${INPUT_DIR} ${OUTPUT_DIR} --max-seq-len=${MAX_SEQ_LEN} --is_save_as_png=${IS_SAVE_AS_PNG} --fps=${FPS}
This script supports both images and videos as inputs and outputs. To use videos, simply set ${INPUT_DIR} and ${OUTPUT_DIR} to the paths of the video files. Note that saving to video may introduce additional compression, which reduces output quality.
For example:
Images as inputs and outputs
python inference_realbasicvsr.py configs/realbasicvsr_x4.py checkpoints/RealBasicVSR_x4.pth data/demo_000 results/demo_000
Video as input and output
python inference_realbasicvsr.py configs/realbasicvsr_x4.py checkpoints/RealBasicVSR_x4.pth data/demo_001.mp4 results/demo_001.mp4 --fps=12.5
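The example invocations above differ only in their paths and optional flags. As a minimal sketch, a small Python helper could assemble the same argument list for scripted use. The function `build_inference_command` is hypothetical and not part of this repository; it only uses the positional arguments and the `--max-seq-len` and `--fps` flags shown above.

```python
import shlex


def build_inference_command(config, checkpoint, input_path, output_path,
                            max_seq_len=None, fps=None):
    """Assemble the argument list for inference_realbasicvsr.py.

    Hypothetical helper for illustration; optional flags are appended
    only when a value is given.
    """
    cmd = ["python", "inference_realbasicvsr.py",
           config, checkpoint, input_path, output_path]
    if max_seq_len is not None:
        cmd.append(f"--max-seq-len={max_seq_len}")
    if fps is not None:
        cmd.append(f"--fps={fps}")
    return cmd


cmd = build_inference_command(
    "configs/realbasicvsr_x4.py",
    "checkpoints/RealBasicVSR_x4.pth",
    "data/demo_001.mp4",
    "results/demo_001.mp4",
    fps=12.5,
)
print(shlex.join(cmd))
# To actually run it from the repository root:
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues when paths contain spaces.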
We crop the REDS dataset into sub-images for faster I/O. Please follow the instructions below:
Put the original REDS dataset in ./data
Run the following command:
python crop_sub_images.py --data-root ./data/REDS --scales 4
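To illustrate what cropping into sub-images means, the sketch below tiles a frame into overlapping square patches with a sliding window. It is not the code of `crop_sub_images.py`; the function names and the `crop_size` and `step` values are assumptions chosen for illustration.

```python
import numpy as np


def crop_origins(length, crop_size, step):
    """Top-left coordinates of crops along one axis, covering the full length."""
    if length <= crop_size:
        return [0]
    origins = list(range(0, length - crop_size + 1, step))
    # Add a final crop flush with the border if the stride left a gap.
    if origins[-1] != length - crop_size:
        origins.append(length - crop_size)
    return origins


def crop_sub_images(image, crop_size=480, step=240):
    """Split an H x W x C image into overlapping crop_size x crop_size patches.

    crop_size and step are assumed values, not taken from the repository.
    """
    h, w = image.shape[:2]
    return [image[y:y + crop_size, x:x + crop_size]
            for y in crop_origins(h, crop_size, step)
            for x in crop_origins(w, crop_size, step)]


# Dummy 720 x 1280 frame standing in for a REDS image.
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
patches = crop_sub_images(frame)
print(len(patches), patches[0].shape)
```

Smaller patches mean each training sample reads far fewer bytes from disk than a full frame, which is where the I/O speedup comes from.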
The training is divided into two stages:
Train a model without perceptual loss and adversarial loss using realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py.
mim train mmedit configs/realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py --gpus 8 --launcher pytorch
Finetune the model with perceptual loss and adversarial loss using realbasicvsr_c64b20_1x30x8_lr5e-5_150k_reds.py. (You may want to replace load_from in the configuration file with your checkpoint pre-trained in the first stage.)
mim train mmedit configs/realbasicvsr_c64b20_1x30x8_lr5e-5_150k_reds.py --gpus 8 --launcher pytorch
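As a sketch of the `load_from` change mentioned for the second stage, the corresponding line in the second-stage configuration file might look like the following. The path is a placeholder, not a real file; point it at wherever your first-stage checkpoint was saved.

```python
# Excerpt of realbasicvsr_c64b20_1x30x8_lr5e-5_150k_reds.py (sketch).
# Placeholder path: replace it with your own first-stage checkpoint.
load_from = "path/to/your_first_stage_checkpoint.pth"
```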
Note: We use UDM10 with bicubic downsampling for validation. You can download it from here.
Assuming you have created two sets of images (e.g. input vs output), you can use generate_video_demo.py to generate a video demo. Note that the two sets of images must be of the same resolution. An example has been provided in the code.
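The same-resolution requirement can be illustrated with a short sketch that composes one comparison frame from two images, showing one on each side of a vertical split. This is not the logic of `generate_video_demo.py`; the function `compose_split_frame` and its `split_ratio` parameter are hypothetical.

```python
import numpy as np


def compose_split_frame(left, right, split_ratio=0.5):
    """Show `left` to the left of a vertical split line and `right` after it.

    Hypothetical helper: both frames must share the same shape, which is
    why the two image sets need identical resolution.
    """
    if left.shape != right.shape:
        raise ValueError(
            f"Resolution mismatch: {left.shape} vs {right.shape}")
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must lie in [0, 1]")
    split = int(round(left.shape[1] * split_ratio))
    frame = right.copy()
    frame[:, :split] = left[:, :split]
    return frame


# Dummy frames standing in for an input image and a restored output.
input_frame = np.zeros((4, 8, 3), dtype=np.uint8)
output_frame = np.full((4, 8, 3), 255, dtype=np.uint8)
demo_frame = compose_split_frame(input_frame, output_frame)
print(demo_frame.shape)
```

If the input set is at lower resolution than the output set, it would need to be upsampled to the output size before composing frames this way.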
You can download the dataset using Dropbox / Google Drive / OneDrive.
@inproceedings{chan2022investigating,
author = {Chan, Kelvin C.K. and Zhou, Shangchen and Xu, Xiangyu and Loy, Chen Change},
title = {Investigating Tradeoffs in Real-World Video Super-Resolution},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2022}
}