This is the official PyTorch implementation of the paper Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution.
Compressed video super-resolution (VSR) aims to restore high-resolution frames from their compressed low-resolution counterparts. Most recent VSR approaches enhance an input frame by "borrowing" relevant textures from neighboring video frames. Although some progress has been made, it remains highly challenging to extract and transfer high-quality textures from compressed videos, where most frames are usually heavily degraded. We propose a novel Frequency-Transformer for compressed Video Super-Resolution (FTVSR) that conducts self-attention over a joint space-time-frequency domain. FTVSR significantly outperforms previous methods and achieves new state-of-the-art results.
- We propose transferring video frames into the frequency domain and design a novel frequency attention mechanism (a toy sketch is given below).
- We study different self-attention schemes among the space, time, and frequency dimensions.
- We propose a novel Frequency-Transformer for compressed Video Super-Resolution (FTVSR) that conducts self-attention over a joint space-time-frequency domain.
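To make the idea concrete, here is a minimal, illustrative PyTorch sketch of self-attention over DCT frequency bands. It is not the released FTVSR code: `dct_matrix` and `FrequencyAttention` are hypothetical names, and the real model also attends over space and time and operates on learned features rather than raw pixel patches.

```python
import math
import torch
import torch.nn as nn


def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II basis; row k holds the k-th cosine frequency."""
    k = torch.arange(n, dtype=torch.float32).unsqueeze(1)
    i = torch.arange(n, dtype=torch.float32).unsqueeze(0)
    basis = math.sqrt(2.0 / n) * torch.cos(math.pi * (i + 0.5) * k / n)
    basis[0] /= math.sqrt(2.0)
    return basis


class FrequencyAttention(nn.Module):
    """Toy self-attention across the DCT bands of each image patch."""

    def __init__(self, channels: int = 64, patch: int = 8, heads: int = 4):
        super().__init__()
        self.patch = patch
        self.register_buffer("basis", dct_matrix(patch))
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        p = self.patch  # h and w must be divisible by p
        # Split into non-overlapping p x p patches: (b, c, h/p, w/p, p, p).
        patches = x.unfold(2, p, p).unfold(3, p, p)
        # 2D DCT per patch, F = B P B^T: one coefficient per frequency band.
        freq = torch.einsum("ku,bcnmuv,lv->bcnmkl", self.basis, patches, self.basis)
        # One token per frequency band, with channels as the token features.
        tokens = freq.flatten(4).permute(0, 2, 3, 4, 1).reshape(-1, p * p, c)
        out, _ = self.attn(tokens, tokens, tokens)  # attend across frequencies
        out = out.reshape(b, h // p, w // p, p, p, c).permute(0, 5, 1, 2, 3, 4)
        # Inverse 2D DCT (orthonormal basis, so transposing inverts it).
        pix = torch.einsum("ku,bcnmkl,lv->bcnmuv", self.basis, out, self.basis)
        # Stitch the patches back into a (b, c, h, w) image.
        return pix.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)


layer = FrequencyAttention(channels=64, patch=8, heads=4)
y = layer(torch.randn(2, 64, 32, 32))  # -> (2, 64, 32, 32)
```

Each p×p patch is mapped to p² DCT coefficients, and attention runs across those frequency bands, so low- and high-frequency content can be weighted differently, which is what makes frequency attention attractive for compression-degraded frames.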
Some visual results on videos with different compression rates (No compression, CRF 15, 25, 35).
Pre-trained models can be downloaded from Baidu Cloud (access code: i42r) or Google Drive.
Training set

```
├────REDS
	├────train
		├────train_sharp
			├────000
			├────...
			├────269
		├────train_sharp_bicubic
			├────X4
				├────000
				├────...
				├────269
```

The `sep_trainlist.txt` file lists the training samples in the Vimeo-90K download zip file.
```
├────vimeo_septuplet
	├────sequences
		├────00001
		├────...
		├────00096
	├────sequences_BD
		├────00001
		├────...
		├────00096
	├────sep_trainlist.txt
	├────sep_testlist.txt
```
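Below is a small, hypothetical helper to sanity-check this layout before training; it assumes the datasets were extracted under `./data` (adjust `root` as needed):

```python
from pathlib import Path

# Hypothetical helper: verify the expected training-data layout shown above.
def check_layout(root: str = "./data") -> None:
    expected = [
        "REDS/train/train_sharp",
        "REDS/train/train_sharp_bicubic/X4",
        "vimeo_septuplet/sequences",
        "vimeo_septuplet/sequences_BD",
        "vimeo_septuplet/sep_trainlist.txt",
        "vimeo_septuplet/sep_testlist.txt",
    ]
    for rel in expected:
        path = Path(root) / rel
        print(f"{'ok     ' if path.exists() else 'MISSING'} {path}")

check_layout()
```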
Testing set
```shell
git clone https://github.com/researchmm/FTVSR.git
cd FTVSR
```

Put the downloaded pre-trained models into the `./checkpoint` directory. The testing configurations are `configs/FTVSR_reds4.py` and `configs/FTVSR_vimeo90k.py`.

```shell
# REDS model
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./tools/dist_test.sh configs/FTVSR_reds4.py checkpoint/FTVSR_REDS.pth 8 [--save-path 'save_path']
# Vimeo model
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./tools/dist_test.sh configs/FTVSR_vimeo90k.py checkpoint/FTVSR_Vimeo90K.pth 8 [--save-path 'save_path']
```

The restored frames are written to `save_path` when the optional `--save-path` argument is given.
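For a quick look at a single clip without the distributed launcher, something like the following should work, assuming an mmediting (0.x) environment in which FTVSR's model is registered the same way as BasicVSR's; the clip directory and filename template below are placeholders:

```python
from mmedit.apis import init_model, restoration_video_inference

# Build the model from its config and load the pre-trained weights.
model = init_model('configs/FTVSR_reds4.py',
                   'checkpoint/FTVSR_REDS.pth',
                   device='cuda:0')

# window_size=0 feeds the whole clip as one recurrent sequence.
output = restoration_video_inference(model,
                                     'data/demo_clip',  # folder of LR frames (placeholder)
                                     window_size=0,
                                     start_idx=0,
                                     filename_tmpl='{:08d}.png')
print(output.shape)  # (1, t, c, 4h, 4w) for a 4x upscaling model
```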
```shell
git clone https://github.com/researchmm/FTVSR.git
cd FTVSR
```

The training configurations are `configs/FTVSR_reds4.py` and `configs/FTVSR_vimeo90k.py`.

```shell
# REDS
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./tools/dist_train.sh configs/FTVSR_reds4.py 8
# Vimeo
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./tools/dist_train.sh configs/FTVSR_vimeo90k.py 8
```
We also sincerely recommend some other excellent works related to ours. :sparkles:
If you find the code and pre-trained models useful for your research, please consider citing our paper. :blush:
```bibtex
@ARTICLE{10239462,
  author={Qiu, Zhongwei and Yang, Huan and Fu, Jianlong and Liu, Daochang and Xu, Chang and Fu, Dongmei},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Learning Degradation-Robust Spatiotemporal Frequency-Transformer for Video Super-Resolution},
  year={2023},
  volume={45},
  number={12},
  pages={14888-14904},
  doi={10.1109/TPAMI.2023.3312166}}

@InProceedings{qiu2022learning,
  author = {Qiu, Zhongwei and Yang, Huan and Fu, Jianlong and Fu, Dongmei},
  title = {Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution},
  booktitle = {ECCV},
  year = {2022},
}
```
This code is built on mmediting. We thank the authors of BasicVSR for sharing their code.