
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors (ECCV2024)


[arXiv](https://arxiv.org/abs/2407.09919)

This repository is the official PyTorch implementation of ST-AVSR: Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors.

Introduction


Our method performs video SR at arbitrary scales. For global SR, simply specify the target output size, which corresponds to hr_coord in the code; the SR scale itself is encoded by cell. For local super-resolution, crop hr_coord to the region of interest. A sketch of these two quantities is given below.
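
Below is a minimal sketch, assuming LIIF-style conventions, of how a target output size can be turned into hr_coord and cell. The function name and exact tensor shapes here are illustrative and not necessarily identical to this repository's code:

```python
import torch

def make_coord_cell(target_h, target_w):
    # hr_coord: normalized (y, x) centers of every target pixel in [-1, 1]
    ys = (torch.arange(target_h) + 0.5) / target_h * 2 - 1
    xs = (torch.arange(target_w) + 0.5) / target_w * 2 - 1
    hr_coord = torch.stack(torch.meshgrid(ys, xs, indexing='ij'), dim=-1).view(-1, 2)
    # cell: the extent of one target pixel in the same normalized space;
    # this is what encodes the SR scale
    cell = torch.ones_like(hr_coord)
    cell[:, 0] *= 2 / target_h
    cell[:, 1] *= 2 / target_w
    return hr_coord, cell

# Global SR: use the full grid. Local SR: keep only the rows of hr_coord
# (and cell) that fall inside the region of interest.
```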

Demo Examples

https://github.com/shangwei5/ST-AVSR/assets/43960503/3a8dd3c0-21fd-499c-8ccb-4362c6c5dcb0

https://github.com/shangwei5/ST-AVSR/assets/43960503/42babacd-1b23-480b-9984-c205c62f2b6d

https://github.com/shangwei5/ST-AVSR/assets/43960503/d18fd854-fee3-41f7-9c0a-de416ee49c8b

Prerequisites

Datasets

Please download the REDS dataset (Type: Sharp) and the Vid4 dataset.

Dataset Organization Form

|--REDS
    |--train
        |--train_sharp  
            |--video 1
                |--frame 1
                |--frame 2
                    :
            |--video 2
                :
            |--video n
    |--val
        |--val_sharp
            |--video 1
                |--frame 1
                |--frame 2
                    :
            |--video 2
                :
            |--video n
|--Vid4
    |--video 1
        |--frame 1
        |--frame 2
            :
    |--video 2
        :
    |--video n

Download Pre-trained Model

Please download the pre-trained models from BaiduDisk (password: 47q3) or GoogleDrive and place them in ./. Our results on REDS and Vid4 can also be downloaded from BaiduDisk (password: rkf7) and BaiduDisk (password: 6gv9), respectively.

Getting Started

1) Testing

  1. Processing the entire video sequence at once:

    bash test_sequence.sh

    Please change --data_path to your own data path.

  2. Processing frame by frame:

    bash test.sh

    Please change --data_path to your own data path.

  3. Processing other datasets with no GT:

    python test_seq_yours.py --data_path /your/data/path/ --model_path /your/model/path/ --result_path /your/result/path/ --space_scale "4,4" --max_patch 256

    space_scale currently supports only integer scales; non-integer scales may cause issues. max_patch is the size of the cropped patch; reduce it if GPU memory is insufficient (see the patch-wise sketch after this list).
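
For reference, here is a minimal sketch of patch-wise inference under a max_patch budget. This is our own illustration, not the repository's implementation; model is assumed to upscale a single low-resolution crop by the given scale:

```python
import torch

def sr_by_patches(model, lr, scale=(4, 4), max_patch=256):
    """lr: (1, C, h, w) low-resolution frame; max_patch: HR-side crop size."""
    _, c, h, w = lr.shape
    out = torch.zeros(1, c, h * scale[0], w * scale[1], device=lr.device, dtype=lr.dtype)
    step_h, step_w = max_patch // scale[0], max_patch // scale[1]  # LR pixels per crop
    for top in range(0, h, step_h):
        for left in range(0, w, step_w):
            lr_crop = lr[:, :, top:top + step_h, left:left + step_w]
            with torch.no_grad():
                sr_crop = model(lr_crop)          # assumed to return the upscaled crop
            t, l = top * scale[0], left * scale[1]
            out[:, :, t:t + sr_crop.shape[2], l:l + sr_crop.shape[3]] = sr_crop
    return out
```

In practice, crops are often overlapped and blended at the seams to avoid visible boundary artifacts.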

2) Training

  1. Training ST-AVSR from scratch.

We use an NVIDIA RTX A6000 (48GB) for training. Please adjust the batch_size and test{'n_seq'} in options based on your GPU memory.

python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 train.py --opt options/train_refsrrnn_cuf_siren_adists_only_future_t2.json --dist True

Please change gpu_ids, path{'root', 'images'}, and data_root in the options to your own settings.

  2. Train B-AVSR first, then fine-tune ST-AVSR based on B-AVSR.

We use an NVIDIA RTX A6000 (48GB) for training. Please adjust the batch_size and test{'n_seq'} in options based on your GPU memory.

python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 train.py --opt options/train_refsrrnn_cuf_siren.json --dist True

Please change gpu_ids, path{'root', 'images'}, and data_root in the options to your own settings. Then fine-tune ST-AVSR based on B-AVSR:

python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 train.py --opt options/train_refsrrnn_cuf_siren_adists_only_future_t2.json --dist True

Please change gpu_ids, path{'root', 'images', 'pretrained_netG'}, and data_root in the options to your own settings.

Note: If an 'out of memory' error occurs during validation, please reduce the sequence length test{'n_seq'}, since validation processes the entire video sequence at once. A chunked workaround is sketched below.
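
If reducing test{'n_seq'} alone is not enough, a chunked loop along the lines of the following sketch keeps peak memory proportional to the chunk length. This is our own workaround, not the repository's validation code; the model is assumed to take a (1, T, C, H, W) clip:

```python
import torch

def validate_in_chunks(model, frames, n_seq=10):
    """frames: (T, C, H, W) low-resolution clip; returns the super-resolved clip."""
    outputs = []
    for start in range(0, frames.shape[0], n_seq):
        chunk = frames[start:start + n_seq].unsqueeze(0)   # (1, n_seq, C, H, W)
        with torch.no_grad():
            outputs.append(model(chunk).squeeze(0))
        torch.cuda.empty_cache()                           # release cached activations
    return torch.cat(outputs, dim=0)
```

Note that chunking resets any recurrent state at chunk boundaries, so results near the cuts may differ slightly from full-sequence processing.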

Cite

If you use any part of our code, or if ST-AVSR is useful for your research, please consider citing:

@article{shang2024arbitrary,
  title={Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors},
  author={Shang, Wei and Ren, Dongwei and Zhang, Wanying and Fang, Yuming and Zuo, Wangmeng and Ma, Kede},
  journal={arXiv preprint arXiv:2407.09919},
  year={2024}
}

Contact

If you have any questions, please contact csweishang@gmail.com.