ming053l / DRCT

Accepted by New Trends in Image Restoration and Enhancement workshop (NTIRE), in conjunction with CVPR 2024.
MIT License

DRCT: Saving Image Super-resolution away from Information Bottleneck

✨✨ [CVPR NTIRE Oral Presentation]



[Paper Link] [Project Page] [Poster] [Model zoo] [Visual Results] [Slide] [Video]

Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou

Advanced Computer Vision LAB, National Cheng Kung University

Overview

In CNN-based super-resolution (SR) methods, dense connections are widely considered an effective way to preserve information and improve performance (as introduced by RDN, RRDB in ESRGAN, etc.).

However, SwinIR-based methods such as HAT, CAT, and DAT generally rely on channel attention blocks or on novel, sophisticated shifted-window attention mechanisms to improve SR performance. These works overlook the information bottleneck: information is lost as it flows deeper into the network.

Our work simply adds dense connections to SwinIR to improve performance, and re-emphasizes the importance of dense connections in SwinIR-based SR methods. Adding dense connections within deep feature extraction stabilizes the information flow, thereby boosting performance while keeping the design lightweight compared with SOTA methods such as HAT.
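To make the idea of a dense connection concrete, here is a minimal toy sketch in plain Python. It is not the DRCT implementation: `make_toy_layer` is a hypothetical stand-in for a real layer (it just emits the mean of its input), and features are plain lists rather than tensors. The only point it illustrates is that each layer receives the concatenation of the original input and every earlier layer's output, so the input is never discarded.

```python
from typing import Callable, List

Layer = Callable[[List[float]], List[float]]


def make_toy_layer(growth: int) -> Layer:
    """Hypothetical stand-in for a real layer: emits `growth` new values,
    each equal to the mean of everything the layer receives."""
    def layer(features: List[float]) -> List[float]:
        mean = sum(features) / len(features)
        return [mean] * growth
    return layer


def dense_forward(x: List[float], layers: List[Layer]) -> List[float]:
    """Apply `layers` with dense connections.

    Every layer sees the concatenation of the input and all previous
    outputs, and its own output is appended to that running collection.
    """
    collected = list(x)
    for layer in layers:
        collected = collected + layer(collected)
    return collected


x = [1.0, 2.0, 3.0, 4.0]
out = dense_forward(x, [make_toy_layer(2) for _ in range(3)])
# The three layers see inputs of width 4, 6, and 8; the result has width 10
# and still begins with the untouched input values.
print(len(out), out[:4])
```

In a real network the growing width is usually projected back to a fixed channel count at the end of the block; that step is omitted here.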

Benchmark results on SRx4 without x2 pretraining. Multi-Adds are calculated for a 64x64 input.

| Model | Params | Multi-Adds | Forward | FLOPs | Set5 | Set14 | BSD100 | Urban100 | Manga109 | Training Log |
|---|---|---|---|---|---|---|---|---|---|---|
| HAT | 20.77M | 11.22G | 2053M | 42.18G | 33.04 | 29.23 | 28.00 | 27.97 | 32.48 | - |
| DRCT | 14.13M | 5.92G | 1857M | 7.92G | 33.11 | 29.35 | 28.18 | 28.06 | 32.59 | - |
| HAT-L | 40.84M | 76.69G | 5165M | 79.60G | 33.30 | 29.47 | 28.09 | 28.60 | 33.09 | - |
| DRCT-L | 27.58M | 9.20G | 4278M | 11.07G | 33.37 | 29.54 | 28.16 | 28.70 | 33.14 | - |
| DRCT-XL (pretrained on ImageNet) | - | - | - | - | 32.97 / 0.91 | 29.08 / 0.80 | - | - | - | log |
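The "lightweight" claim can be read directly off the HAT and DRCT rows of the benchmark table. The short sketch below only redoes that arithmetic; the numbers are copied from the table, while the dictionary keys and the `relative_reduction` helper are names chosen for this example.

```python
# Cost columns copied from the HAT and DRCT rows of the SRx4 benchmark table.
hat = {"params_M": 20.77, "multi_adds_G": 11.22, "forward_M": 2053.0, "flops_G": 42.18}
drct = {"params_M": 14.13, "multi_adds_G": 5.92, "forward_M": 1857.0, "flops_G": 7.92}


def relative_reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


for key in hat:
    print(f"{key}: DRCT is {relative_reduction(hat[key], drct[key]):.1f}% lower than HAT")
```

By this calculation DRCT uses roughly a third fewer parameters and about half the Multi-Adds of HAT, while the table reports equal or higher scores on all five test sets.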

Real DRCT GAN SRx4. (Coming Soon)

| Model | Training Data | Checkpoint | Log |
|---|---|---|---|
| Real-DRCT-GAN_MSE_Model | DF2K + OST300 | Checkpoint | Log |
| Real-DRCT-GAN_Finetuned from MSE | DF2K + OST300 | Checkpoint | Log |

Updates

[Training log on ImageNet] [Pretrained Weight (without fine-tuning on DF2K)]

Environment

python inference.py --input_dir [input_dir] --output_dir [output_dir] --model_path [model_path]

How To Test

A tile mode is also provided for testing with limited GPU memory. You can adjust its settings in your custom testing option file by referring to ./options/test/DRCT_tile_example.yml.
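For readers unfamiliar with tiled inference, the following is a rough sketch of the general idea, not the code used in this repository: the input is split into overlapping windows small enough to fit in GPU memory, each window is super-resolved on its own, and the results are pasted into the upscaled canvas. The names `tile`, `overlap`, `tile_windows`, and `scale_window` are illustrative assumptions and are not keys from the YAML file.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def tile_windows(height: int, width: int, tile: int, overlap: int) -> List[Window]:
    """Return overlapping windows that together cover a height x width image."""
    if overlap < 0 or tile <= overlap:
        raise ValueError("tile must be larger than a non-negative overlap")
    stride = tile - overlap

    def starts(size: int) -> List[int]:
        if size <= tile:
            return [0]  # a single window covers this axis
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # last window sits flush with the border
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


def scale_window(window: Window, scale: int) -> Window:
    """Map a low-resolution window to its location in the upscaled output."""
    top, left, bottom, right = window
    return (top * scale, left * scale, bottom * scale, right * scale)


windows = tile_windows(height=100, width=160, tile=64, overlap=16)
print(len(windows), windows[0], scale_window(windows[-1], 4))
```

Overlapping the windows gives each tile some surrounding context, which helps reduce visible seams where tiles meet; a larger overlap costs more computation.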

How To Train

The training logs and weights will be saved in the ./experiments folder.

Citations

If our work is helpful to your research, please kindly cite it. Thanks!

BibTeX

@misc{hsu2024drct,
  title={DRCT: Saving Image Super-resolution away from Information Bottleneck}, 
  author = {Hsu, Chih-Chung and Lee, Chia-Ming and Chou, Yi-Shiuan},
  year={2024},
  eprint={2404.00722},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@InProceedings{Hsu_2024_CVPR,
  author    = {Hsu, Chih-Chung and Lee, Chia-Ming and Chou, Yi-Shiuan},
  title     = {DRCT: Saving Image Super-Resolution Away from Information Bottleneck},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2024},
  pages     = {6133-6142}
}

Thanks

Part of our work was facilitated by the HAT, SwinIR, and LAM frameworks, and we are grateful for their outstanding contributions.

Part of our work was contributed by @zelenooki87. Thank you for your significant contributions and suggestions!

Contact

If you have any questions, please email zuw408421476@gmail.com to discuss them with the author.