
MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces (ICCV 2023)

Abstract:
Due to their highly structured characteristics, faces are easier to recover than natural scenes for blind image super-resolution. Therefore, we can extract the degradation representation of an image from the low-quality and recovered face pairs. Using the degradation representation, realistic low-quality images can then be synthesized to fine-tune the super-resolution model for the real-world low-quality image. However, such a procedure is time-consuming and laborious, and the gaps between recovered faces and the ground-truths further increase the optimization uncertainty. To facilitate efficient model adaptation towards image-specific degradations, we propose a method dubbed MetaF2N, which leverages the contained Faces to fine-tune model parameters for adapting to the whole Natural image in a Meta-learning framework. The degradation extraction and low-quality image synthesis steps are thus circumvented in our MetaF2N, and it requires only one fine-tuning step to get decent performance. Considering the gaps between the recovered faces and ground-truths, we further deploy a MaskNet for adaptively predicting loss weights at different positions to reduce the impact of low-confidence areas. To evaluate our proposed MetaF2N, we have collected a real-world low-quality dataset with one or multiple faces in each image, and our MetaF2N achieves superior performance on both synthetic and real-world datasets.
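To make the pipeline concrete, below is a minimal sketch of the test-time adaptation loop described above. It assumes PyTorch, and every name in it (sr_model, face_restorer, mask_net, crop_faces) is a hypothetical placeholder rather than this repository's actual API; in the paper, a pretrained face restoration model (GPEN) produces the pseudo ground-truth faces and RetinaFace detects the face regions.

import torch
import torch.nn.functional as F

def adapt_and_superresolve(lq_image, sr_model, face_restorer, mask_net,
                           crop_faces, fine_tune_num=1, lr=1e-4):
    """Fine-tune sr_model on the faces inside lq_image, then run it on the whole image."""
    optimizer = torch.optim.Adam(sr_model.parameters(), lr=lr)
    for _ in range(fine_tune_num):                     # one step already works well per the abstract
        for face_lq in crop_faces(lq_image):           # detected and cropped LQ faces
            face_hq = face_restorer(face_lq).detach()  # recovered face serves as pseudo ground-truth
            sr_face = sr_model(face_lq)
            weight = mask_net(face_hq)                 # down-weights low-confidence regions
            loss = (weight * F.l1_loss(sr_face, face_hq, reduction="none")).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    with torch.no_grad():
        return sr_model(lq_image)                      # image-specific adapted model on the full image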

Getting Started

Environment Setup

git clone https://github.com/yinzhicun/MetaF2N.git
cd MetaF2N
conda create -n metaf2n python=3.9
conda activate metaf2n
pip install -r requirements.txt
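If the installation succeeds, a quick sanity check (assuming requirements.txt pulls in TensorFlow, which the acknowledged MZSR and LPIPS-Tensorflow codebases suggest) is:

python -c "import tensorflow as tf; print(tf.__version__)"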

Pretrained Models

We provide the pretrained checkpoints in BaiduDisk and Google Drive. One can download them and save them to the directory ./pretrained_models.

In addition, the other models we need for training and testing (pretrained GPEN, Real-ESRGAN, and RetinaFace) are also provided in BaiduDisk and Google Drive. One can download them and save them to the directory ./weights.
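After both downloads, the working directory should look roughly like this (the exact checkpoint file names depend on what the links provide):

MetaF2N/
├── pretrained_models/   # MetaF2N checkpoints
│   └── ...
└── weights/             # GPEN, Real-ESRGAN and RetinaFace models
    └── ...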

Preparing Dataset

Testing

To test the method, you can run:

CUDA_VISIBLE_DEVICES=0 python test.py --input_dir input_dir --output_dir output_dir --face_dir face_dir --patch_size patch_size --patch_num_per_img patch_num_per_img --fine_tune_num fine_tune_num

The following parameters can be adjusted for flexible usage (a full example invocation is given after the list):

--input_dir # path to the test LQ images
--output_dir # path to save the results
--face_dir # path containing Face_LQ (the cropped LQ face areas) and Face_HQ (the restored HQ face areas)
--patch_size # patch size used when cropping the HQ face area
--patch_num_per_img # number of patches cropped from the HQ face area of each image
--fine_tune_num # number of inner-loop fine-tuning steps
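For example, with illustrative (not prescribed) paths and values:

CUDA_VISIBLE_DEVICES=0 python test.py --input_dir ./test_LQ --output_dir ./results --face_dir ./test_faces --patch_size 128 --patch_num_per_img 4 --fine_tune_num 1

Here fine_tune_num is kept at 1 since, as noted in the abstract, a single fine-tuning step already gives decent performance.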

Training

To train MetaF2N, you can adjust the parameters in config.py and run:

python main.py --trial trial --step step --gpu gpu_id
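For example (the argument values below are illustrative; consult config.py for what trial and step index):

python main.py --trial 1 --step 0 --gpu 0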

Calculate Metrics

To calculate metrics on the results, you can run:

python calculate_metrics.py --result_dir result_dir --gt_dir gt_dir --fid_ref_dir fid_ref_dir
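For example (illustrative paths; fid_ref_dir presumably holds the reference images used for computing FID):

python calculate_metrics.py --result_dir ./results --gt_dir ./gt --fid_ref_dir ./fid_ref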

Citation

@article{yin2023metaf2n,
  title={MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces},
  author={Yin, Zhicun and Liu, Ming and Li, Xiaoming and Yang, Hui and Xiao, Longan and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2309.08113},
  year={2023}
}

Acknowledgements

This code is built on MZSR, GPEN and LPIPS-Tensorflow. We thank the authors for sharing their code.

Statement for RF200 Dataset

The images of the RF200 dataset were collected from the Internet and existing datasets, and the individual images were published online by their respective authors. We do not hold the rights to these images, and some of them require giving appropriate credit to the original author, as well as indicating any changes made to the images. Moreover, the terms under which these images are shared may change in the future. Therefore, we only provide a txt file, RF200.txt, listing the source and link of every image.