AiArt-Gao / MATEBIT

[CVPR'23] Masked and Adaptive Transformer for Exemplar Based Image Translation (MATEBIT)
MIT License


Project | arXiv | CVPR

Abstract

We present a novel framework for exemplar-based image translation. Recent advanced methods for this task mainly focus on establishing cross-domain semantic correspondence, which subsequently dominates image generation through local style control. Unfortunately, cross-domain semantic matching is challenging, and matching errors ultimately degrade the quality of the generated images. To overcome this challenge, we improve the accuracy of matching on the one hand, and diminish the role of matching in image generation on the other. For the former, we propose a masked and adaptive transformer (MAT) that learns accurate cross-domain correspondence and performs context-aware feature augmentation. For the latter, we use source features of the input and global style codes of the exemplar as supplementary information for decoding an image. In addition, we devise a novel contrastive style learning method that acquires quality-discriminative style representations, which in turn benefit high-quality image generation.
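To illustrate the masked-correspondence idea described above, here is a minimal PyTorch sketch of cross-domain attention with a confidence mask that suppresses unreliable matches before warping exemplar features. The function name, threshold, and temperature are illustrative assumptions, not the repository's actual implementation:

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(query_feat, key_feat, value_feat,
                           tau=0.01, mask_thresh=0.1):
    """Sketch of masked attention for cross-domain correspondence.

    query_feat: (B, C, H, W) features from the input (source) domain.
    key_feat / value_feat: (B, C, H, W) features from the exemplar.
    Query positions whose best correlation falls below `mask_thresh`
    are masked out, so unreliable matches do not pollute the warped
    exemplar features used for style control.
    """
    B, C, H, W = query_feat.shape
    q = F.normalize(query_feat.flatten(2), dim=1)   # (B, C, HW), unit-norm channels
    k = F.normalize(key_feat.flatten(2), dim=1)     # (B, C, HW)
    v = value_feat.flatten(2)                       # (B, C, HW)

    corr = torch.bmm(q.transpose(1, 2), k)          # (B, HW, HW) cosine similarity
    attn = F.softmax(corr / tau, dim=-1)            # sharp attention over exemplar positions

    # Mask queries whose strongest match is weak, then renormalize rows.
    mask = (attn.max(dim=-1, keepdim=True).values > mask_thresh).float()
    attn = attn * mask
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-8)

    # Warp exemplar features to the source layout via the masked attention.
    warped = torch.bmm(v, attn.transpose(1, 2))     # (B, C, HW)
    return warped.view(B, C, H, W), mask.view(B, 1, H, W)
```

The returned mask can also be fed to the decoder so that, where matching is unreliable, generation falls back on source features and global style codes, as described in the abstract.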

Paper Information

Chang Jiang, Fei Gao*, Biao Ma, Yuhao Lin, Nannan Wang, Gang Xu, "Masked and Adaptive Transformer for Exemplar Based Image Translation," *Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition* (CVPR), 2023, pp. 22418-22427. Project | arXiv | CVPR

Sample Results

(Figures localFace1 and localFace2: sample face translation results.)

We offer more results here: Google Drive

Prerequisites

Getting Started

Preparation

Pretrained Models

Train/Test

1) CelebA (edge-to-face)

2) DeepFashion (pose-to-image)

3) Other datasets

Citation

If you use this code for your research, please cite our paper.

@inproceedings{jiang2023masked,
  title={Masked and Adaptive Transformer for Exemplar Based Image Translation},
  author={Jiang, Chang and Gao, Fei and Ma, Biao and Lin, Yuhao and Wang, Nannan and Xu, Gang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={22418--22427},
  year={2023}
}

Acknowledgments

This code borrows heavily from DynaST and MMTN. We also thank the authors of Synchronized Batch Normalization for their implementation.