XFeiF / ComputerVision_PaperNotes

📚 Paper Notes (Computer vision)

17CVPR| Image-to-Image Translation with Conditional Adversarial Networks #19

Closed by XFeiF 3 years ago

XFeiF commented 4 years ago

[paper] && [code]
Authors:
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
Berkeley AI Research (BAIR) Laboratory, UC Berkeley

Highlight

This paper treats image-to-image translation as a family of problems in which the input and output differ in surface appearance but are both renderings of the same underlying structure. The results suggest that conditional adversarial networks are a promising approach for many image-to-image translation tasks, especially those involving highly structured graphical outputs.
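
The paper's training objective combines the conditional GAN loss, in which the discriminator sees the input image alongside either the real output or the generated one, with a weighted L1 reconstruction term (the paper uses λ = 100):

```latex
% Conditional GAN loss: D is conditioned on the input x
\mathcal{L}_{cGAN}(G, D) =
    \mathbb{E}_{x,y}\left[\log D(x, y)\right]
  + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right]

% Full objective: adversarial term plus weighted L1 reconstruction
G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```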

They use a "U-Net"-based architecture for the generator, and for the discriminator a convolutional "PatchGAN" classifier, which only penalizes structure at the scale of patches: the discriminator tries to classify whether each N×N patch in an image is real or fake.
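
The N×N patch size is simply the receptive field of the discriminator's convolutional stack. A minimal sketch (pure Python; the layer shapes follow the paper's 70×70 PatchGAN, which uses 4×4 convolutions, three with stride 2 followed by two with stride 1) that derives it:

```python
def receptive_field(layers):
    """Receptive field of a stack of conv layers.

    Each layer is a (kernel_size, stride) pair; the standard recurrence is
    rf += (kernel_size - 1) * jump, then jump *= stride.
    """
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# 70x70 PatchGAN: three stride-2 4x4 convs, then two stride-1 4x4 convs
patchgan = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan))  # → 70
```

Each output "pixel" of this discriminator therefore judges a 70×70 region of the input, which is what "penalizes structure at the scale of patches" means in practice.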

The experiments cover a variety of tasks and datasets, including:

XFeiF commented 4 years ago

Some notes

L1 and L2 losses produce blurry results on image reconstruction problems. [Autoencoding beyond pixels using a learned similarity metric.]
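
One way to see the blur: when the target distribution is multimodal, the L2-optimal single prediction is the mean of the targets (an image "in between" the modes, i.e. blurry), while L1 is minimized at the median, which can coincide with an actual mode. A toy 1-D sketch (pure Python, not from the paper):

```python
# Four equally likely targets the model must match with ONE prediction
targets = [0.0, 0.0, 0.0, 1.0]
candidates = [i / 100 for i in range(101)]

# Candidate minimizing the expected squared error: the mean (0.25, "blurry")
best_l2 = min(candidates, key=lambda c: sum((t - c) ** 2 for t in targets))

# Candidate minimizing the expected absolute error: the median (0.0, a real mode)
best_l1 = min(candidates, key=lambda c: sum(abs(t - c) for t in targets))

print(best_l2, best_l1)  # → 0.25 0.0
```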

At inference time, they run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that they apply dropout at test time, and they apply batch normalization using the statistics of the test batch rather than aggregated statistics from training. With a batch size of 1, this test-batch normalization is what has been termed "instance normalization".
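
The batch-norm detail can be illustrated with a minimal normalization function (a numpy sketch, not the paper's code): "train-mode" batch norm normalizes with the current batch's own statistics, whereas standard inference would substitute the aggregated running statistics.

```python
import numpy as np

def batch_norm(x, running_mean, running_var, use_batch_stats, eps=1e-5):
    """Normalize x; use_batch_stats=True mimics pix2pix's test-time behavior."""
    if use_batch_stats:
        mean, var = x.mean(), x.var()          # statistics of this batch itself
    else:
        mean, var = running_mean, running_var  # aggregated training statistics
    return (x - mean) / np.sqrt(var + eps)

x = np.array([2.0, 4.0, 6.0, 8.0])
# pix2pix-style inference: normalize by the test batch's own mean/variance
y = batch_norm(x, running_mean=0.0, running_var=1.0, use_batch_stats=True)
print(y.mean())  # → 0.0 (output is centered by the batch's own statistics)
```

In PyTorch terms this corresponds to leaving dropout and batch-norm layers in training mode at test time instead of calling the usual `model.eval()`.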