phillipi / pix2pix

Image-to-image translation with conditional adversarial nets
https://phillipi.github.io/pix2pix/

Issue about PatchGAN #203

Open thomascong121 opened 3 years ago

thomascong121 commented 3 years ago

Hi:

I am wondering why it is sufficient to restrict attention to the structure in local image patches to model high-level features?

phillipi commented 3 years ago

This is an interesting question. One thing to recognize first is that the generator has receptive fields that cover the entire image, and that's what allows the generator to model high-level features. The PatchGAN discriminator can indeed get away with relatively small receptive fields, and ultimately that's related to natural image statistics, the level of stochasticity in the mapping you are learning, and the spatial density of information in the input image to the generator.
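
To put a number on "relatively small receptive fields", here is a minimal sketch that computes the receptive field of a stack of convolutions. The layer list is an assumption: it follows the commonly used 70x70 PatchGAN layout (five 4x4 convolutions with strides 2, 2, 2, 1, 1), and the names `PATCHGAN_LAYERS` and `receptive_field` are made up for illustration, not taken from this repository.

```python
# Assumed (kernel_size, stride) per conv layer for a 70x70 PatchGAN-style
# discriminator. This layout is an illustration, not read from the repo.
PATCHGAN_LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]


def receptive_field(layers):
    """Return the input-side width seen by a single output unit."""
    rf = 1  # one output unit
    # Walk from the last layer back to the input, growing the field.
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


if __name__ == "__main__":
    print(receptive_field(PATCHGAN_LAYERS))  # 70
```

Under these assumptions each discriminator output only ever sees a 70x70 window of the input, whatever the full image size is.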

lizijue commented 2 years ago

Hello, I wonder why the discriminator can enforce high-frequency correctness simply by restricting its receptive field to local image patches.

My understanding is that it changes the whole task of determining whether a picture is real or fake into many sub-tasks of determining whether an image patch is real or fake. This greatly reduces the complexity of the task and ultimately helps improve performance, but it has nothing to do with high-frequency information.

Could you please explain it for me? Thanks a lot!
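
The "many sub-tasks" reading above can be made concrete with a rough sketch: the discriminator emits a grid of per-patch logits, and the loss is averaged over that grid. The layer spec (4x4 kernels, padding 1, strides 2, 2, 2, 1, 1) and the binary cross-entropy loss are assumptions chosen for illustration; `DISC_LAYERS`, `patch_grid_size`, and `patch_averaged_bce` are hypothetical names, not part of this repository.

```python
import math

# Assumed (kernel_size, stride, padding) per conv layer; illustrative only.
DISC_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]


def patch_grid_size(size, layers):
    """Side length of the per-patch output grid for a square input."""
    for kernel, stride, padding in layers:
        size = (size + 2 * padding - kernel) // stride + 1
    return size


def patch_averaged_bce(logits, is_real):
    """Mean binary cross-entropy over a 2D grid of per-patch logits."""
    target = 1.0 if is_real else 0.0
    total, count = 0.0, 0
    for row in logits:
        for z in row:
            # Numerically stable form of BCE with logits.
            total += max(z, 0.0) - z * target + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


if __name__ == "__main__":
    n = patch_grid_size(256, DISC_LAYERS)
    print(n)  # 30, i.e. a 30x30 grid of real/fake decisions
    undecided = [[0.0] * n for _ in range(n)]
    print(patch_averaged_bce(undecided, is_real=True))  # log(2)
```

With a 256x256 input this gives 30x30 = 900 overlapping patch decisions, each scored the same way and then averaged into one loss value.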