Add different downsampling methods to PatchGAN discriminator

Project-MONAI / GenerativeModels

MONAI Generative Models makes it easy to train, evaluate, and deploy generative models and related applications

Apache License 2.0


Add different downsampling methods to PatchGAN discriminator #475

Closed StijnvWijn closed 2 months ago

StijnvWijn commented 3 months ago

Dear MONAI Team,

I have been generating synthetic images with a SPADE GAN similar to this paper, using the MONAI library, and I noticed that your implementation of the PatchGAN discriminator differs slightly from theirs and from the official implementation. The main difference is that, instead of using a pooling operation to change the receptive field, you seem to use an increasing number of layers. I have run some experiments and, for my use case, adding a pooling operation improves my results. So my question is the following:

Do you want me to create a PR to add the option for including a pooling operation or was there a reason for the current implementation?
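To make the trade-off in the comment above concrete, here is a rough sketch comparing the two ways of enlarging a discriminator's receptive field: adding one more strided layer versus pooling the input first. The layer layout is an assumption borrowed from the pix2pix-style PatchGAN (4x4 convolutions, stride 2, followed by two stride-1 convolutions), and the 3x3 stride-2 pooling mirrors what pix2pixHD is understood to use; neither is taken from the MONAI code, and `receptive_field` and `patchgan_layers` are hypothetical helper names.

```python
from typing import List, Tuple

Layer = Tuple[int, int]  # (kernel_size, stride)


def receptive_field(layers: List[Layer]) -> int:
    """Receptive field (in input pixels) of one output unit of a layer stack."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def patchgan_layers(n_strided: int) -> List[Layer]:
    # ASSUMPTION: pix2pix-style layout, not necessarily MONAI's exact one:
    # n_strided 4x4 stride-2 convolutions, then two 4x4 stride-1 convolutions.
    return [(4, 2)] * n_strided + [(4, 1)] * 2


# Baseline discriminator with three strided layers.
base = receptive_field(patchgan_layers(3))

# Option A: enlarge the receptive field by adding a strided layer.
deeper = receptive_field(patchgan_layers(4))

# Option B: keep the layers fixed and downsample the input first.
# ASSUMPTION: 3x3 average pooling with stride 2, as in pix2pixHD.
pooled = receptive_field([(3, 2)] + patchgan_layers(3))

print(base, deeper, pooled)
```

Under these assumptions the script prints `70 142 141`: both options roughly double the receptive field, but pooling does so without adding learnable parameters, so any difference in results would come from how the input is downsampled rather than from the field size itself.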

marksgraham commented 3 months ago

Hi,

I think your proposed change would be good, as long as the default behaviour doesn't change with the update. Please go ahead with the PR :)

Tagging @virginiafdez in case she has any comments too

virginiafdez commented 3 months ago

Hi! If I remember correctly, we changed the official implementation to make it compatible with other works that were part of the initial set of models that drove the start of this repo. These models used PatchGAN, but not the pix2pixHD version, hence the difference. However, as long as the defaults don't change, it is great to have alternatives, especially if there's evidence that a certain combination of parameters leads to better results. Thanks!

StijnvWijn commented 3 months ago

Implemented in pull request #479. I was just unsure whether to make the kernel size for the pooling operations a separate parameter or to stick with the same kernel size as the convolutions. For now I did the latter; let me know if you think another approach is better.
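As a rough way to reason about the kernel-size question above, the sketch below applies the standard floor-mode output-size formula for a pooling layer to two candidate kernel sizes. The stride of 2 and padding of 1 are assumptions chosen for illustration, not values taken from the pull request, and `pooled_size` is a hypothetical helper.

```python
def pooled_size(size: int, kernel: int, stride: int = 2, padding: int = 1) -> int:
    """Output length along one axis for a pooling layer (floor mode).

    ASSUMPTION: stride=2 and padding=1 are illustrative defaults only.
    """
    return (size + 2 * padding - kernel) // stride + 1


# Compare a 3-wide kernel with a 4-wide kernel (same as a 4x4 convolution).
for kernel in (3, 4):
    sizes = [pooled_size(s, kernel) for s in (256, 128, 65)]
    print(f"kernel={kernel}: {sizes}")
```

With these settings both kernel sizes halve even input sizes exactly and differ only on odd sizes, which suggests that reusing the convolution kernel size is a reasonable default, while a separate parameter would mainly matter for users who want to match a specific reference implementation.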