shimopino opened this issue 4 years ago
@KeisukeShimokawa This is an interesting suggestion: indeed, I'm also not sure what the performance difference between the two is. I did a deeper check and it seems like the TensorFlow version for BigGAN also uses a combination of Pool + Conv: https://github.com/google/compare_gan/blob/master/compare_gan/architectures/resnet_ops.py#L131 Perhaps this resblock structure is unique to BigGAN rather than the version from Miyato et al, but I might be wrong. Nonetheless, I think this is a very good point (and detail) to note and will certainly keep your suggestion in mind!
@kwotsin Thank you for your reply. I hadn't checked that repository; thanks for sharing it.
Re-reading the original BigGAN paper (arXiv), I found a diagram showing that a combination of Pooling and Conv is used in its ResBlocks.
I also explored NVIDIA's SPADE repository and found that it uses a combination of Pooling and Conv for its ResBlocks as well (e.g. https://github.com/NVlabs/SPADE/blob/master/models/networks/discriminator.py#L46).
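To make the structure concrete, here is a minimal PyTorch sketch of the Pooling + Conv style discriminator ResBlock discussed above. It is only an illustration with my own naming (`DBlockSketch` is not taken from any of the repositories linked here), not their exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DBlockSketch(nn.Module):
    """Illustrative BigGAN-style discriminator ResBlock that downsamples
    with average pooling + stride-1 convolutions (hypothetical, simplified)."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)
        self.conv_sc = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # 1x1 shortcut conv

    def forward(self, x):
        # Main path: conv -> conv -> avg_pool (the pooling does the downsampling).
        h = self.conv1(F.relu(x))
        h = self.conv2(F.relu(h))
        h = F.avg_pool2d(h, 2)
        # Shortcut path: pool first, then a 1x1 conv to match channels.
        sc = self.conv_sc(F.avg_pool2d(x, 2))
        return h + sc

# Quick shape check: halves spatial resolution and changes channels.
x = torch.randn(1, 64, 32, 32)
print(DBlockSketch(64, 128)(x).shape)  # torch.Size([1, 128, 16, 16])
```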
The TensorFlow implementation I referenced earlier may be a bit of a special case: that BigGAN implementation (e.g. https://github.com/taki0112/BigGAN-Tensorflow/blob/master/ops.py#L159) lets the convolutions themselves change the resolution, using a transposed convolution for upsampling and a strided convolution for downsampling in its ResNet blocks.
I know that BigGAN implementations in PyTorch use a combination of Pooling and Conv (e.g. https://github.com/ajbrock/BigGAN-PyTorch/blob/master/BigGAN.py#L341), but from my own experience I can't say for sure which of the two performs better.
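For clarity, the two approaches being compared could be sketched roughly like this (hypothetical layer choices for illustration, not code from either repository):

```python
import torch.nn as nn

# Variant A: change resolution with a separate resize op, then a stride-1 conv
# (the Pooling + Conv pattern, as in BigGAN-PyTorch / compare_gan).
down_pool_conv = nn.Sequential(
    nn.AvgPool2d(kernel_size=2),
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
)
up_resize_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(128, 64, kernel_size=3, padding=1),
)

# Variant B: let the convolution itself change the resolution
# (strided conv for downsampling, transposed conv for upsampling).
down_strided_conv = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
up_transposed_conv = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)
```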
In the future, would it be possible to flexibly select the operation used to change the resolution of the input feature maps?
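For example (just a sketch of a hypothetical API, not an existing feature of this library), the block could take an argument that selects how the resolution change is performed:

```python
import torch.nn as nn
import torch.nn.functional as F

class ConfigurableDBlock(nn.Module):
    """Hypothetical discriminator block whose downsampling op is selectable.

    downsample_mode='pool_conv' keeps stride-1 convs and downsamples with avg_pool;
    downsample_mode='strided_conv' lets the first conv change the resolution."""

    def __init__(self, in_ch, out_ch, downsample_mode="pool_conv"):
        super().__init__()
        self.downsample_mode = downsample_mode
        stride = 2 if downsample_mode == "strided_conv" else 1
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        h = self.conv1(F.relu(x))
        if self.downsample_mode == "pool_conv":
            h = F.avg_pool2d(h, 2)  # resolution change handled by pooling
        return self.conv2(F.relu(h))
```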