ajbrock / BigGAN-PyTorch

The author's officially unofficial PyTorch BigGAN implementation.
MIT License

Why channel drop? #20

Closed · apple2373 closed this 5 years ago

apple2373 commented 5 years ago

I was checking another repository (before checking this one) and found a strange channel-drop trick: https://github.com/huggingface/pytorch-pretrained-BigGAN/issues/9

I can see you also use it here: https://github.com/ajbrock/BigGAN-PyTorch/blob/ba3d05754120e9d3b68313ec7b0f9833fc5ee8bc/BigGANdeep.py#L54-L56

Could you explain why you do this? It seems strange to train with more channels than necessary and then drop them at inference time. Does this trick somehow help training?
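For reference, the pattern in the linked lines amounts to slicing away the extra channels on the skip path when a block's input is wider than its output. Below is a minimal sketch of that idea (not the actual `GBlock` code; it omits the batch norm, activations, upsampling, and bottleneck convolutions):

```python
import torch
import torch.nn as nn

class GBlockSketch(nn.Module):
    """Simplified residual block showing only the channel-drop skip path."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x):
        h = self.conv(x)
        # Skip path: instead of a learned 1x1 conv to match widths,
        # simply keep the first `out_channels` channels of the input.
        if self.in_channels != self.out_channels:
            x = x[:, :self.out_channels]
        return h + x
```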

ajbrock commented 5 years ago

The channel drops in G's blocks are part of the BigGAN-deep architecture, as described in the paper. The channel drop at the output layer you see in Thom's TFHub port (from 128->3) is an implementation detail for taking advantage of TPU accelerators.
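For illustration, the output-layer detail described above amounts to something like the following sketch (hypothetical layer names and shapes, based only on the 128 -> 3 figure mentioned here): the final convolution is kept wide so the op maps well onto the TPU, and only the first three channels are read out as the RGB image.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: the final conv produces 128 output channels for
# accelerator efficiency, and inference keeps only the first 3 as RGB.
to_rgb = nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, padding=1)

features = torch.randn(1, 128, 512, 512)   # made-up feature-map shape
out = to_rgb(features)                      # shape: (1, 128, 512, 512)
image = torch.tanh(out[:, :3])              # drop all but the first 3 channels
```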