apple2373 opened this issue 5 years ago

I'm trying to understand the model by reading the code. I noticed that `conv_to_rgb` actually has 128 output channels, but only the first three are used for the final RGB image: https://github.com/huggingface/pytorch-pretrained-BigGAN/blob/1e18aed2dff75db51428f13b940c38b923eb4a3d/pytorch_pretrained_biggan/model.py#L245-L246

Why do you do this? What are the other 125 channels for?
They are dropped. This is actually done several times in the model, for example here: https://github.com/huggingface/pytorch-pretrained-BigGAN/blob/1e18aed2dff75db51428f13b940c38b923eb4a3d/pytorch_pretrained_biggan/model.py#L192-L194
If you read the latest version of the BigGAN paper, you will see it is part of the changes introduced in the new "deep" version of BigGAN.
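Concretely, "dropping" just means the forward pass slices the channel dimension and discards the rest. Something like this minimal sketch (not the repo's exact code; shapes are illustrative):

```python
import torch
import torch.nn as nn

conv_to_rgb = nn.Conv2d(128, 128, kernel_size=3, padding=1)  # trained with 128 output channels

x = torch.randn(1, 128, 8, 8)   # feature map from the last generator block
out = conv_to_rgb(x)            # shape (1, 128, 8, 8)
rgb = out[:, :3, ...]           # keep only the first 3 channels as the RGB image
print(rgb.shape)                # torch.Size([1, 3, 8, 8])
```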
Thanks for the reply! I think I'm still confused, though. If you are simply going to drop channels, why not train with the smaller number of channels in the first place? In the last layer, for example, why not use `nn.Conv2d(128, 3)` instead of training `nn.Conv2d(128, 128)` and dropping 125 channels at inference time?
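To make the question concrete: at inference time the two should be numerically identical, since each output channel of a conv depends only on its own filter and bias. A quick check with made-up shapes:

```python
import torch
import torch.nn as nn

big = nn.Conv2d(128, 128, kernel_size=3, padding=1)
small = nn.Conv2d(128, 3, kernel_size=3, padding=1)

# Copy the first 3 output filters of the big conv into the small one
with torch.no_grad():
    small.weight.copy_(big.weight[:3])
    small.bias.copy_(big.bias[:3])

x = torch.randn(1, 128, 8, 8)
# The sliced big conv and the small conv agree
assert torch.allclose(big(x)[:, :3], small(x), atol=1e-6)
```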
Could you point to the specific page and line where the authors explain this part? I tried to find it in 1809.11096v2 but could not. Table 9(a) just says "BN, ReLU, 3 × 3 Conv ch → 3".
I still can't understand why this repository uses this strange channel-dropping trick. Is it something the repository owner invented, training with more channels and dropping them at inference time?
I checked the BigGAN authors' implementation, but it does not seem to use channel dropping: https://github.com/ajbrock/BigGAN-PyTorch/blob/ba3d05754120e9d3b68313ec7b0f9833fc5ee8bc/BigGANdeep.py#L68-L93
Well, I'm not very familiar with Andy's implementation, but I do see a channel-dropping step here: https://github.com/ajbrock/BigGAN-PyTorch/blob/ba3d05754120e9d3b68313ec7b0f9833fc5ee8bc/BigGANdeep.py#L54-L56
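If I'm reading those lines right, it's the residual blocks dropping extra input channels on their skip path when a block narrows. Schematically something like this (a rough sketch, not Andy's verbatim code):

```python
import torch

def gblock_skip(x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
    # h is the output of the block's conv path and may have fewer
    # channels than the block input x; drop the extra input channels
    # so the residual addition lines up (the "channel dropping" here).
    out_ch = h.shape[1]
    if x.shape[1] != out_ch:
        x = x[:, :out_ch]
    return x + h

x = torch.randn(1, 128, 8, 8)   # block input
h = torch.randn(1, 64, 8, 8)    # conv-path output with fewer channels
print(gblock_skip(x, h).shape)  # torch.Size([1, 64, 8, 8])
```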
Also, I'm not sure Andy's implementation can load the `--deep` models, which are what the present repo is based on (see https://github.com/ajbrock/BigGAN-PyTorch/issues/10).
Maybe you would be better off asking in the issues of https://github.com/ajbrock/BigGAN-PyTorch?
Oh, I missed that part. If the original implementation uses the channel drop, it makes sense to use it here too. Thanks! I'll ask the authors directly.
The original author answered: it's because TensorFlow is faster when the number of input and output channels is the same. I think it would be fine to delete the unused channels from this repository, since in PyTorch they just waste computational resources.
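If someone wants to try that, pruning should amount to slicing the final conv's weights down to 3 output channels, along these lines (an untested sketch; the attribute path in the usage comment is an assumption and may differ in the actual model):

```python
import torch
import torch.nn as nn

def prune_conv_to_rgb(conv: nn.Conv2d) -> nn.Conv2d:
    """Replace a 128->128 conv with a 128->3 conv holding the first 3 filters."""
    pruned = nn.Conv2d(conv.in_channels, 3,
                       kernel_size=conv.kernel_size,
                       stride=conv.stride,
                       padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[:3])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[:3])
    return pruned

# Hypothetical usage on a loaded model; attribute names are made up:
# model.generator.conv_to_rgb = prune_conv_to_rgb(model.generator.conv_to_rgb)
```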