Closed · aman-tiwari closed this issue 5 years ago
Aha... I don't know why I didn't notice before that the ResUp layers had two inputs. But yeah, it makes sense to use the second input for the embedded vector, and to have the AdaIN layers apply its style to the image inside the ResDown blocks. Good point!!
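For anyone following along, here is a minimal sketch of what "AdaIN applying the style of the embedded vector" could look like. This is not the repo's actual code; the function name, shapes, and the assumption that the embedded vector is first projected to per-channel `(gamma, beta)` pairs are all hypothetical, following the usual AdaIN formulation:

```python
import numpy as np

def adain(x, style, eps=1e-5):
    """Adaptive Instance Normalization sketch.

    x:     feature map of shape (C, H, W)
    style: vector of length 2*C, assumed to be (gamma, beta) pairs
           projected from the embedded vector (hypothetical layout)
    """
    C = x.shape[0]
    gamma, beta = style[:C], style[C:]
    # Normalize each channel independently (instance norm statistics).
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    normed = (x - mean) / (std + eps)
    # Re-scale and re-shift with the style-predicted parameters.
    return gamma[:, None, None] * normed + beta[:, None, None]
```

After this, each channel's mean equals its `beta` and its standard deviation is roughly its `gamma`, which is what lets the second input impose its style on the residual block's activations.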
For the rest, I took the general structure of the three networks from BigGAN, so they should already look quite similar.
Did it help with results?
The paper alludes to the network architecture being very similar to BigGAN's. Would it be worth taking the placement of the self-attention layers, resblocks, and normalization layers from it? (See pages 17-19: https://arxiv.org/pdf/1809.11096.pdf ; the AdaIN layers could go where BatchNorm + linear is used.)