Open · jychoi118 opened this issue 4 years ago
Seems like a hastily modified version of some of StyleGAN's modules.
I'm confused, too.
Since mapping_tl does not have an activation function, it seems to be the 3-layer multilinear map mentioned in the paper. The output of mapping_tl is supposed to be used to minimize the code reconstruction error, but in the code the encoder's output style is what is used to minimize the code reconstruction error, while the output of mapping_tl is used to compute the discriminator loss.
I wonder if this is a bug or if I am just misunderstanding something?
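To make the question concrete, here is a minimal PyTorch sketch of the two readings. The class name MappingTL, the tensor names w and encoder_style, and both loss terms are illustrative only, not taken from the repository:

```python
import torch
import torch.nn as nn

class MappingTL(nn.Module):
    """Sketch of a 3-layer multilinear map: stacked Linear layers, no activations."""
    def __init__(self, dlatent_size=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dlatent_size, dlatent_size),
            nn.Linear(dlatent_size, dlatent_size),
            nn.Linear(dlatent_size, dlatent_size),
        )

    def forward(self, w):
        return self.net(w)

# Hypothetical tensors, for illustration only.
w = torch.randn(4, 512)              # latent code fed to the generator
encoder_style = torch.randn(4, 512)  # style recovered by the encoder

mapping_tl = MappingTL()

# Reading (a): the reconstruction term is computed on mapping_tl's output.
loss_a = ((mapping_tl(encoder_style) - w) ** 2).mean()

# Reading (b): the reconstruction term compares the encoder's style to w directly,
# and mapping_tl's output only enters the discriminator loss.
loss_b = ((encoder_style - w) ** 2).mean()
```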
@jychoi118 ,
I'm curious why the output shape of D network is batch x (2 * dlatent_size), since only one element is used for training and the others are useless.
Yes, just one is used. The others should not affect anything. That's the result of trying many configurations, but now I have to keep it that way to stay compatible with the trained models. I'll adjust the code to be a bit clearer here.
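For anyone following along, a minimal sketch of what "just one is used" amounts to, assuming an output tensor named d_out of shape batch x (2 * dlatent_size) and that the first column is the one kept (the names and the exact index are illustrative; see model.py#L111 for the real selection):

```python
import torch
import torch.nn.functional as F

batch, dlatent_size = 4, 512

# Discriminator head produces batch x (2 * dlatent_size), as discussed above.
d_out = torch.randn(batch, 2 * dlatent_size)

# Only a single scalar per sample feeds the adversarial loss;
# the remaining 2 * dlatent_size - 1 columns go unused.
d_score = d_out[:, 0]                 # shape: (batch,)

# A standard softplus-style GAN loss on that scalar, purely for illustration.
loss_fake = F.softplus(d_score).mean()
```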
@6b5d ,
does not have the activation function
Yes, that's a bug
The output of the original StyleGAN's discriminator is a scalar, predicting whether the given image is real or fake. However, the output shape of your D network is batch x (2 * dlatent_size) in the line below.
https://github.com/podgorskiy/ALAE/blob/5d8362f3ce468ece4d59982ff531d1b8a19e792d/net.py#L893
Therefore, you selected one element among the 2 * dlatent_size elements as the final output of the D network (which is used for the loss function) in the line below.
https://github.com/podgorskiy/ALAE/blob/5d8362f3ce468ece4d59982ff531d1b8a19e792d/model.py#L111
I'm curious why the output shape of D network is batch x (2 * dlatent_size), since only one element is used for training and the others are useless.
Plus, I can't understand why the output of the D network is reshaped like this.
https://github.com/podgorskiy/ALAE/blob/5d8362f3ce468ece4d59982ff531d1b8a19e792d/net.py#L903
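For context, one plausible reading of a reshape like that is splitting the flat 2 * dlatent_size output into two dlatent_size-wide halves before indexing. The sketch below is an assumption for illustration, not the repository's actual code:

```python
import torch

batch, dlatent_size = 4, 512
d_out = torch.randn(batch, 2 * dlatent_size)

# Hypothetical interpretation: view the flat output as two dlatent_size-wide
# vectors per sample, then take a single element as the discriminator score.
d_out_split = d_out.view(batch, 2, dlatent_size)  # shape: (batch, 2, dlatent_size)
d_score = d_out_split[:, 0, 0]                    # shape: (batch,)
```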