Open · jychoi118 opened this issue 4 years ago
Seems like a hastily modified version of some of StyleGAN's modules.
I'm confused, too.
Since mapping_tl does not have an activation function, it seems to be the 3-layer multilinear map mentioned in the paper. The output of mapping_tl is supposed to be used to minimize the code reconstruction error, but in the code the encoder's output style is what is used to minimize the code reconstruction error, while the output of mapping_tl is used to compute the discriminator loss.
I wonder if this is a bug or if I am just misunderstanding something?
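To make the question concrete, here is a minimal PyTorch sketch of the two readings. The class name MappingTL, the tensor names w and encoder_style, and both loss terms are illustrative only, not taken from the repository:

```python
import torch
import torch.nn as nn

class MappingTL(nn.Module):
    """Sketch of a 3-layer multilinear map: stacked Linear layers, no activations."""
    def __init__(self, dlatent_size=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dlatent_size, dlatent_size),
            nn.Linear(dlatent_size, dlatent_size),
            nn.Linear(dlatent_size, dlatent_size),
        )

    def forward(self, w):
        return self.net(w)

# Hypothetical tensors, for illustration only.
w = torch.randn(4, 512)              # latent code fed to the generator
encoder_style = torch.randn(4, 512)  # style recovered by the encoder

mapping_tl = MappingTL()

# Reading (a): the reconstruction term is computed on mapping_tl's output.
loss_a = ((mapping_tl(encoder_style) - w) ** 2).mean()

# Reading (b): the reconstruction term compares the encoder's style to w directly,
# and mapping_tl's output only enters the discriminator loss.
loss_b = ((encoder_style - w) ** 2).mean()
```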
@jychoi118 ,
I'm curious why the output shape of D network is batch x (2 * dlatent_size), since only one element is used for training and the others are useless.
Yes, just one is used. The others should not affect anything. That's the result of trying many configurations, but now I have to keep it that way to stay compatible with the trained models. I'll adjust the code to be a bit clearer here.
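For anyone following along, a minimal sketch of what "just one is used" amounts to, assuming an output tensor named d_out of shape batch x (2 * dlatent_size) and that the first column is the one kept (the names and the exact index are illustrative; see model.py#L111 for the real selection):

```python
import torch
import torch.nn.functional as F

batch, dlatent_size = 4, 512

# Discriminator head produces batch x (2 * dlatent_size), as discussed above.
d_out = torch.randn(batch, 2 * dlatent_size)

# Only a single scalar per sample feeds the adversarial loss;
# the remaining 2 * dlatent_size - 1 columns go unused.
d_score = d_out[:, 0]                 # shape: (batch,)

# A standard softplus-style GAN loss on that scalar, purely for illustration.
loss_fake = F.softplus(d_score).mean()
```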
@6b5d ,
does not have the activation function
Yes, that's a bug
The output of the original StyleGAN's discriminator is a scalar, predicting whether the given image is real or fake. However, the output shape of your D network is batch x (2 * dlatent_size) in the line below.
https://github.com/podgorskiy/ALAE/blob/5d8362f3ce468ece4d59982ff531d1b8a19e792d/net.py#L893
Therefore, you selected one element among the 2 * dlatent_size elements as the final output of the D network (which is used for the loss function) in the line below.
https://github.com/podgorskiy/ALAE/blob/5d8362f3ce468ece4d59982ff531d1b8a19e792d/model.py#L111
I'm curious why the output shape of D network is batch x (2 * dlatent_size), since only one element is used for training and the others are useless.
Plus, I can't understand why the output of the D network is reshaped like this.
https://github.com/podgorskiy/ALAE/blob/5d8362f3ce468ece4d59982ff531d1b8a19e792d/net.py#L903
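For context, one plausible reading of a reshape like that is splitting the flat 2 * dlatent_size output into two dlatent_size-wide halves before indexing. The sketch below is an assumption for illustration, not the repository's actual code:

```python
import torch

batch, dlatent_size = 4, 512
d_out = torch.randn(batch, 2 * dlatent_size)

# Hypothetical interpretation: view the flat output as two dlatent_size-wide
# vectors per sample, then take a single element as the discriminator score.
d_out_split = d_out.view(batch, 2, dlatent_size)  # shape: (batch, 2, dlatent_size)
d_score = d_out_split[:, 0, 0]                    # shape: (batch,)
```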