HarikrishnanBalagopal opened this issue 4 years ago
I think the solution is to set `bias=False` on these lines:
https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/networks/custom_layers.py#L98-L100
and
https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/networks/custom_layers.py#L120-L121
Then add the bias separately, after the `x = self.module(x)` and `x *= self.weight` lines:
https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/networks/custom_layers.py#L72-L74
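Roughly, the change could look like this. This is only a minimal sketch of the idea, not the repo's actual `ConstrainedLayer`; the class name and the way the He constant is computed here are illustrative:

```python
import torch
import torch.nn as nn

class EqualizedConv2d(nn.Module):
    """Sketch of the proposed fix: the wrapped conv is built with bias=False,
    the activations are multiplied by the He constant, and the bias is added
    *after* the scaling so it is never multiplied by the constant."""

    def __init__(self, in_channels, out_channels, kernel_size, padding=0):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              padding=padding, bias=False)
        # He constant: sqrt(2 / fan_in), with fan_in = in_channels * k * k
        fan_in = in_channels * kernel_size * kernel_size
        self.scale = (2.0 / fan_in) ** 0.5
        self.bias = nn.Parameter(torch.zeros(out_channels))

    def forward(self, x):
        x = self.conv(x) * self.scale            # scale only the weight path
        return x + self.bias.view(1, -1, 1, 1)   # bias added unscaled

# quick smoke test
layer = EqualizedConv2d(3, 16, 3, padding=1)
out = layer(torch.randn(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 16, 8, 8])
```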
Hello, sorry for the delay. You're right, I indeed missed that part. I'll see if I have some time to retrain the models with this modification.
I don't think there's a bug. The objective of weight scaling is to control the gradients. In this implementation the activations are scaled instead of the weights, but the effect on the weight gradients is the same, because the weights are multiplied with the activations. However, since the biases are added rather than multiplied, the activation multiplier does not affect their gradient. That is my understanding of it.
If you set `bias=False` then the module no longer contains a bias, @HarikrishnanBalagopal. @altairmn, can you explain your comment? In my mind there's definitely a difference.
https://github.com/facebookresearch/pytorch_GAN_zoo/blob/7275ecbf53a9db7e4bc38c4c5136c10c4950724b/models/networks/custom_layers.py#L72-L74
The above implementation applies the weight scaling to the bias tensor as well. However, in the original implementation (https://github.com/tkarras/progressive_growing_of_gans/blob/master/networks.py#L53-L59) the weight scaling is NOT applied to the bias tensor; a small gradient check at the end of this comment illustrates the difference.
This makes sense, since He normal initialization takes into account the fan-in and fan-out, which depend on the dimensionality of the weights, not the biases. https://medium.com/@prateekvishnu/xavier-and-he-normal-he-et-al-initialization-8e3d7a087528
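To make the difference concrete, here is a small illustrative gradient check comparing the two formulations, `c * (W @ x + b)` (scale the whole module output, as in this repo) versus `(c * W) @ x + b` (scale only the weights, as in the original code). The constant `c = 0.5` and the tensor shapes are arbitrary, chosen just for the demonstration:

```python
import torch

c = 0.5                          # equalized-lr constant (illustrative value)
x = torch.randn(4, 3)            # batch of 4 inputs
W = torch.randn(2, 3)            # weights (gradient not needed for this check)
b = torch.zeros(2, requires_grad=True)

# This repo's formulation: scale the whole output, bias included
(c * (x @ W.t() + b)).sum().backward()
print(b.grad)   # tensor([2., 2.])  -> bias gradient scaled by c (0.5 * batch size 4)

b.grad = None
# Reference formulation: scale only the weights, add the bias afterwards
((x @ (c * W).t()) + b).sum().backward()
print(b.grad)   # tensor([4., 4.])  -> bias gradient unaffected by c (batch size 4)
```

So with the current forward pass the bias (and its gradient) is effectively multiplied by the He constant, whereas in the reference implementation it is not.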