Style_loss is normalized by batch_size twice

xunhuang1995 / AdaIN-style

Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization

https://arxiv.org/abs/1703.06868

MIT License

Style_loss is normalized by batch_size twice #30

Status: Closed (closed by Mona77 6 years ago)

Mona77 commented 6 years ago

Hi, thanks for sharing your code publicly. In your implementation, style_loss is the sum of the mean squared errors of the mean and the std, as in this line:

self.mean_loss = self.mean_criterion:forward(self.input_mean, self.target_mean)

But may I ask why you then normalize this value by the batch size?

self.mean_loss = self.mean_loss / N -- normalized w.r.t. batch size

Wouldn't the MSE criterion already take the average over N and C?

xunhuang1995 commented 6 years ago

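Hi, the mean_criterion has sizeAverage set to false, so it returns the summed squared error and is not normalized by batch size in the first place. The division by N is therefore the only batch-size normalization.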
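
To make the difference concrete, here is a minimal sketch in plain Python (not the Torch API). The `mse` helper and the sample values are hypothetical, and the sketch assumes that the averaging mode divides the summed squared error by the total number of elements (N * C), while the non-averaging mode returns the plain sum.

```python
def mse(input_rows, target_rows, size_average):
    """Squared error over a batch of per-channel statistics.

    Hypothetical stand-in for the two reduction modes discussed above:
    a plain sum of squared errors (size_average=False), or that sum
    divided by the total element count (size_average=True).
    """
    sq_errors = [
        (x - y) ** 2
        for in_row, tgt_row in zip(input_rows, target_rows)
        for x, y in zip(in_row, tgt_row)
    ]
    total = sum(sq_errors)
    return total / len(sq_errors) if size_average else total


# Made-up channel means for N = 2 images and C = 3 channels.
input_mean = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
target_mean = [[0.0, 2.0, 5.0], [4.0, 3.0, 6.0]]
N = len(input_mean)

# Summed mode followed by one explicit division by the batch size,
# which mirrors the two lines quoted in the question.
summed = mse(input_mean, target_mean, size_average=False)
mean_loss = summed / N

# Averaging mode, shown only for comparison: divides by N * C.
averaged = mse(input_mean, target_mean, size_average=True)

print(summed, mean_loss, averaged)
```

With the summed mode, the loss is divided by N exactly once, so it stays summed over the C channels. Dividing the averaged value by N again would be the double normalization the question asks about, but that is not what the quoted code does.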