christopher-beckham / hologan-pytorch

Non-official + minimal reimplementation of HoloGAN by Nguyen-Phuoc, et al: https://arxiv.org/abs/1904.01326
BSD 3-Clause "New" or "Revised" License

About Angle loss #10

Closed junhahyung closed 3 years ago

junhahyung commented 3 years ago

Hi, I've noticed that this implementation includes an angle loss,

g_t_loss = torch.mean((g_t_pred - angles_t)**2)

which is not mentioned in the original paper, nor implemented in the original TensorFlow code.

Is this loss necessary for training? Thanks :)

christopher-beckham commented 3 years ago

Hi,

I believe it is necessary, and even if the model works without it, it is good to have in general. The reason is that whenever you're training a generator that takes more than one noise variable as input (i.e. z and theta), the optimisation can lead the generator to ignore one of them. You want to preserve the mutual information between (z, theta) and G(z, theta), which amounts to having the discriminator predict the original values back from the generated image. (Also see InfoGAN.)
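To illustrate the idea, here is a minimal sketch of such an auxiliary angle-regression loss. The names g_t_pred and angles_t follow the snippet from the issue, but the toy linear head and feature tensor are hypothetical stand-ins, not the actual HoloGAN discriminator:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, angle_dim = 8, 3  # e.g. (azimuth, elevation, roll)

# Sampled pose theta fed to the generator, here scaled to [-1, 1].
angles_t = torch.rand(batch_size, angle_dim) * 2 - 1

# Hypothetical regression head: in practice this would sit on top of the
# discriminator's features extracted from the generated image G(z, theta).
angle_head = nn.Linear(64, angle_dim)
features = torch.randn(batch_size, 64)  # stand-in for D's features
g_t_pred = angle_head(features)

# The loss from the issue: MSE between the predicted and sampled angles.
# Minimising it forces G(z, theta) to retain enough information about
# theta for the discriminator to recover it.
g_t_loss = torch.mean((g_t_pred - angles_t) ** 2)
```

This term would be added to the generator (and discriminator) objective alongside the usual adversarial loss, weighted by a hyperparameter.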

junhahyung commented 3 years ago

Thanks!! I also believe it is necessary; I just wonder how the original TensorFlow code worked without it.

christopher-beckham commented 3 years ago

No problem. I haven't run the original TensorFlow code myself (I don't use or write TF), and I know it appears to have a lot of manually written code for doing the grid resampling and interpolation, so there are all sorts of confounders I may not be aware of (hence the disclaimer in the readme).