openai / glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
MIT License

Implementation vs paper #34

Closed victorherbemontagne closed 6 years ago

victorherbemontagne commented 6 years ago


First of all, thanks for this amazing work; it has been a pleasure to dive into the paper!

More precisely, when looking at the implementation of the actnorm module, I can't understand the choice made, given what the paper says. In the paper you state that you use an affine transformation of the activations with parameters s and b. [image: equation from the paper]

But the implementation seems to first add the bias b: x = x + b (with actnorm_center), then multiply by s: x = s * (x + b) (with actnorm_scale).

You reverse the order when reverse = True, but I feel it might be the other way around.

I am surely missing something, since you manage to train the model, but I am curious about this choice.

Am I missing something?

Thank you in advance for your help!


prafullasd commented 6 years ago

Thanks Victor! :) You're right: we first centre the activations to zero mean, then scale them to unit variance. We'll update the paper accordingly. In terms of learning, though, the two formulations are equivalent, since y = s * (x + b) = s * x + (s * b) = s * x + b', where b' = s * b is the new bias. Note that the loss term sum(log(|s|)) prevents values in s from becoming 0.
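
For readers following along, here is a minimal pure-Python sketch of the equivalence described above. It is not the repository's TensorFlow code: the function names (`actnorm_forward`, `actnorm_reverse`, `affine_paper_form`, `log_abs_s_sum`) and the toy per-channel values are made up for illustration, and it treats s and b as plain per-channel lists. It checks that "centre, then scale" matches the paper-style form with b' = s * b, and that the reverse pass undoes the two steps in the opposite order.

```python
import math

def actnorm_forward(x, b, s):
    # Centre first, then scale: y = s * (x + b), applied per channel.
    return [si * (xi + bi) for xi, bi, si in zip(x, b, s)]

def actnorm_reverse(y, b, s):
    # Undo the steps in the opposite order: unscale, then uncentre.
    # Requires every s to be nonzero.
    return [yi / si - bi for yi, bi, si in zip(y, b, s)]

def affine_paper_form(x, b_prime, s):
    # Paper-style form: y = s * x + b'.
    return [si * xi + bpi for xi, bpi, si in zip(x, b_prime, s)]

def log_abs_s_sum(s):
    # The sum(log(|s|)) term mentioned above; it tends to -infinity as any
    # s approaches 0, so maximising likelihood pushes s away from 0.
    return sum(math.log(abs(si)) for si in s)

# Toy per-channel values (hypothetical, for illustration only).
x = [0.5, -1.2, 3.0]
b = [0.1, 0.4, -2.0]
s = [2.0, 0.5, -1.5]

y = actnorm_forward(x, b, s)

# Same output with the reparameterised bias b' = s * b.
b_prime = [si * bi for si, bi in zip(s, b)]
y_paper = affine_paper_form(x, b_prime, s)

# The reverse pass recovers the input.
x_rec = actnorm_reverse(y, b, s)

same_output = all(math.isclose(a, c, abs_tol=1e-12) for a, c in zip(y, y_paper))
round_trip = all(math.isclose(a, c, abs_tol=1e-12) for a, c in zip(x, x_rec))
print(same_output, round_trip, log_abs_s_sum(s))
```

Since both forms produce the same outputs for matching parameters, the model can represent the same set of functions either way. Only the meaning of the learned bias differs.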