Open · uuutty opened this issue 5 years ago
It's an optimization:
log(a) = log(1 / n_bins) = -log(n_bins)
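For concreteness, a quick numeric check of that identity (the `n_bins` and `M` values below are made-up examples, not taken from the repo):

```python
import numpy as np

# Made-up example values: 8-bit images of shape 32x32x3.
n_bins = 256.0    # number of discretization levels per dimension
M = 32 * 32 * 3   # dimensionality of x
a = 1.0 / n_bins  # bin width used in the dequantization argument

# Adding -log(n_bins) per dimension is exactly adding +log(a):
assert np.isclose(M * np.log(a), -M * np.log(n_bins))
```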
Just to clarify: the purpose of the constant "scaling penalty" c is just to ensure accurate likelihood computations? Since the minimum would be the same with or without c. Comparison or model selection on the basis of likelihood computation is also iffy, though, isn't it?
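As a toy illustration of why the minimum is unaffected (a hypothetical one-parameter loss, nothing to do with Glow's actual model):

```python
import numpy as np

# Toy one-parameter "loss" standing in for -log p(x; theta); NOT Glow's model.
def nll(theta):
    return (theta - 2.0) ** 2

c = 3072 * np.log(1.0 / 256.0)  # example constant c = M * log(a)

def nll_with_c(theta):
    return nll(theta) - c       # paper-style loss: -log p(x) - M*log(a)

# The constant shifts the loss value but not the location of the minimum.
grid = np.linspace(-5.0, 5.0, 1001)
assert grid[np.argmin(nll(grid))] == grid[np.argmin(nll_with_c(grid))]
```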
Given that a normalizing flow gives you a correct log-likelihood of your data under your model, it would be a shame to omit c even though it is technically not required for optimization. Model scoring/selection can be done using the log-likelihood of test data under the model; the superiority of one model over another can, for example, be demonstrated with a likelihood-ratio test.
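For instance, here is a sketch of how c would enter held-out scoring (the per-example log-likelihood numbers below are invented, purely for illustration):

```python
import numpy as np

# Hypothetical held-out scoring; the log-likelihood numbers are invented.
M = 32 * 32 * 3              # dimensions per image
c = M * np.log(1.0 / 256.0)  # c = M * log(a) for 8-bit data

logp_continuous = np.array([9900.0, 9850.0, 9950.0])  # log p~(x + u), made up
logp_discrete = logp_continuous + c   # approx. log-likelihood of the discrete data

# Bits per dimension, the usual score for comparing density models:
bits_per_dim = -logp_discrete.mean() / (M * np.log(2.0))
print(f"test bits/dim: {bits_per_dim:.3f}")
```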
Thank you for the explanation!
In the paper, the objective function to minimize is

    L = -log p(x) - M*log(a)

However, in the code, `objective` first adds this constant c and `logpz`, and then a negative sign is applied to the objective to produce the loss:

https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L172
https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L181
https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L184

It seems to minimize -log p(x) + M*log(a), not the loss written in the paper, which is -log p(x) - M*log(a). Do you ignore the constant because it will not affect training, or did I miss something in the code?
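For reference, here is my condensed reading of what those linked lines compute (the `logdet` and `logpz` values are placeholders, not actual outputs from the repo):

```python
import numpy as np

# Condensed reading of the linked model.py lines (placeholder values,
# not the actual repo source).
n_bins = 256.0
M = 32 * 32 * 3

logdet = 1234.5  # placeholder for the summed log|det(dz/dx)| terms
logpz = -8000.0  # placeholder for log p(z) under the prior

objective = -np.log(n_bins) * M  # this term equals M * log(a), with a = 1/n_bins
objective += logdet + logpz      # objective = log p(x) + M * log(a)
loss = -objective                # = -log p(x) - M * log(a), matching the paper
```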