Open · uuutty opened this issue 5 years ago
It's an optimization:
log(a) = log(1 / n_bins) = -log(n_bins)
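For concreteness, a quick numeric check of that identity (the `n_bins` and `M` values below are made-up examples, not taken from the repo):

```python
import numpy as np

# Made-up example values: 8-bit images of shape 32x32x3.
n_bins = 256.0    # number of discretization levels per dimension
M = 32 * 32 * 3   # dimensionality of x
a = 1.0 / n_bins  # bin width used in the dequantization argument

# Adding -log(n_bins) per dimension is exactly adding +log(a):
assert np.isclose(M * np.log(a), -M * np.log(n_bins))
```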
Just to clarify: the purpose of the constant "scaling penalty" c is just to ensure accurate likelihood computations? Since the minimum would be the same with or without c. Comparison or model selection on the basis of likelihood computation is also iffy, though, isn't it?
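As a toy illustration of why the minimum is unaffected (a hypothetical one-parameter loss, nothing to do with Glow's actual model):

```python
import numpy as np

# Toy one-parameter "loss" standing in for -log p(x; theta); NOT Glow's model.
def nll(theta):
    return (theta - 2.0) ** 2

c = 3072 * np.log(1.0 / 256.0)  # example constant c = M * log(a)

def nll_with_c(theta):
    return nll(theta) - c       # paper-style loss: -log p(x) - M*log(a)

# The constant shifts the loss value but not the location of the minimum.
grid = np.linspace(-5.0, 5.0, 1001)
assert grid[np.argmin(nll(grid))] == grid[np.argmin(nll_with_c(grid))]
```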
Given that a normalizing flow gives you a correct log-likelihood of your data under your model, it would be a shame to omit c even though it is technically not required for optimization. Model scoring/selection can be done using the log-likelihood of test data under the model; the superiority of one model over another can, for example, be demonstrated with a likelihood-ratio test.
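For instance, here is a sketch of how c would enter held-out scoring (the per-example log-likelihood numbers below are invented, purely for illustration):

```python
import numpy as np

# Hypothetical held-out scoring; the log-likelihood numbers are invented.
M = 32 * 32 * 3              # dimensions per image
c = M * np.log(1.0 / 256.0)  # c = M * log(a) for 8-bit data

logp_continuous = np.array([9900.0, 9850.0, 9950.0])  # log p~(x + u), made up
logp_discrete = logp_continuous + c   # approx. log-likelihood of the discrete data

# Bits per dimension, the usual score for comparing density models:
bits_per_dim = -logp_discrete.mean() / (M * np.log(2.0))
print(f"test bits/dim: {bits_per_dim:.3f}")
```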
Thank you for the explanation!
In the paper, the objective function to minimize is

    L = -log p(x) - M*log(a)

However, in the code, `objective` first adds this constant c and `logpz`, and then a negative sign is applied to the objective to produce the loss:

https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L172
https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L181
https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L184

It seems to minimize -log p(x) + M*log(a), not the loss written in the paper, which is -log p(x) - M*log(a). Do you ignore the constant because it will not affect training, or did I miss something in the code?
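For reference, here is my condensed reading of what those linked lines compute (the `logdet` and `logpz` values are placeholders, not actual outputs from the repo):

```python
import numpy as np

# Condensed reading of the linked model.py lines (placeholder values,
# not the actual repo source).
n_bins = 256.0
M = 32 * 32 * 3

logdet = 1234.5  # placeholder for the summed log|det(dz/dx)| terms
logpz = -8000.0  # placeholder for log p(z) under the prior

objective = -np.log(n_bins) * M  # this term equals M * log(a), with a = 1/n_bins
objective += logdet + logpz      # objective = log p(x) + M * log(a)
loss = -objective                # = -log p(x) - M * log(a), matching the paper
```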