openai / glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
https://arxiv.org/abs/1807.03039
MIT License

About log likelihood of data point (logpX) #89

Open Meng-Wei opened 4 years ago

Meng-Wei commented 4 years ago

Hi, there! I am trying to calculate the log-likelihood of data points.

To do so, I directly modified the provided loss function:

[Screenshot: the modified loss function used to compute logpX]
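
In case the screenshot is hard to read, here is a rough sketch of the idea (not the exact code): the provided loss is bits per dimension, so I just convert it back to a log-likelihood in nats.

```python
import numpy as np

def bits_per_dim_to_logpx(bits_per_dim, image_shape=(32, 32, 3)):
    """Convert Glow's bits-per-dim loss back to log p(x) in nats.
    Assumes the loss is -log2 p(x) divided by the number of dimensions,
    which is what the provided loss function computes."""
    num_dims = np.prod(image_shape)
    return -bits_per_dim * num_dims * np.log(2.0)

# e.g. ~3.35 bits/dim on CIFAR-10 corresponds to roughly -7.1e3 nats
print(bits_per_dim_to_logpx(3.35))
```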

Then I tested it on the CIFAR-10 dataset and got the following: [histogram of logpX values on CIFAR-10]

This result is similar to the one reported in "Do Deep Generative Models Know What They Don't Know?" (https://arxiv.org/abs/1810.09136), so I assumed the logpX function is correct.

[Problem:] The question is: when I sample from the model (trained on CIFAR-10), the sampled images have an overall higher log-likelihood than the real images. (The std below is in units of 0.1, i.e. 9 means 0.9; sorry for the ambiguity.) [Four histograms comparing logpX of sampled images at various std values against real CIFAR-10 images]

I am not sure why this happens. Is this a bug, or is it expected behavior? Thank you in advance!

Meng-Wei commented 4 years ago

Moreover, is "logpX" - "logpZ" = "logpDet"? And should "logpDet" be the same (or almost the same) for different datapoints? Thank you

ikrets commented 4 years ago

I am not an author of Glow, but my understanding is that you are sampling with temperature, i.e. using the density p(x)^(1/T^2) instead of p(x). The std that you mention is presumably this T parameter. The effect of sampling with T < 1 is that higher-likelihood samples are favored, so I think you will get the histogram you expect with std = 1.
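
For a Gaussian prior this is equivalent to scaling the latent samples by T. A minimal sketch (not the repo's actual sampling code) of what I mean:

```python
import numpy as np

def sample_latent(shape, temperature=1.0, rng=np.random):
    """Sample the latent with temperature T: draw z ~ N(0, T^2 I)
    instead of N(0, I), i.e. scale a standard-normal sample by T.
    With T < 1 the samples sit closer to the mode of the prior, so the
    decoded images tend to get higher likelihood under the model."""
    return temperature * rng.standard_normal(shape)

# "std = 9" in your plots would correspond to temperature = 0.9 here
z = sample_latent((16, 32, 32, 3), temperature=0.9)
```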

As for the second question: yes, logpX = logpZ + logpDet, but logpDet is generally not the same for different datapoints. From an exercise I did with flows on 2D data, it varied by a large margin across the dataset. I don't know whether the same holds for high-dimensional datasets, but my guess would be yes.
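
To make the relationship concrete, here is a toy affine coupling layer (the same kind of building block Glow uses), written with stand-in functions rather than learned networks. It shows logpX = logpZ + logdet, and that the log-determinant depends on the input and therefore differs between datapoints:

```python
import numpy as np

def coupling_forward(x, s=np.tanh, t=np.sin):
    """Toy affine coupling: z1 = x1, z2 = x2 * exp(s(x1)) + t(x1).
    s and t are stand-in functions here, not real networks."""
    x1, x2 = np.split(x, 2, axis=-1)
    log_scale = s(x1)                       # log-scale predicted from x1
    z = np.concatenate([x1, x2 * np.exp(log_scale) + t(x1)], axis=-1)
    logdet = log_scale.sum(axis=-1)         # log|det dz/dx|, depends on x
    return z, logdet

def logpx(x):
    z, logdet = coupling_forward(x)
    # standard-normal prior on z
    logpz = -0.5 * (z ** 2 + np.log(2 * np.pi)).sum(axis=-1)
    return logpz + logdet                   # change-of-variables formula

x = np.random.randn(4, 6)
print(logpx(x))                 # log-likelihoods differ per datapoint
print(coupling_forward(x)[1])   # and so does the logdet term
```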