musyoku / chainer-glow

Glow: Generative Flow with Invertible 1×1 Convolutions

About prior "z" negative log-likelihood #6

Open Meng-Wei opened 5 years ago

Meng-Wei commented 5 years ago

Hi, there! Thanks for your awesome work!

I am trying to get some statistics (logpX, logpZ) from the model, using your pre-trained model on CelebA 64x64 images. When handling logpZ, I saw that you first compute the negative log-likelihood of "z" at each scale separately and then sum the results. However, when I concatenate the z's into a single array and compute its NLL, the value I get is larger (~1.5x) than the summed-up NLL.

Here are the stats I got:

| level | mean | var |
| --- | --- | --- |
| 6x32x32 | -0.0574313 | 0.311581 |
| 12x16x16 | 0.0713234 | 0.5110019 |
| 24x8x8 | 0.0486291 | 0.750326 |
| 48x4x4 | 0.0024840 | 0.994663 |

Summed-up NLL: 9259.02

Concatenated z: mean -0.022121632, var 0.6087184, NLL 14386.038
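In case it helps, this is roughly how I computed the concatenated statistics and NLL (a minimal NumPy sketch with illustrative names, assuming a standard-normal prior for the concatenated z):

```python
import numpy as np

def standard_gaussian_nll(z):
    # Negative log-likelihood of z under N(0, I), summed over all elements.
    return 0.5 * np.sum(np.log(2.0 * np.pi) + z ** 2)

# zs: list of factored-out latents for one image, e.g. with shapes
# (6, 32, 32), (12, 16, 16), (24, 8, 8), (48, 4, 4)
def concatenated_stats(zs):
    z_all = np.concatenate([z.reshape(-1) for z in zs])
    return z_all.mean(), z_all.var(), standard_gaussian_nll(z_all)
```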

Do you have any idea why the variance at deeper levels is higher than at shallower ones? Or is the way I concatenate the z's wrong? I think the differing variances are the main reason the concatenated NLL is higher than the summed-up NLL.

Thank you in advance!

Meng-Wei commented 5 years ago

After comparing your code with OpenAI/glow, I found one possible explanation. Please correct me if I am wrong:

Both implementations compute the negative log-likelihood of the z_i at each level separately and then sum them up:

  1. OpenAI/glow does this inside the splitting function: the NLL of each level's z_i is added directly to the `objective`.
  2. You compute the NLL of each level explicitly in the loss function (see the rough sketch after this list).
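
Schematically (my own simplification with a placeholder `log_p`, not the actual code of either repository), the difference in where the per-level terms are accumulated looks like this:

```python
import numpy as np

def log_p(z):
    # Placeholder log-density; in the real models this would be the
    # (possibly learned or conditional) prior of the corresponding level.
    return -0.5 * np.sum(np.log(2.0 * np.pi) + z ** 2)

# (1) OpenAI/glow style: each split adds its level's log-likelihood
#     to a running objective as the flow is evaluated.
def split_step(h, objective):
    c = h.shape[0] // 2
    z_i, h_rest = h[:c], h[c:]
    objective = objective + log_p(z_i)
    return h_rest, z_i, objective

# (2) chainer-glow style: keep every z_i and sum their NLLs inside the loss function.
def loss_from_levels(zs):
    return sum(-log_p(z_i) for z_i in zs)
```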

But OpenAI/glow does one more step: in the loss function, it also evaluates the NLL of the concatenated z_i under the prior distribution.

Therefore, this leads to different distributions of "Z" after a certain number of training iterations.
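
As a quick sanity check on my side (my own reasoning, not taken from either repository): if every level were scored under the same standard normal, summing the per-level NLLs and scoring the concatenated z would give exactly the same number, so the gap I measured should come from the per-level terms being something else (e.g. learned priors or the extra prior term above):

```python
import numpy as np

def standard_gaussian_nll(z):
    # NLL under N(0, I), summed over all elements.
    return 0.5 * np.sum(np.log(2.0 * np.pi) + z ** 2)

rng = np.random.default_rng(0)
shapes = [(6, 32, 32), (12, 16, 16), (24, 8, 8), (48, 4, 4)]
zs = [rng.standard_normal(s) for s in shapes]

per_level_sum = sum(standard_gaussian_nll(z) for z in zs)
concatenated = standard_gaussian_nll(np.concatenate([z.reshape(-1) for z in zs]))
print(per_level_sum, concatenated)  # identical up to floating-point error
```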