pyro-ppl / pyro

Deep universal probabilistic programming with Python and PyTorch
http://pyro.ai
Apache License 2.0

Binarized MNIST in VAE tutorial #529

Closed tristandeleu closed 6 years ago

tristandeleu commented 6 years ago

The model in the VAE tutorial has a Bernoulli observation model. However, the images from MNIST (after preprocessing) have values in [0, 1], which are not valid observations under this model. I think one has to binarize these images before feeding them to the model/guide, either by sampling the pixels or with a threshold.
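
A minimal sketch of the two binarization options mentioned above, written in dependency-free Python so it runs anywhere. The function names, the 0.5 threshold, and the toy pixel values are assumptions for illustration and are not taken from the tutorial; with PyTorch tensors the same ideas would normally be expressed as elementwise tensor operations.

```python
import random

def binarize_by_threshold(pixels, threshold=0.5):
    """Map each intensity in [0, 1] to 1 if it exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]

def binarize_by_sampling(pixels, rng):
    """Treat each intensity as a Bernoulli probability and draw one sample per pixel."""
    return [1 if rng.random() < p else 0 for p in pixels]

# Toy "image": grayscale intensities after scaling to [0, 1] (hypothetical values).
pixels = [0.0, 0.1, 0.5, 0.9, 1.0]

thresholded = binarize_by_threshold(pixels)
sampled = binarize_by_sampling(pixels, random.Random(0))
```

Thresholding is deterministic, while sampling keeps the gray levels as probabilities, so a pixel with intensity 0.9 is turned on in roughly 90% of draws.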

karalets commented 6 years ago

We are aware of this and state it explicitly. It was chosen for convenience, to avoid arbitrary binarizations, following established methodology going back to the RBM literature. The model can still train on MNIST and basically works, but the likelihoods in the non-binarized case are of course slightly inflated.

For a paper you would download a binarized version and report the resulting slightly worse likelihoods, but the model would behave the same as here.

If you want to be precise, you could also fix this by sampling from dist.bernoulli, with the loaded images as the parameters of the distribution, and using the samples as data for the given model.

Just be sure to do this only once, as repeatedly re-sampling the data provides unfair regularization to the model and also inflates likelihood scores.
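
The "sample once" advice could look roughly like this: draw the binarized dataset a single time with a fixed seed, then reuse that same dataset in every epoch. This is a plain-Python sketch; `binarize_once`, the seed, and the toy data are hypothetical, and in the tutorial's setting the draw would be made from the loaded image tensors instead.

```python
import random

def binarize_once(images, seed=0):
    """Draw one Bernoulli sample per pixel, with a fixed seed so the draw is reproducible."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for p in image] for image in images]

# Toy dataset: two "images" with intensities in [0, 1] (hypothetical values).
images = [[0.0, 0.2, 0.8, 1.0], [0.5, 0.5, 0.5, 0.5]]

# Binarize a single time, before training starts...
binary_images = binarize_once(images)

# ...and reuse the same fixed dataset in every epoch instead of re-sampling.
epochs_seen = []
for epoch in range(3):
    epochs_seen.append(binary_images)  # stand-in for one training pass over the data
```

Fixing the seed also means a later run reproduces the same binarized dataset, so reported likelihoods refer to one well-defined set of observations.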

On Tue, Nov 7, 2017, 4:20 PM Tristan Deleu notifications@github.com wrote:

The model in the VAE tutorial http://pyro.ai/examples/vae.html has a Bernoulli observation model. However, the images from MNIST (after preprocessing) have values in [0, 1], which are not valid observations under this model. I think one has to binarize these images before feeding them to the model/guide, either by sampling https://github.com/blei-lab/edward/blob/081ea532a982e6d2c88da25d6e2527f6a66f09ab/examples/vae.py#L38 the pixels or with a threshold https://github.com/altosaar/variational-autoencoder/blob/1944d3a2eca4730339519cae557533f482237be1/vae.py#L169 .


ngoodman commented 6 years ago

but why do we do this slightly hacky thing? eg why not just use a continuous observation model?

jmaronas commented 6 years ago

In my experience, for MNIST to work well with a VAE, the best approach is to use the expected value of the decoder distribution as the sample. With a Bernoulli decoder this makes sense, because the binary cross-entropy you minimize is given by:

-[x ln p + (1 - x) ln(1 - p)]

where x represents the pixel value and p the predicted mean of the Bernoulli distribution, and you are trying to match p to the observed pixel. In this case our distribution takes the form:

p(x|z) = p^x (1 - p)^(1 - x)

and in my opinion that is why it seems reasonable to use the expected value: for non-binary x this is not a properly defined Bernoulli distribution.
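
A quick numerical check of the two points above, as a sketch under stated assumptions (the function names and the chosen values of x and p are illustrative, not from the book). It evaluates the cross-entropy, i.e. the negative of the log-likelihood expression, for a gray pixel, and compares the total mass of p^x (1 - p)^(1 - x) over binary x with its integral over x in [0, 1].

```python
import math

def cross_entropy(x, p):
    """Bernoulli cross-entropy -[x ln p + (1 - x) ln(1 - p)] for one pixel."""
    return -(x * math.log(p) + (1 - x) * math.log(1 - p))

def unnormalized_density(x, p):
    """The Bernoulli form p^x (1 - p)^(1 - x), evaluated at any x in [0, 1]."""
    return p ** x * (1 - p) ** (1 - x)

# 1) For a gray pixel, the loss over a grid of p values is smallest at p == x.
x = 0.3
grid = [i / 100 for i in range(1, 100)]
best_p = min(grid, key=lambda p: cross_entropy(x, p))

# 2) Over binary x the form sums to 1; over continuous x it does not integrate to 1.
p = 0.9
binary_total = unnormalized_density(0, p) + unnormalized_density(1, p)
n = 10_000  # midpoint-rule resolution
continuous_total = sum(unnormalized_density((i + 0.5) / n, p) for i in range(n)) / n
```

With p = 0.9 the integral over [0, 1] comes out to roughly 0.36 rather than 1, which is one way to see that the expression is only a proper distribution when x is restricted to {0, 1}.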

See Bishop's neural networks book, section 6.7.

Hope it helps!

franciscocms commented 12 months ago

Sorry for opening this again, but I am stuck with this and still couldn't find a solution anywhere. For other image datasets where it doesn't make sense to binarize images, wouldn't a continuous distribution do a better job evaluating likelihoods of generated images?

martinjankowiak commented 12 months ago

@franciscocms this choice isn't particular to pyro in any way. for better or for worse this is common practice in generative modeling. if you want to dive deeper i suggest reading relevant literature like this one

franciscocms commented 12 months ago

@martinjankowiak I will, thanks for the recommendation!