blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org

VAE: bad convergence of loss (attached tensorboard image) #853

Open pauloabelha opened 6 years ago

pauloabelha commented 6 years ago

Hi, I've been diving deeper into VAEs and using Edward as my experimental toolbox.

I ran exactly the same code as examples/vae.py, only altering one line (line 77) to get TensorBoard data: inference.initialize(optimizer=optimizer, logdir="/tmp/log")
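
For context, this is roughly the relevant part of the script after my edit (reconstructed from memory, so layer sizes and variable names may differ slightly from the actual examples/vae.py):

```python
import edward as ed
import tensorflow as tf
from edward.models import Bernoulli, Normal

M = 100  # minibatch size (stock value in the example)
d = 2    # latent dimension (stock value in the example)

# Model: z ~ N(0, I), x ~ Bernoulli(logits = decoder(z))
z = Normal(loc=tf.zeros([M, d]), scale=tf.ones([M, d]))
h = tf.layers.dense(z, 256, activation=tf.nn.relu)
x = Bernoulli(logits=tf.layers.dense(h, 28 * 28))

# Amortized variational approximation q(z | x)
x_ph = tf.placeholder(tf.int32, [M, 28 * 28])
h_q = tf.layers.dense(tf.cast(x_ph, tf.float32), 256, activation=tf.nn.relu)
qz = Normal(loc=tf.layers.dense(h_q, d),
            scale=tf.nn.softplus(tf.layers.dense(h_q, d)))

inference = ed.KLqp({z: qz}, data={x: x_ph})
optimizer = tf.train.RMSPropOptimizer(0.01, epsilon=1.0)
# The only change: pass logdir so initialize() writes TensorBoard summaries.
inference.initialize(optimizer=optimizer, logdir="/tmp/log")
```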

When I check TensorBoard, I see very poor convergence of the loss function (please see the attached TensorBoard image for a complete run of the code).

Final printed lower bound: -log p(x) <= 176.285 [TensorBoard screenshot: ed_vae_tensorboard]
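
If I read the example right, that printed number is the negative ELBO averaged per image over an epoch, so it is an upper bound on -log p(x) per image. The training loop is roughly the following (again from memory; next_batch is just a stand-in for however the binarized MNIST minibatches are fed):

```python
sess = ed.get_session()
tf.global_variables_initializer().run()

n_epoch = 100
n_iter_per_epoch = 600  # e.g. 60000 MNIST images / M = 100 per batch

for epoch in range(n_epoch):
    avg_loss = 0.0
    for _ in range(n_iter_per_epoch):
        x_batch = next_batch(M)  # stand-in minibatch helper
        # update() returns a dict whose 'loss' entry is the negative ELBO
        # summed over the minibatch.
        info_dict = inference.update(feed_dict={x_ph: x_batch})
        avg_loss += info_dict['loss']

    # Average over updates and over the M images per batch.
    avg_loss = avg_loss / n_iter_per_epoch / M
    print("-log p(x) <= {:0.3f}".format(avg_loss))
```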

Am I missing something?

Thank you,

dustinvtran commented 6 years ago

Nope. You are correct. Happy to take a contribution changing the number of iterations to one where it stabilizes.

pauloabelha commented 6 years ago

OK. Will work on it.

pauloabelha commented 6 years ago

To those also thinking of working on this,

I had to stop the run after two days; I need a better computer. I got to: -log p(x) <= 207.419 at epoch 17768.

See the attached TensorBoard image. [TensorBoard screenshot: ed_vae_tensorboard_20k]

dustinvtran commented 6 years ago

It seems unnecessary to be super specific about the number of training iterations if it takes that long. Maybe try a larger batch size (e.g., 256) and a non-trivial latent dimension (e.g., 50)?
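
Concretely (using the sketch earlier in this thread, so the names are only suggestive), that would just mean bumping the two constants at the top; the shapes of z and qz follow from them, so nothing else should need to change:

```python
M = 256  # minibatch size, up from 100
d = 50   # latent dimension, up from 2
```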

pauloabelha commented 6 years ago

Yes, thank you. I’ll experiment. I just had to stop for a bit and decided to dump this result here for anyone interested in the issue.