poolio closed this issue 9 years ago
Hi Ben,
Thanks for reporting this. It seems that the default value for dataset was changed from binarized to real in https://github.com/casperkaae/parmesan/commit/5638eea86e2fcf0aa5c094c07808ee3aa68b4d19. real is supposed to Bernoulli-sample the dataset after each epoch, but it doesn't. I think that is a bug (https://github.com/casperkaae/parmesan/commit/657dd395d9a93f3cfd60c9d83416d5e04f470fd3#diff-6044ceb81b54b92878c23ddb97475be3 seems to have introduced it).
I think you can improve the results by changing dataset to 'binarized', which will resample the MNIST dataset after each epoch. Alternatively, you can sample the dataset once by calling bernoullisample before training. I think resampling after each epoch gains you a few nats.
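For reference, here is a minimal NumPy sketch of the two options being discussed, assuming bernoullisample simply treats the real-valued MNIST pixel intensities in [0, 1] as Bernoulli probabilities (the actual helper in the example code may differ in details):

```python
import numpy as np

def bernoullisample(x, rng=np.random):
    """Draw a binary sample per pixel, treating each real-valued
    intensity in [0, 1] as a Bernoulli probability."""
    return rng.binomial(1, x, size=x.shape).astype('float32')

# Stand-in for the real-valued MNIST training images.
x_train = np.random.rand(5, 784).astype('float32')

# Option 1: binarize once before training (a fixed binary dataset).
x_fixed = bernoullisample(x_train)

# Option 2: resample after each epoch (what dataset='binarized'
# is meant to do), so the model sees fresh binarizations.
for epoch in range(3):
    x_epoch = bernoullisample(x_train)
    # ... run the training updates for this epoch on x_epoch ...
```

Resampling each epoch acts as a mild regularizer, which is presumably why it is worth the "few nats" mentioned above.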
Casper, who wrote the example, is on holiday, but he'll fix the code when he returns later this week. Until then I hope changing dataset to binarized improves the results.
@casperkaae: The comment on line https://github.com/casperkaae/parmesan/blob/master/examples/iw_vae.py#L107 seems to be wrong? Also, using real does not resample the dataset? Is this also a problem in the other example files?
@poolio
Thanks for reporting this. As @skaae points out, a bug was introduced in the sampling procedure. Secondly, I also used learning-rate annealing to squeeze out the last few percent; however, the LL_5000 should be at around −86 to −85 after around 1000 epochs without annealing.
It should be fixed by #19, though I have not tested the performance yet. It would be much appreciated if you have time to run the example again and report the performance. I'll also test it as soon as possible.
I'll just run some tests to get it working and will report back when they are done.
I've updated the example code now in PR #20
After 650 epochs the LL_5000 is -86.93621, which is pretty close to the results on the front page.
python iw_vae.py -nonlin_enc rectify -nonlin_dec very_leaky_rectify -batch_size 250 -eval_epoch 50
output:
Epoch=650 Time=3.58 LR=0.00100 E_qsamples=1 IVAEsamples=1 TRAIN: Cost=-91.62148 logq(z|x)=-114.92097 logp(z)=-141.96342 logp(x|z)=-64.57903 EVAL L1:Cost=-91.03194 logq(z|x)=-114.77290 logp(z)=-141.64285 logp(x|z)=-64.16198 EVAL-L5000: Cost=-86.93621 logq(z|x)=-114.76830 logp(z)=-141.62617 logp(x|z)=-64.23051
Running the example command from the README:
python examples/iw_vae.py -eq_samples 1 -iw_samples 1 -lr 0.001 -nhidden 500 -nlatent 100 -nonlin_dec very_leaky_rectify -nonlin_enc rectify
yields substantially worse performance than the plot. Can anyone replicate those results using the current codebase? Or has something changed that would cause such a large performance drop?