I don't really want you to worry about changing the hyperparameters much at this point, but as a demo, this is the result of training with 5 latent dimensions, with `beta=0.1`, for 60 epochs. This takes about 50 minutes on my laptop. Final metrics: `loss: 26.0654 - reconstruction_loss: 24.7810 - kl_loss: 1.2835 - val_loss: 27.6137 - val_reconstruction_loss: 26.3616 - val_kl_loss: 1.2735`
This is the simplest way I could come up with for implementing a CVAE with different input and 'target' images.
I'm really not a fan of the current Keras example; it seems to add unnecessary complexity.
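For reference, here's a minimal sketch (not the actual code from this notebook) of the sort of approach I mean: subclass `keras.Model` and unpack separate input and target images in `train_step`. The encoder/decoder structure, the `beta` weighting, and the loss details are illustrative assumptions only.

```python
import tensorflow as tf
from tensorflow import keras

class CVAE(keras.Model):
    def __init__(self, encoder, decoder, beta=0.1, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder   # assumed to map an input image -> (z_mean, z_log_var, z)
        self.decoder = decoder   # assumed to map z -> a reconstructed image
        self.beta = beta

    def train_step(self, data):
        # The input and target images are allowed to differ.
        x_input, x_target = data
        with tf.GradientTape() as tape:
            z_mean, z_log_var, z = self.encoder(x_input)
            x_recon = self.decoder(z)
            # Per-image binary cross-entropy, assuming images of shape (batch, 28, 28, 1)
            recon_loss = tf.reduce_mean(
                tf.reduce_sum(
                    keras.losses.binary_crossentropy(x_target, x_recon),
                    axis=(1, 2),
                )
            )
            # Closed-form KL divergence between q(z|x) and a unit Gaussian
            kl_loss = tf.reduce_mean(
                tf.reduce_sum(
                    -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var)),
                    axis=1,
                )
            )
            total_loss = recon_loss + self.beta * kl_loss
        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        return {"loss": total_loss,
                "reconstruction_loss": recon_loss,
                "kl_loss": kl_loss}

# Hypothetical usage with paired input/target arrays:
# model = CVAE(encoder, decoder, beta=0.1)
# model.compile(optimizer="adam")
# model.fit(x_in, x_target, epochs=60, batch_size=128)
```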
It does OK, but note that a small CVAE isn't particularly well suited to producing good reconstructions of MNIST digits. You can get improvements by increasing the number of latent dimensions and/or lowering `beta` (which reduces the 'smoothness' enforced on the latent space). It is also important to let it train for long enough (until the validation loss stops improving). This can take a while, especially on a CPU.
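One easy way to train "until the validation loss stops improving" is Keras's built-in `EarlyStopping` callback; the patience value and the `fit` arguments below are only illustrative.

```python
from tensorflow import keras

# Stop once val_loss has not improved for `patience` epochs,
# and roll back to the best weights seen so far.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=10,
    restore_best_weights=True,
)

# Hypothetical usage with paired input/target images and a validation set:
# model.fit(x_in, x_target,
#           validation_data=(x_val_in, x_val_target),
#           epochs=200, batch_size=128, callbacks=[early_stop])
```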