Open nathanin opened 7 years ago
You are correct: `logsd` should be connected to `encode3neuron`, not `encode3`. It's a typo. I doubt it makes much difference in behavior (the VAE works just fine with this minor bug), and unfortunately it may be a while before I have time to check that fixing it doesn't break anything.

As for your difficulties in training, I suspect the switch from ReLU to sigmoid matters far more. Many people in the vision community attribute the success of AlexNet to its swap of the sigmoids used in earlier networks for ReLUs, because ReLUs do a better job of keeping gradients from vanishing. Preventing vanishing or exploding gradients is a concern in every deep net, and VAEs are no exception. Check your initialization, and consider using batch norm.
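To make the vanishing-gradient point concrete, here is a small numpy sketch (not part of the original model) comparing the local derivatives of sigmoid and ReLU. Sigmoid's derivative never exceeds 0.25, so a stack of n sigmoid layers can shrink gradients by roughly a factor of 0.25 per layer, while ReLU passes gradients through with derivative 1 on its active half:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = s * (1 - s), maximized at x = 0 with value 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # d/dx relu(x) is 1 where x > 0, else 0
    return (x > 0).astype(float)

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)

# Sigmoid caps the per-layer gradient scale at 0.25; ReLU's is
# exactly 1 for every active unit, so gradients don't shrink there.
print("max sigmoid grad:", sigmoid_grad(x).max())   # at most 0.25
print("mean ReLU grad:  ", relu_grad(x).mean())     # roughly half the units active
```

This is why initialization schemes and batch norm, which keep pre-activations away from sigmoid's flat saturated regions, help so much when you do use sigmoids.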
Hi, thanks for the great tutorial. I have trouble understanding the math. What is the reason to pass `encode3` to `logsd` before the nonlinearity is applied? Why not give `encode3neuron` to both `mu` and `logsd`? I would ask if it's a typo, but running the reference prototxt, I can make it converge.

I have combined the VAE layers with convolution and deconvolution layers, and am having trouble training MNIST with this new architecture. (Using Sigmoid neurons instead of ReLU, if that matters.)
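For readers following along, the wiring being asked about can be sketched in numpy (the real model is a Caffe prototxt; shapes and weight names here are invented for illustration). The question's suggestion is that both heads should read from the post-nonlinearity activations, followed by the standard reparameterization step:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, chosen only for the sketch.
batch, hidden, latent = 4, 8, 2

encode3 = rng.normal(size=(batch, hidden))   # pre-activation output of the third layer
encode3neuron = np.maximum(encode3, 0.0)     # after the ReLU nonlinearity

# Hypothetical weight matrices for the two heads.
W_mu = rng.normal(size=(hidden, latent))
W_sd = rng.normal(size=(hidden, latent))

# Both heads fed from the post-nonlinearity activations, as the
# question proposes (the reference prototxt instead feeds logsd
# from encode3 by mistake):
mu = encode3neuron @ W_mu
logsd = encode3neuron @ W_sd

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# so gradients can flow through mu and logsd.
eps = rng.normal(size=(batch, latent))
z = mu + np.exp(logsd) * eps
print(z.shape)  # (4, 2)
```

Either wiring trains because the `logsd` head can absorb the difference through its own weights; feeding both heads from `encode3neuron` is simply the conventional choice.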