imihassan opened this issue 3 years ago
I presume the running loss is then increasing during each epoch? That is, the optimizer is diverging rather than converging?
I noticed myself that training the AE was by no means guaranteed to work. One thing I did was to train it on a small subset of images first, in a sense deliberately overfitting the AE. The AE I obtained from that was later used as the starting configuration for the full dataset. The optimization then converged, since even the overfitted AE had locked onto some common features.
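A minimal sketch of that warm-start trick, assuming PyTorch; `TinyAE` and `overfit_on_subset` are hypothetical stand-ins here, not names from the actual repository:

```python
import torch
import torch.nn as nn

# Hypothetical tiny AE standing in for the real architecture.
class TinyAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(64, 16), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(16, 64))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def overfit_on_subset(model, subset, steps=200, lr=1e-3):
    """Deliberately overfit the AE on a small subset of images."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(subset), subset)
        loss.backward()
        opt.step()
    return model

# Stage 1: overfit on a handful of images (random tensors as stand-ins).
subset = torch.randn(8, 64)
pretrained = overfit_on_subset(TinyAE(), subset)

# Stage 2: the full-dataset run starts from the overfitted configuration.
full_model = TinyAE()
full_model.load_state_dict(pretrained.state_dict())
```

The point is only the two-stage structure: the overfitted weights replace the random initialization for the full-dataset run.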
Your observation is a bit odd, since you say the optimizer managed to produce decent parameters at first. You can try things like (1) further reducing the learning rate, (2) setting gradients to zero (or not), (3) trying different batch sizes, or (4) starting with a smaller dataset, as I did.
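For concreteness, tricks (1)-(3) amount to small changes in the training loop. A sketch assuming PyTorch, with an illustrative model and fake batches rather than the real AE and data loader:

```python
import torch

# Stand-ins for the real AE and data loader.
model = torch.nn.Sequential(
    torch.nn.Linear(32, 8), torch.nn.ReLU(), torch.nn.Linear(8, 32)
)
loader = [torch.randn(4, 32) for _ in range(5)]  # (3) vary this batch size

# (1) a reduced learning rate, e.g. 1e-4 instead of 1e-3
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.MSELoss()

for batch in loader:
    optimizer.zero_grad()  # (2) clear stale gradients every step
    loss = loss_fn(model(batch), batch)  # reconstruction loss
    loss.backward()
    optimizer.step()
```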
Optimization is an art, and I am not sure what will work for you. The AE model contains batch normalization layers; still, the gradients are evidently imperfect for the optimizer. The authors of the SegNet model (from which I adopted this particular AE architecture) do not report such issues, but who knows whether they ran into them as well?
Hopefully some of these tricks can help.
Did any of my comments help? The AE optimization is tricky since the starting parameters are random.
Hi, I used the ae_deep autoencoder to train on my custom dataset, i.e. a plant dataset (containing plant leaves, weeds, and soil/ground). At the start it performed well: after the first epoch, it generated something close to the original, though a little fuzzy. So I continued the training, but after the second epoch the results started getting worse, and by the 10th epoch the output was all gray images.
Can you suggest changes that could help me get a better reconstructed image? I am using a learning rate of 1e-3 and the Adam optimizer with a weight decay of 1e-5.
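The setup described above, with the learning rate lowered per the earlier suggestion, might look like this in PyTorch; `ae_model` is a hypothetical stand-in for the actual ae_deep autoencoder:

```python
import torch

# Stand-in for the ae_deep autoencoder.
ae_model = torch.nn.Sequential(torch.nn.Linear(16, 4), torch.nn.Linear(4, 16))

optimizer = torch.optim.Adam(
    ae_model.parameters(),
    lr=1e-4,            # lowered from the original 1e-3, a common first fix for divergence
    weight_decay=1e-5,  # unchanged from the setup described in the question
)
```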