Amir-Arsalan closed this issue 8 years ago
I recommend you check what criterion.sizeAverage does in the original Torch7 code. From that you can infer why the reconstructions are identical. Note that these GitHub issues are for technical problems, not for personal help.
@y0ast Sorry, maybe I did not phrase my question well. I know what sizeAverage does; what I wanted to know is why averaging the pixel-wise errors hinders learning.
Averaging scales the reconstruction term of the objective down massively, so the KLD term overwhelms the objective. The network then mainly optimizes the KLD, driving it to zero, and the reconstructions come out bad.
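A rough back-of-the-envelope sketch of this scale imbalance (the per-pixel BCE and KLD magnitudes below are illustrative assumptions, not measured values; 784 is the pixel count of a 28x28 MNIST image):

```python
# Hypothetical magnitudes to illustrate why sizeAverage = true
# lets the KLD term dominate the VAE objective.
num_pixels = 28 * 28          # 784 pixels per MNIST image
per_pixel_bce = 0.5           # assumed average per-pixel BCE (nats)
kld = 20.0                    # assumed KLD magnitude early in training

recon_sum = per_pixel_bce * num_pixels  # sizeAverage = false: summed BCE, ~392
recon_mean = per_pixel_bce              # sizeAverage = true:  averaged BCE, 0.5

# With summing, reconstruction dominates; with averaging, it is tiny
# relative to the KLD, so gradients mostly push the KLD toward zero.
print(recon_sum / kld)   # ratio with sizeAverage = false
print(recon_mean / kld)  # ratio with sizeAverage = true
```

So with averaging the reconstruction term is hundreds of times weaker, and the cheapest way to lower the total loss is to collapse the KLD to zero, which is exactly the behavior described below. A common workaround is to keep the summed reconstruction loss, or equivalently scale the KLD term down by the same factor.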
After setting criterion.sizeAverage = true, I noticed the KLD criterion consistently outputs ~0 after epoch 2-3, and the reconstructions are identical and do not make sense at all. I tried very small learning rates as well as bigger ones, and I still face the same issue. Why is that?