sun2009ban opened 5 years ago
I am having the same issue. Any fixes for this?
The model collapses very quickly with an exploding loss, and I want to figure out why.
It seems to have nothing to do with the initialization method.
I found it! The Adam parameter beta_2 should never be 0.01 :)
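For reference, here is a minimal sketch of that fix in PyTorch (the model is a hypothetical placeholder; only the betas argument matters):

import torch

# Hypothetical stand-in model; only the optimizer construction matters here.
model = torch.nn.Linear(784, 10)

# Problematic: beta_2 = 0.01 makes the second-moment estimate track only the
# last few squared gradients, so the effective step size swings wildly and
# the loss can explode.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.01))

# Stable: keep beta_2 at (or near) its default of 0.999 so the second-moment
# estimate is a long-horizon average.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))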
Still collapses after a few epochs :(
It seems to collapse to one mode after 2,000-3,000 iterations. Can anybody give a reason?
The norm of h, i.e., f(x), gets bigger and bigger as training evolves, so the torch.exp(h) term approaches infinity. I looked into the model parameters and found that their norm also increased a lot during training. One way to mitigate the issue is to use L1 regularization. It would also help to use F.softplus to make the logistic prior calculation numerically stable; see the sketch below.
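To illustrate, here is a minimal sketch of the softplus trick (the naive version is a hypothetical reconstruction of the overflowing computation; the stable version uses the identity log p(h) = -(softplus(h) + softplus(-h)) for a standard logistic prior):

import torch
import torch.nn.functional as F

def logistic_log_prob_naive(h):
    # Standard logistic log-density written out directly:
    #   log p(h) = -h - 2 * log(1 + exp(-h))
    # torch.exp(-h) overflows to inf once h is a large negative number,
    # which is exactly what happens when the norm of h = f(x) blows up.
    return -h - 2.0 * torch.log(1.0 + torch.exp(-h))

def logistic_log_prob_stable(h):
    # Algebraically identical form:
    #   log p(h) = -(softplus(h) + softplus(-h))
    # since softplus(h) + softplus(-h) = log(2 + exp(h) + exp(-h)).
    # F.softplus is evaluated in a numerically safe way, so this stays
    # finite even for very large |h|.
    return -(F.softplus(h) + F.softplus(-h))

h = torch.tensor([-100.0, 0.0, 100.0])
print(logistic_log_prob_naive(h))   # tensor([-inf, -1.3863, -100.0000])
print(logistic_log_prob_stable(h))  # tensor([-100.0000, -1.3863, -100.0000])

The L1 suggestion attacks the same symptom from the other side: a penalty such as lambda_l1 * sum(p.abs().sum() for p in model.parameters()) added to the loss keeps the parameter norms, and hence the norm of h, from growing without bound.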
Thank you for the code! When I ran it on MNIST like this:

python train.py --dataset mnist

I got the output

Validation Loss Statistics: min=nan, med=nan, mean=nan, max=nan

after training for several epochs. Please help me, I have no idea what the matter is.