paultsw / nice_pytorch

Nonlinear Independent Components Estimation (Dinh et al, 2014) in PyTorch.
BSD 3-Clause "New" or "Revised" License

Validation Loss Statistics: min=nan, med=nan, mean=nan, max=nan #2

Open sun2009ban opened 5 years ago

sun2009ban commented 5 years ago

Thank you for the code! When I ran the code on MNIST with `python train.py --dataset mnist`, I got the output `Validation Loss Statistics: min=nan, med=nan, mean=nan, max=nan` after training for several epochs.

Please help me; I have no idea what the problem is.

phongnhhn92 commented 5 years ago

I am having the same issue. Any fixes for this?

ranery commented 5 years ago

[screenshot] The model collapses very quickly with an exploding loss. I want to figure out why.

ranery commented 5 years ago

It seems to have nothing to do with the initialization method.

ranery commented 5 years ago

I found it! The Adam parameter `beta_2` should never be 0.01 :)
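For anyone hitting the same thing, a minimal sketch of the fix, assuming the optimizer is constructed directly with `torch.optim.Adam` (the model below is a placeholder, not this repo's network):

```python
import torch

model = torch.nn.Linear(784, 784)  # stand-in for the NICE network

# beta_2 is the decay rate of Adam's second-moment (squared-gradient)
# estimate; a value as small as 0.01 makes that estimate track only the
# last few gradients, so the adaptive step size becomes extremely noisy.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),  # PyTorch defaults; keep beta_2 close to 1
    eps=1e-8,
)
```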

ranery commented 5 years ago

> I found it! The Adam parameter `beta_2` should never be 0.01 :)

Still collapses after a few epochs :(

ranery commented 5 years ago

[screenshot] Seems like it collapses to one mode after 2,000–3,000 iterations. Can anybody give a reason?

leviszhang commented 4 years ago

> Seems like it collapses to one mode after 2,000–3,000 iterations. Can anybody give a reason?

The norm of h, i.e., f(x), gets bigger and bigger as training evolves, so the torch.exp(h) term approaches infinity. I looked into the model parameters and found that their norms also increased substantially during training. One way to mitigate the issue is to use L1 regularization. It would also help to use F.softplus to make the logistic prior calculation numerically stable.
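To make both suggestions concrete, here is a sketch; `model`, `x`, and `lambda_l1` are placeholders for illustration, not names from this repo, and the log-det-Jacobian term of the full NICE objective is omitted:

```python
import torch
import torch.nn.functional as F

def logistic_log_prior(h):
    # Stable log-density of a standard logistic prior, summed over dims:
    #   log p(h) = -h - 2*log(1 + exp(-h)) = -softplus(h) - softplus(-h)
    # This form never evaluates exp(h) directly, so it cannot overflow
    # to inf the way a naive torch.exp(h) computation does.
    return -(F.softplus(h) + F.softplus(-h)).sum(dim=1)

model = torch.nn.Linear(784, 784)   # stand-in for the NICE network f(x)
x = torch.randn(32, 784)            # fake batch of flattened MNIST digits
h = model(x)

# Negative log-likelihood under the logistic prior.
nll = -logistic_log_prior(h).mean()

# L1 penalty on the parameters, to keep ||f(x)|| from growing unboundedly.
lambda_l1 = 1e-5
l1 = lambda_l1 * sum(p.abs().sum() for p in model.parameters())
loss = nll + l1
loss.backward()
```

The softplus identity follows from softplus(h) - softplus(-h) = h, so the stable form is algebraically identical to the textbook logistic log-density while staying finite for large |h|.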