Open dhruvrajan opened 5 years ago
When training on MNIST for more than 14 epochs, there seems to be an underflow problem: NaNs appear once the loss drops to about 7E-3.
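
For reference, here is a minimal sketch of one common way an underflow can turn into NaNs in a cross-entropy loss. It is not taken from this project's code, and whether it is the actual cause here is only an assumption. The function names (`log_ieee`, `naive_cross_entropy`, `stable_cross_entropy`) are hypothetical and written in plain Python for illustration. The idea: once the model is very confident, the softmax probability of a wrong class can underflow to exactly 0, and `0 * log(0)` evaluates to `0 * -inf = NaN`.

```python
import math


def log_ieee(x):
    # math.log(0.0) raises in pure Python; array libraries usually return -inf.
    # This hypothetical helper mimics that IEEE-style behaviour.
    return -math.inf if x == 0.0 else math.log(x)


def naive_cross_entropy(logits, one_hot):
    # Softmax computed directly: exp() of a very negative logit underflows to 0.0.
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # 0.0 * -inf is NaN, so a single underflowed probability poisons the sum.
    return -sum(t * log_ieee(p) for t, p in zip(one_hot, probs))


def stable_cross_entropy(logits, one_hot):
    # Log-sum-exp trick: work with log-probabilities and never take log(0).
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum) for t, z in zip(one_hot, logits))


# A very confident prediction for class 0 (assumed values, for illustration).
confident_logits = [0.0, -800.0]
target = [1.0, 0.0]

print(math.isnan(naive_cross_entropy(confident_logits, target)))
print(stable_cross_entropy(confident_logits, target))
```

If the loss here is computed as `log(softmax(x))` in two separate steps, switching to a fused log-softmax (or clipping probabilities away from 0 before the log) would be a reasonable thing to try.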