Open zrx0311 opened 3 years ago
2 × 2080 Ti, batch size = 8. I've used the BN layer, but after a few epochs the loss still becomes NaN.
Maybe the batch size is too small.
Thanks! Now I have 4 Tesla GPUs. Would you get a better result if you didn't use BN?
I haven't tried, but BN may make training more stable or improve generalization.
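To illustrate the batch-size point, here is a rough sketch in plain Python (no deep learning framework). It assumes that batchsize=8 is the total across both GPUs and that BN statistics are computed per GPU, so each BN layer would see only 4 samples; neither detail is stated in this thread. The function name `batch_mean_spread` and the comparison batch size of 64 are hypothetical, chosen only for the illustration. The sketch measures how much the per-batch mean fluctuates when samples are drawn from a standard normal distribution.

```python
import random
import statistics


def batch_mean_spread(batch_size, num_batches=2000, seed=0):
    """Return the standard deviation of per-batch means for N(0, 1) samples.

    BN normalizes activations with the mean and variance of the current
    batch, so this spread is a rough proxy for how noisy those statistics are.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_batches):
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        means.append(statistics.fmean(batch))
    return statistics.pstdev(means)


# Assumption: 8 samples split over 2 GPUs gives 4 samples per BN layer.
spread_small = batch_mean_spread(4)
# Hypothetical larger batch, for comparison only.
spread_large = batch_mean_spread(64)

print(f"batch size 4:  spread of batch means = {spread_small:.3f}")
print(f"batch size 64: spread of batch means = {spread_large:.3f}")
```

The spread shrinks roughly as 1/sqrt(batch size), so the statistics BN relies on are much noisier at 4 samples than at 64. This only shows why a small batch can make BN less reliable; it does not confirm that this is what causes the NaN loss reported here.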