Dan-RAI opened 7 years ago
I think we need regularization. Current results on MNIST (with the diagonal E-unit layer, normalized data):

- 25 hidden units, 10 outputs, MSE loss, ~1000 iterations (batch size 1000): train 96% (using only the first 10k images for training), test 91%
- Same setup but with 50 hidden units: train 98% (again only 10k for training), test 92%
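If the train/test gap is mostly overfitting, the simplest thing to try is L2 weight decay on top of the MSE loss. A minimal NumPy sketch of what I mean (the function names, `lam`, and the list-of-weight-matrices layout are my assumptions, not our existing code):

```python
import numpy as np

def mse_with_l2(pred, target, weights, lam=1e-3):
    """MSE loss plus an L2 penalty on all weight matrices (weight decay)."""
    mse = np.mean((pred - target) ** 2)
    l2 = lam * sum(np.sum(W ** 2) for W in weights)
    return mse + l2

def sgd_step_l2(W, grad_W, lr=0.1, lam=1e-3):
    """Gradient step including the extra 2*lam*W term from the L2 penalty."""
    return W - lr * (grad_W + 2.0 * lam * W)
```

`lam` would need tuning (somewhere around 1e-4 to 1e-2 on a held-out split); dropout or simply training on more than the first 10k images would be alternatives.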