@drujensen
1) I would suggest `adam` instead of `sgdm`.
2) You can try mini-batch training, so the net trains on different chunks, which makes overfitting less likely.
3) Make sure you shuffle the train/test arrays, so that similar examples don't end up close together (see the sketch below).
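For point 3, a minimal sketch of what I mean, assuming your dataset is an `Array` of `[input, output]` pairs (the 80/20 split is just illustrative):

```crystal
# `data` is assumed to be an Array of [input, expected_output] pairs.
# Shuffle before splitting so similar examples (e.g. all survivors
# loaded in one block) don't end up clustered in train or test.
data = data.shuffle

# Illustrative 80/20 train/test split.
split      = (data.size * 0.8).to_i
train_data = data[0...split]
test_data  = data[split..-1]
```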
@ArtLinkov, any pointers?
@bararchy Not sure why, but `adam` always returns `NaN` as the result. Maybe I need to use `adam` with `train_batch`?
I am using `shuffle` and that seems to help. I am going to try the parallel `eraser` layer and see if that helps.

@ArtLinkov Is there an equivalent to `BatchNormalization`?
`adam` is best suited for batch training; we are working on the `NaN` issue.
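Roughly, with `train_batch`, that would look like this (a sketch based on the README of the time; parameter names like `mini_batch_size` may differ in your version):

```crystal
# Mini-batch training with adam -- a sketch; parameter names follow
# the SHAInet README and may differ between versions.
net.train_batch(
  data: train_data,          # shuffled [input, output] pairs
  training_type: :adam,      # instead of :sgdm
  cost_function: :mse,
  epochs: 2000,
  error_threshold: 0.00001,
  mini_batch_size: 50,
  log_each: 100)
```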
@drujensen do you still get the same problem? Sorry for not answering sooner, it's been a while since we worked on SHAInet (busy with the startup)
We can close this. When I get some time, I will jump back on this and perform more testing.
I have been playing around with the Titanic data and even submitted to Kaggle, but my results are not so good; I'm ranking quite low (9511).

I believe my models are overfitting. I'm using `sgdm`, and I have lowered the `learning_rate` and `momentum` to try to avoid this, but the error and MSE start to rise and don't come back down.
Is the `eraser` layer similar to a `Dropout` layer? If so, how do I use it?

Is there a way to create a `BatchNormalization` layer?