@drujensen
1) I would suggest `adam` instead of `sgdm`.
2) You can try mini-batch training, so the net trains on different chunks, which makes overfitting less likely.
3) Make sure you shuffle the train/test arrays, so that similar examples don't end up close together (see the sketch below).
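For point 3, a minimal sketch of what I mean, assuming your dataset is an `Array` of `[input, output]` pairs (the 80/20 split is just illustrative):

```crystal
# `data` is assumed to be an Array of [input, expected_output] pairs.
# Shuffle before splitting so similar examples (e.g. all survivors
# loaded in one block) don't end up clustered in train or test.
data = data.shuffle

# Illustrative 80/20 train/test split.
split      = (data.size * 0.8).to_i
train_data = data[0...split]
test_data  = data[split..-1]
```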
@ArtLinkov, any pointers?
@bararchy Not sure why, but `adam` always returns `NaN` as the result. Maybe I need to use `adam` with `train_batch`?
I am using `shuffle` and that seems to help. I am going to try the parallel `eraser` layer and see if that helps.

@ArtLinkov Is there an equivalent to `BatchNormalization`?
`adam` is best suited for batch training; we are working on the `NaN` issue.
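Roughly, with `train_batch`, that would look like this (a sketch based on the README of the time; parameter names like `mini_batch_size` may differ in your version):

```crystal
# Mini-batch training with adam -- a sketch; parameter names follow
# the SHAInet README and may differ between versions.
net.train_batch(
  data: train_data,          # shuffled [input, output] pairs
  training_type: :adam,      # instead of :sgdm
  cost_function: :mse,
  epochs: 2000,
  error_threshold: 0.00001,
  mini_batch_size: 50,
  log_each: 100)
```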
@drujensen do you still get the same problem? Sorry for not answering sooner, it's been a while since we worked on SHAInet (busy with the startup)
We can close this. When I get some time, I will jump back on this and perform more testing.
I have been playing around with the Titanic data and even submitted to Kaggle, but my results are not so good; I'm ranking quite low (9511).

I believe my models are overfitting. I'm using `sgdm`, and I have lowered the `learning_rate` and `momentum` to try to avoid this, but the error and MSE start to rise and don't come back down.
Is the `eraser` layer similar to a `Dropout` layer? If so, how do I use it?

Is there a way to create a `BatchNormalization` layer?