everthemore opened 3 years ago
So I tried both of these tests. In summary: the performance of the network is decent for numbers of hidden nodes down to 20, and a second hidden layer improved the network quite a lot without extra training time, because the early stopping criterion is reached sooner.
For testing the number of hidden nodes, I used EarlyStopping from Keras, monitoring val_loss with patience=50; running 1000 epochs typically did not make it stop early. These are plots of the loss as a function of the number of hidden nodes, and a typical run.
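For reference, a minimal sketch of that setup (the network shape, data, and variable names here are assumptions for illustration; only the EarlyStopping configuration is taken from the experiment):

```python
import numpy as np
from tensorflow import keras

# Hypothetical data: 10 input features, one regression target.
x_train = np.random.rand(1000, 10)
y_train = np.random.rand(1000, 1)

n_hidden = 20  # swept over a range of values, down to 20, in the experiment

model = keras.Sequential([
    keras.layers.Input(shape=(10,)),
    keras.layers.Dense(n_hidden, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop once val_loss has not improved for 50 epochs; with a single
# hidden layer, the full 1000 epochs typically ran to completion.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=50)
model.fit(
    x_train, y_train,
    validation_split=0.2,
    epochs=1000,
    callbacks=[early_stop],
)
```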
For testing a second hidden layer, I made the second hidden layer identical to the first and trained with the same early stopping and number of epochs. It stopped early in most cases, actually reducing the time spent training compared to a single hidden layer, while simultaneously improving performance on the test data. This is a typical run with 2 hidden layers.
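Continuing the sketch above, the two-hidden-layer variant might look like this (interpreting "identical to the first" as equal width and activation, which is my assumption):

```python
# Second hidden layer mirrors the first in width and activation.
model2 = keras.Sequential([
    keras.layers.Input(shape=(10,)),
    keras.layers.Dense(n_hidden, activation="relu"),
    keras.layers.Dense(n_hidden, activation="relu"),  # duplicated hidden layer
    keras.layers.Dense(1),
])
model2.compile(optimizer="adam", loss="mse")

# Same early-stopping setup; with two layers this tended to stop
# well before 1000 epochs, reducing wall-clock training time.
model2.fit(
    x_train, y_train,
    validation_split=0.2,
    epochs=1000,
    callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss", patience=50)],
)
```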
Great, so two layers with the same number of neurons achieve a lower loss, generalize better, and even train faster? :)
I think the improvement in training time is not that large, but going from 200s to 120s is still an improvement.
A simple feedforward network with a 10 -> 100 -> 1 structure already does a really good job, so we don't even have to try more complicated network types. But we should: