alinaselega opened this issue 9 years ago
Testing a model based on alexnet_based.yaml with a later epoch at which the learning rate (and momentum) saturate, and smaller scaling factors for the learning rate. Also switched the monitored channel to valid_y_nll.
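For context, this is roughly what the saturation settings look like in a pylearn2 YAML extensions block, assuming the schedule comes from pylearn2's LinearDecayOverEpoch and MomentumAdjustor extensions; the saturate and decay values below are illustrative, not the exact ones used here:

```yaml
extensions: [
    # Linearly anneal the learning rate, saturating at a later epoch
    # than in the original alexnet_based.yaml (values illustrative).
    !obj:pylearn2.training_algorithms.sgd.LinearDecayOverEpoch {
        start: 1,
        saturate: 100,
        decay_factor: 0.01
    },
    # Ramp the momentum up to its final value over the same window.
    !obj:pylearn2.training_algorithms.learning_rule.MomentumAdjustor {
        start: 1,
        saturate: 100,
        final_momentum: 0.95
    }
]
```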
valid_y_nll looked quite low after only a few epochs, but the model scored slightly worse on the holdout set than the original alexnet_based. After letting it run for longer, the score worsened further.
Now trying two models, both with saturation epoch = 100 (four times the original) and the original learning-rate scaling factors (0.9 and 1.1). One model still monitors valid_y_nll and the other goes back to monitoring valid_objective. Results pending.
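Assuming the 0.9/1.1 scaling is done by pylearn2's MonitorBasedLRAdjuster, the only difference between the two models would be the monitored channel, e.g.:

```yaml
# Shrinks the learning rate by shrink_amt when the monitored channel
# worsens and grows it by grow_amt when it improves.
!obj:pylearn2.training_algorithms.sgd.MonitorBasedLRAdjuster {
    channel_name: 'valid_y_nll',  # 'valid_objective' in the second model
    shrink_amt: 0.9,
    grow_amt: 1.1
}
```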
The model monitoring valid_objective did slightly better of the two. Currently running a variant of the current best model (alexnet with an extra convolutional layer and 8-factor augmentation) with the learning-rate and momentum saturation epoch set to 100.
The model didn't do better than the current best after 100 epochs.
Next step: make the learning rate decay more slowly, as it may currently be shrinking too quickly and getting too small.
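If the decay is pylearn2's LinearDecayOverEpoch, two ways to slow it down would be raising decay_factor (the fraction of the initial learning rate reached at saturation, if I'm reading the semantics right) or pushing saturate out further; values below are illustrative:

```yaml
!obj:pylearn2.training_algorithms.sgd.LinearDecayOverEpoch {
    start: 1,
    saturate: 200,      # decay over more epochs
    decay_factor: 0.1   # floor at a larger fraction of the initial rate
}
```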