Just a note that I don't think it makes sense to try to do a single massive hyperparameter optimisation search all at once. For one, AutoML approaches sometimes seem underwhelming compared to manual tweaking, so there's room to use some common sense to constrain the search space. It also makes for more efficient use of resources, since I can train for longer as I get progressively closer to a (near-enough) optimal design. Second, and more importantly, running one huge hyperparameter optimisation search will cause overfitting to the validation set (since it's being used to select the hyperparameters). Since data is cheap in my case, I should generate a new train/valid set after each step of the hyperparameter optimisation pipeline (see the sketch below).
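A minimal sketch of what I mean by the staged approach, assuming data really is cheap to generate: each stage draws a fresh train/valid split, so the validation set used to pick hyperparameters in one stage never carries over to the next. `generate_dataset` is a hypothetical stand-in for the real data-generation pipeline, and the stage contents are placeholders.

```python
import numpy as np

def generate_dataset(n_samples, seq_len=50, n_features=4, seed=None):
    """Hypothetical cheap data generator (placeholder for the real pipeline)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, seq_len, n_features))
    y = (X.mean(axis=(1, 2)) > 0).astype("int32")
    return X, y

def staged_search(stages, train_and_evaluate):
    """Run a sequence of progressively narrower searches.

    `stages` is a list of candidate-config lists (hand-constrained grids);
    each stage gets a fresh train/valid set so hyperparameter selection
    doesn't overfit a single validation split.
    """
    best_config, best_score = None, -np.inf
    for i, candidates in enumerate(stages):
        X_train, y_train = generate_dataset(20_000, seed=2 * i)
        X_val, y_val = generate_dataset(5_000, seed=2 * i + 1)
        for config in candidates:
            score = train_and_evaluate(config, (X_train, y_train), (X_val, y_val))
            if score > best_score:
                best_config, best_score = config, score
    return best_config, best_score
```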
I did run a few keras tuner searches. They generally didn't lead to much improvement, but they did get good results with higher learning rates than I had been trying. I didn't end up using any of the hyperparameter combinations it found (I was only testing cuDNN-compatible architectures in any case), but I did carry the higher learning rate over to the final model; combined with layer normalisation and recurrent dropout, this led to very stable training.
https://github.com/keras-team/keras-tuner
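For reference, a rough sketch of the kind of keras-tuner search described above, restricted to cuDNN-compatible LSTM settings (default activations, no recurrent dropout) and including larger learning rates in the search space. The layer sizes, trial count, input shape, and objective are illustrative assumptions, not the actual search I ran.

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Default LSTM settings keep the layer cuDNN-compatible.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(50, 4)),  # assumed sequence length / feature count
        tf.keras.layers.LSTM(hp.Int("units", 32, 256, step=32)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            # Include higher learning rates than the hand-tuned baseline.
            learning_rate=hp.Choice("learning_rate", [1e-4, 1e-3, 1e-2])
        ),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.RandomSearch(
    build_model,
    objective="val_accuracy",
    max_trials=20,
    directory="tuner_logs",
    project_name="lstm_search",
)

# X_train, y_train, X_val, y_val would come from a freshly generated dataset
# (see the staged-search sketch above):
# tuner.search(X_train, y_train, validation_data=(X_val, y_val), epochs=10)
```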