How to determine hidden layer size and learning rate?

ottokart / punctuator2

A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text

http://bark.phon.ioc.ee/punctuator

MIT License

How to determine hidden layer size and learning rate? #37

Open wltz opened 5 years ago

wltz commented 5 years ago

When running python main.py for the first stage of training (the language model), how should I choose the hidden layer size and learning rate? Thanks!

ottokart commented 5 years ago

A hidden layer size of 256 and a learning rate of 0.02 have worked fairly well for me in most cases, so you can start with those. To find better settings, you'll just have to experiment with different values and compare the results on the dev/validation set. For small datasets you might want to reduce the hidden layer size; for larger datasets a bigger model might be better (but also slower).
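The "experiment and compare on the validation set" advice can be sketched as a small grid search. This is only an illustration, not part of punctuator2: `grid_search` and `toy_train_and_validate` are hypothetical names, and the toy scoring function is a made-up placeholder so the snippet runs on its own. In practice you would replace it with something that trains a model with the given settings and returns its validation loss.

```python
from itertools import product


def grid_search(train_and_validate, hidden_sizes, learning_rates):
    """Evaluate every (hidden_size, learning_rate) pair.

    train_and_validate is any callable that takes (hidden_size, learning_rate)
    and returns a validation loss (lower is better).
    Returns a list of (val_loss, hidden_size, learning_rate), best first.
    """
    results = []
    for hidden_size, learning_rate in product(hidden_sizes, learning_rates):
        val_loss = train_and_validate(hidden_size, learning_rate)
        results.append((val_loss, hidden_size, learning_rate))
    results.sort()  # tuples sort by val_loss first
    return results


def toy_train_and_validate(hidden_size, learning_rate):
    # Hypothetical placeholder: NOT a real training run and its numbers mean
    # nothing. Swap in a function that trains your model and returns the
    # loss measured on your dev/validation set.
    return abs(hidden_size - 256) / 256 + abs(learning_rate - 0.02) / 0.02


if __name__ == "__main__":
    # Candidate values around the suggested starting point (256, 0.02).
    ranked = grid_search(toy_train_and_validate, [128, 256, 512], [0.01, 0.02, 0.05])
    for val_loss, hidden_size, learning_rate in ranked:
        print(f"hidden={hidden_size:4d}  lr={learning_rate:.2f}  val_loss={val_loss:.3f}")
```

Since each real training run is expensive, it probably makes sense to keep the grid small and centered on the starting values, then widen it only in the direction that improves the validation loss.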

wltz commented 5 years ago

Thanks ottokart!