domerin0 / rnn-speech

Character level speech recognizer using ctc loss with deep rnns in TensorFlow.
MIT License

train the model #34

Closed chenting0324 closed 7 years ago

chenting0324 commented 7 years ago

Hi, I want to know how long it takes to train the model in your pre-trained example. You said building the model takes 30 minutes when the train set is dev-clean and the test set is test-clean, right? If I don't have a GPU, how long would it take to build the model with the following train set:

- LibriSpeech's train-clean-100
- LibriSpeech's train-clean-360
- LibriSpeech's train-other-500
- Shtooka's eng-balm-emmanuel_flac.tar
- Shtooka's eng-balm-judith_flac.tar
- Shtooka's eng-balm-verbs_flac.tar
- Shtooka's eng-wcp-us_flac.tar
- Shtooka's eng-wims-mary_flac.tar
- TED-LIUM's release 2

AMairesse commented 7 years ago

Hi,

The current pre-trained model took approximately 22 days to train on a GeForce 960. Building the model is not that long; in my configuration at the time it took less than 1 minute. Training without a graphics card is not really usable: the difference between the CPU and a good GPU like the 960 is at least a one-to-ten ratio.

I'm currently training a new version on a 1080 Ti; it's been running for 9 days and is getting to the point where I get a 25% CER on the test set. I hope that I will be able to get under 19.5% like the current one.
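For context, the CER quoted above is a character error rate: the edit distance between the decoded transcript and the reference, normalized by the reference length. A minimal sketch of that computation (the function names here are illustrative, not taken from this repo):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction, reference):
    """Character error rate: edits needed, per reference character."""
    return levenshtein(prediction, reference) / len(reference)

# Two edits (two missing letters) over an 11-character reference:
print(cer("helo wrld", "hello world"))  # 2/11, about 0.18
```

A 25% CER therefore means roughly one character in four of the decoded output is wrong relative to the reference transcript.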

The size of the test set does not have an impact on training time (unless you use the current dev branch with size_ordering set to True, since the scan of the training files for size ordering is not parallelized and can be time-consuming). There are two parameters which make the model long to build (and train), those are :
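To illustrate why that size-ordering scan can be slow on a large corpus: it has to stat every training file sequentially before sorting. A hedged sketch of the idea (the helper name is illustrative, not the repo's actual code):

```python
import os

def order_files_by_size(file_list):
    """Sort training files by on-disk size so similarly sized
    utterances end up batched together.

    os.path.getsize issues one stat call per file; done serially
    over hundreds of thousands of audio files, this is where the
    time goes when the scan is not parallelized.
    """
    return sorted(file_list, key=os.path.getsize)
```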