domerin0 / rnn-speech

Character level speech recognizer using ctc loss with deep rnns in TensorFlow.
MIT License

train the model #34

Closed chenting0324 closed 7 years ago

chenting0324 commented 7 years ago

Hi, I want to know how long it takes to train the model in your pre-trained example. You said building the model takes 30 minutes when the train set is dev-clean and the test set is test-clean, right? If I don't have a GPU, how long would it take to build the model with the following train set:

- LibriSpeech's train-clean-100
- LibriSpeech's train-clean-360
- LibriSpeech's train-other-500
- Shtooka's eng-balm-emmanuel_flac.tar
- Shtooka's eng-balm-judith_flac.tar
- Shtooka's eng-balm-verbs_flac.tar
- Shtooka's eng-wcp-us_flac.tar
- Shtooka's eng-wims-mary_flac.tar
- TED-LIUM's release 2

AMairesse commented 7 years ago

Hi,

The current pre-trained model took approximately 22 days to train on a GeForce 960. Building the model is not that long; in my configuration at the time it took less than 1 minute. Training without a graphics card is not really usable: the difference between the CPU and a good GPU like the 960 is at least a one-to-ten ratio.

I'm currently training a new version on a 1080 Ti; it's been running for 9 days and is getting to the point where I get a 25% CER on the test set. I hope that I will be able to get under 19.5% like the current one.
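For context, the CER quoted above is a character error rate: the edit distance between the decoded transcript and the reference, normalized by the reference length. A minimal sketch of that computation (the function names here are illustrative, not taken from this repo):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction, reference):
    """Character error rate: edits needed, per reference character."""
    return levenshtein(prediction, reference) / len(reference)

# Two edits (two missing letters) over an 11-character reference:
print(cer("helo wrld", "hello world"))  # 2/11, about 0.18
```

A 25% CER therefore means roughly one character in four of the decoded output is wrong relative to the reference transcript.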

The size of the test set does not have an impact on training time (unless you use the current dev branch with size_ordering set to True, since the scan of the training files for size ordering is not parallelized and can be time-consuming). There are two parameters which make the model long to build (and train), those are :
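To illustrate why that size-ordering scan can be slow on a large corpus: it has to stat every training file sequentially before sorting. A hedged sketch of the idea (the helper name is illustrative, not the repo's actual code):

```python
import os

def order_files_by_size(file_list):
    """Sort training files by on-disk size so similarly sized
    utterances end up batched together.

    os.path.getsize issues one stat call per file; done serially
    over hundreds of thousands of audio files, this is where the
    time goes when the scan is not parallelized.
    """
    return sorted(file_list, key=os.path.getsize)
```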