Open farzaneHajipour opened 7 years ago
Very likely, yes. The cost of evaluating the loss function scales roughly quadratically with the length of the sequence. You could try shortening the utterances, or switch to a lower frame rate (a longer frame shift), as we do in the "3x" recipes: we extract features every 10 ms, but from each original utterance (10 ms frame shift) we create three new utterances with a 30 ms frame shift, one offset by 0 ms, one by 10 ms, and one by 20 ms. This is a form of data augmentation and makes training much faster.
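To make the shifting idea concrete, here is a minimal sketch in plain Python. It is not the code used in the "3x" recipes, which may also do other things to the features. The function name `subsample_with_shifts` and the toy feature values are made up for illustration. It assumes an utterance is just a list of per-frame feature vectors extracted every 10 ms.

```python
def subsample_with_shifts(frames, factor=3):
    """Split one utterance into `factor` shifted, subsampled copies.

    `frames` is a list of per-frame feature vectors (assumed 10 ms apart).
    Copy k keeps frames k, k + factor, k + 2 * factor, ..., so with
    factor=3 each copy has a 30 ms frame shift and starts at 0, 10, or 20 ms.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [frames[offset::factor] for offset in range(factor)]


# Toy utterance: 10 frames of 2-dimensional features (values are made up).
utterance = [[float(t), float(t) * 0.5] for t in range(10)]
copies = subsample_with_shifts(utterance, factor=3)

for offset, copy in enumerate(copies):
    print(f"shift {offset * 10} ms: {len(copy)} frames")
```

Each copy is about a third as long as the original. If the cost really grows quadratically with length, one copy costs roughly a ninth of the original utterance, so even processing all three copies should come to about a third of the original cost.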
Hi, I'm using Eesen to train on my own data. The model trains successfully, but too slowly. Could the reason for the low speed be the mean utterance duration, which is 1.5 min? Thank you!