srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

Added a frames-per-sec option to train-ctc-parallel #101

Closed. bmilde closed this 7 years ago

bmilde commented 7 years ago

This makes the corpus size estimates (in hours) in the log correct when a setup uses a frame rate other than 100 frames per second.

fmetze commented 7 years ago

Great, thanks. I will look at this ASAP. For now it is not much of a concern for us: when we train with, e.g., a 30 ms frame shift, we also do 3-fold over-sampling (with offsets of 0, 10, and 20 milliseconds), so the total duration reported after all the data has been processed is already correct without this patch.
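To make the arithmetic behind this comment concrete, here is a minimal sketch. It assumes the log estimate is simply the frame count divided by the frame rate; the function name `estimated_hours` and the frame counts are hypothetical and are not taken from the Eesen code.

```python
def estimated_hours(num_frames, frames_per_sec=100.0):
    """Convert a frame count to hours (assumed formula, not Eesen's code)."""
    return num_frames / frames_per_sec / 3600.0

# One hour of audio at a 10 ms frame shift (100 frames per second).
base_frames = 360_000

# A 30 ms frame shift keeps every third frame of each utterance.
frames_per_copy = base_frames // 3

# 3-fold over-sampling (offsets 0, 10, 20 ms) yields three such copies,
# so the total frame count equals the original 10 ms count.
oversampled_frames = 3 * frames_per_copy

hours_oversampled = estimated_hours(oversampled_frames)      # default rate of 100
hours_single_copy = estimated_hours(frames_per_copy)         # default rate of 100
hours_single_fixed = estimated_hours(frames_per_copy, 1000.0 / 30.0)
```

With over-sampling, dividing by 100 still gives one hour. With a single copy, the default rate under-reports the corpus size by a factor of three unless the actual frame rate is passed in.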

bmilde commented 7 years ago

Only a minor addition: the reason I added it is related to my setup for the 30 ms frame shift. I've randomized the offsets, so that the overall amount of training data stays the same, but the model still learns from different offsets. Without oversampling, though, this results in a frame rate of 33.3 frames per second.
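The randomized-offset setup described here could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this pull request: the helper names `subsample_with_random_offset` and `effective_frames_per_sec` are hypothetical, and features are assumed to be a per-utterance list of frames computed at a 10 ms shift.

```python
import random

def subsample_with_random_offset(frames, factor=3, rng=random):
    """Keep every `factor`-th frame, starting at a random offset in [0, factor)."""
    offset = rng.randrange(factor)
    return frames[offset::factor]

def effective_frames_per_sec(base_shift_ms=10.0, factor=3):
    """Frame rate after subsampling, e.g. 10 ms * 3 = 30 ms shift."""
    return 1000.0 / (base_shift_ms * factor)

# Hypothetical utterance: 300 frames at a 10 ms shift (3 seconds of audio).
utterance = list(range(300))
subsampled = subsample_with_random_offset(utterance, factor=3, rng=random.Random(0))

fps = effective_frames_per_sec()       # about 33.3
seconds = len(subsampled) / fps        # about 3.0
```

Each utterance is seen once per pass, so the amount of training data is unchanged, while the offset varies between draws. Because only one copy is kept, the duration is only reported correctly when the frame count is divided by the effective rate of about 33.3 rather than 100.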

riebling commented 7 years ago

Confident these changes won't break anything; testing is in progress.