mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

TEDLIUM2 cannot converge in training #1509

Closed. meixitu closed this issue 6 years ago

meixitu commented 6 years ago

I downloaded the code from master this week and tried both the Common Voice dataset and the TEDLIUM2 dataset. Neither of them converges.

I used run-ted.sh to train on the TED data. This is the final command: [image: final command]

This is the log. The loss does not seem to be decreasing.

[image: training log]

reuben commented 6 years ago

The bin/run-* scripts were tested at some point in the far past. Since then the architecture has gone through major changes and it's quite likely that the hyperparameters no longer make sense. I don't think we'll be training on individual datasets just for the sake of maintaining those files, so maybe we should remove them to avoid the misdirection, and document only the hyperparameters we actually test (e.g. for the release models). @kdavis-mozilla @lissyx what do you think?

meixitu commented 6 years ago

I think I made a mistake. Training on the TED dataset did converge after 10 epochs. Because WER, src, and res are not shown during training, I assumed it had not converged, since the loss was not decreasing significantly. [image]
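For anyone hitting the same confusion: a loss that moves slowly does not by itself say whether the transcripts are improving, so it can help to compute WER directly on a few (src, res) pairs. Below is a minimal sketch of word error rate as word-level edit distance divided by the number of reference words. It assumes src is the reference transcript and res is the decoded output; the function name `word_error_rate` is hypothetical, and this is not DeepSpeech's own implementation.

```python
def word_error_rate(src: str, res: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    src: reference transcript (assumed meaning of "src" in the report)
    res: decoded transcript (assumed meaning of "res" in the report)
    """
    ref = src.split()
    hyp = res.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences for illustration, not taken from TEDLIUM2.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Tracking this number on a held-out set every few epochs gives a more direct convergence signal than the raw loss value alone.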
