NervanaSystems / deepspeech

DeepSpeech neon implementation
Apache License 2.0

Validation result for a single file is not stable #46

Open duongquangduc opened 7 years ago

duongquangduc commented 7 years ago

I trained on the 960-hour LibriSpeech data set for a week and got a WER of 38% and a CER of 12% on the dev-clean validation set. My issue is that when I evaluated a single audio file picked randomly from the validation set, the output is quite different from the result obtained when evaluating the whole set.

For example, the reference transcript of the audio file 84-121123-0007.flac is 'WHAT DO YOU MEAN SIR'. When I evaluate that one sample on its own, the predicted transcript is 'AP WHA E MA EMSIR ', with a CER of 70% and a WER of 100%, while the prediction for the same file during the full dev-clean evaluation is 'WHAT DOYOU MEAN SIR'.
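For reference, the metrics above can be reproduced by hand. A minimal sketch (not the repo's actual scoring code): WER and CER are the Levenshtein edit distance over words or characters respectively, divided by the length of the reference:

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def wer(ref, hyp):
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = ref.split()
    return levenshtein(ref_words, hyp.split()) / len(ref_words)

def cer(ref, hyp):
    """Character error rate: char-level edit distance / reference length."""
    return levenshtein(ref, hyp) / len(ref)
```

On the sample above, `wer('WHAT DO YOU MEAN SIR', 'AP WHA E MA EMSIR')` is 1.0, since none of the five hypothesis words match a reference word, which agrees with the 100% WER reported.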

Could you please suggest what might be wrong?

Neuroschemata commented 7 years ago

Please post the exact command you used to train the model so that we can help diagnose.

duongquangduc commented 7 years ago

@Neuroschemata, this is the command I used to train the model:

```
python train.py --manifest train:/root/deepspeech/librispeech/train-clean-100/1000_hour_manifest.csv --manifest val:/root/deepspeech/librispeech/train-clean-100/val-manifest.csv -e7 -z32 -s /deepspeech/speech/model_ds2.pkl --model_file /deepspeech/speech/model_ds2.pkl
```
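As a side note, one way to re-run the evaluation pipeline on a single utterance is to cut a one-line manifest out of the existing validation manifest. The helper below is hypothetical (not part of the repo) and only assumes the manifest is a plain text file whose rows reference audio files by path; the exact column layout may differ in your setup:

```python
def single_file_manifest(manifest_path, audio_id, out_path):
    """Copy only the manifest rows mentioning audio_id into a new manifest,
    so a single utterance can be evaluated with the same pipeline."""
    with open(manifest_path) as src, open(out_path, "w") as dst:
        for line in src:
            if audio_id in line:   # e.g. "84-121123-0007"
                dst.write(line)
```

The resulting file can then be passed as the `val:` manifest in place of the full one.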

tyler-nervana commented 7 years ago

This issue is likely related to a bug in model serialization: https://github.com/NervanaSystems/neon/issues/359. We are working on a fix and will give you instructions on how to update when we get one out. Thanks for catching it!
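If a serialization bug is suspected, a generic sanity check is to save the model's weight arrays, reload them, and verify they round-trip unchanged. This sketch uses plain pickle and NumPy for illustration; it is not neon's own serialization path, but a mismatch here versus a mismatch in the model's output helps isolate where the corruption happens:

```python
import pickle
import numpy as np

def roundtrip_ok(weights, path):
    """Serialize a dict of weight arrays to disk, reload it, and check
    that every array survives bit-for-bit."""
    with open(path, "wb") as f:
        pickle.dump(weights, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)
    return (set(weights) == set(restored) and
            all(np.array_equal(weights[k], restored[k]) for k in weights))
```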

duongquangduc commented 6 years ago

Hi @tyler-nervana, is there any update on this issue? By the way, will I need to retrain the model once the fix is released? Thanks!