mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

v0.6.0 TFLite regresses in quality? #2612

Closed lissyx closed 4 years ago

lissyx commented 4 years ago

Not sure exactly why, but several people seem to have reported it. https://discourse.mozilla.org/t/android-project-with-pb-files-instead-of-tflite/50550/17

Checking myself for the IoT demo: there is definitely something going on here ...

$ ~/tmp/deepspeech/0.6.0/tfpb/deepspeech --model ~/tmp/deepspeech/0.6.0/eng/deepspeech-0.6.0-models/output_graph.pbmm --lm limited_lm.binary --trie limited_lm.trie --audio deepspeech_dump_all.wav -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
2019-12-19 16:08:41.433573: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
turn the bedroom lamp light on
cpu_time_overall=7.25584
$ ~/tmp/deepspeech/0.6.0/tflite/deepspeech --model ~/tmp/deepspeech/0.6.0/eng/deepspeech-0.6.0-models/output_graph.tflite --lm limited_lm.binary --trie limited_lm.trie --audio deepspeech_dump_all.wav -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
on the bedroom light on
cpu_time_overall=12.07772

And without LM:

$ ~/tmp/deepspeech/0.6.0/tflite/deepspeech --model ~/tmp/deepspeech/0.6.0/eng/deepspeech-0.6.0-models/output_graph.tflite --audio deepspeech_dump_all.wav -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
no the bederorond lighte pon
cpu_time_overall=12.68055
$ ~/tmp/deepspeech/0.6.0/tfpb/deepspeech --model ~/tmp/deepspeech/0.6.0/eng/deepspeech-0.6.0-models/output_graph.pbmm --audio deepspeech_dump_all.wav -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
2019-12-19 16:11:19.134152: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
 ter de bedroom along light on
cpu_time_overall=8.49712

And re-using v0.6.0-alpha.15, which is 0.5.1 re-exported:

alex@portable-alex:~/codaz/Mozilla/DeepSpeech/moziot-deepspeech$ ~/tmp/deepspeech/0.6.0/tflite/deepspeech --model ~/tmp/deepspeech/0.6.0a15/en-us/output_graph.tflite --lm limited_lm.binary --trie limited_lm.trie --audio deepspeech_dump_all.wav  -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
the the bedroom lamp light on
cpu_time_overall=12.38057
alex@portable-alex:~/codaz/Mozilla/DeepSpeech/moziot-deepspeech$ ~/tmp/deepspeech/0.6.0/tflite/deepspeech --model ~/tmp/deepspeech/0.6.0a15/en-us/output_graph.tflite --audio deepspeech_dump_all.wav  -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
derm the bed room lont light on
cpu_time_overall=12.50523

Audio was recorded from Firefox Nightly, on RPi4 via WebSocket.
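To put rough numbers on the transcripts above, a word-level edit distance can be computed for each run. This is only a sketch and not part of DeepSpeech: `word_edit_distance` and `word_error_rate` are hypothetical helpers, and treating the `.pbmm` + LM output as the reference is an assumption, since the sentence that was actually spoken is not stated in this thread.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance over whitespace-separated words (all costs 1)."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))  # distance from empty reference prefix
    for i, ref_word in enumerate(r, start=1):
        cur = [i]  # deleting i reference words to reach an empty hypothesis
        for j, hyp_word in enumerate(h, start=1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = cur
    return prev[-1]


def word_error_rate(ref, hyp):
    """Edit distance normalised by the number of reference words."""
    return word_edit_distance(ref, hyp) / len(ref.split())


# ASSUMPTION: the .pbmm + LM transcript is used as the reference.
REFERENCE = "turn the bedroom lamp light on"

# Transcripts copied from the runs above.
TRANSCRIPTS = {
    "0.6.0 pbmm, LM": "turn the bedroom lamp light on",
    "0.6.0 tflite, LM": "on the bedroom light on",
    "0.6.0 tflite, no LM": "no the bederorond lighte pon",
    "0.6.0 pbmm, no LM": "ter de bedroom along light on",
    "0.6.0a15 tflite, LM": "the the bedroom lamp light on",
    "0.6.0a15 tflite, no LM": "derm the bed room lont light on",
}

for name, hyp in TRANSCRIPTS.items():
    print(f"{name}: WER={word_error_rate(REFERENCE, hyp):.2f}")
```

With a single utterance this is anecdotal at best; it only makes the comparison between the model and binary combinations easier to read.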

lissyx commented 4 years ago

Re-exporting the model with different TFLite conversion parameters does not change the result.

lissyx commented 4 years ago

Running the 0.6.0a15 model with 0.6.0 binaries yields "good" results, as shown above. Running the 0.6.0 model with 0.6.0a15 binaries yields "bad" results. I think this suggests that something in the model changed and that the TFLite export is affected by it :/.

lissyx commented 4 years ago

I'm wondering whether (and if so, why) we would need adjustments to the LM weights for TFLite:

$ ~/tmp/deepspeech/0.6.0/tflite/deepspeech --model ~/tmp/deepspeech/0.6.0/eng/en-us/output_graph.tflite --lm limited_lm.binary --trie limited_lm.trie --audio deepspeech_dump_all.wav --lm_alpha 2.0 --lm_beta 1.0 -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
turn the bedroom light on
cpu_time_overall=11.44250
$ ~/tmp/deepspeech/0.6.0/tflite/deepspeech --model ~/tmp/deepspeech/0.6.0/eng/en-us/output_graph.tflite --lm limited_lm.binary --trie limited_lm.trie --audio deepspeech_dump_all.wav -t
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.0-0-g6d43e21
INFO: Initialized TensorFlow Lite runtime.
on the bedroom light on
cpu_time_overall=11.42704
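One way to explore the LM-weight question is to sweep `--lm_alpha` and `--lm_beta` over a small grid with the same TFLite binary and model. The sketch below only builds the command lines. The flags are the ones used in the commands above; the `build_command` helper, the grid values, and the relative paths are assumptions chosen for illustration.

```python
import itertools
import shlex


def build_command(binary, model, audio, lm, trie, lm_alpha, lm_beta):
    """Build one client invocation using the flags shown in this thread."""
    return [
        binary,
        "--model", model,
        "--lm", lm,
        "--trie", trie,
        "--audio", audio,
        "--lm_alpha", str(lm_alpha),
        "--lm_beta", str(lm_beta),
        "-t",
    ]


# ASSUMPTION: arbitrary grid values, not tuned or recommended settings.
ALPHAS = [0.5, 1.0, 1.5, 2.0]
BETAS = [1.0, 2.0]

# ASSUMPTION: placeholder paths; point these at the real binary and files.
COMMANDS = [
    build_command(
        binary="./deepspeech",
        model="output_graph.tflite",
        audio="deepspeech_dump_all.wav",
        lm="limited_lm.binary",
        trie="limited_lm.trie",
        lm_alpha=alpha,
        lm_beta=beta,
    )
    for alpha, beta in itertools.product(ALPHAS, BETAS)
]

for cmd in COMMANDS:
    # Print a shell-safe version of each command for review before running.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to collect the transcript from stdout, so the outputs of the TFLite and `.pbmm` models can be compared for every weight pair.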

lissyx commented 4 years ago

As @reuben suggested on IRC, this might be a side effect of CUDNN and of incomplete testing of the TFLite model on my side.

lissyx commented 4 years ago

Until a new release is out, re-exporting the model from the checkpoint with the patch from #2613 should work.
