biemster / gasr

Google Chrome SODA Offline Speech Recognition command line client
https://hackaday.io/project/164399-android-offline-speech-recognition-natively-on-pc

When is "isFinal" in resultHandler true? #2

Closed goddade closed 3 years ago

goddade commented 3 years ago

`void resultHandler(const char *text, const bool isFinal, void *instance)` Why is the "isFinal" I get always false?

biemster commented 3 years ago

The isFinal bit is set by the model you are using, and depends on the utterance you're producing. I noticed that the older models indeed set this boolean less frequently, and non-native English speakers will also find that the model has difficulty predicting the end of an utterance, since that prediction depends on word order. So a fix would be to take one of the latest gboard models, and feed it utterances in the language the model is trained for.
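To illustrate how a handler can cope with a model that rarely sets isFinal, here is a minimal sketch. It assumes the callback has the C signature quoted in the question and mirrors it with ctypes. The `Transcript` class and its `flush` method are hypothetical helpers for this example, not part of gasr or the SODA library, and the callbacks are simulated rather than coming from a real model.

```python
import ctypes

# Assumption: mirrors the C signature quoted in the question,
#   void resultHandler(const char *text, const bool isFinal, void *instance)
RESULT_HANDLER = ctypes.CFUNCTYPE(
    None, ctypes.c_char_p, ctypes.c_bool, ctypes.c_void_p
)


class Transcript:
    """Hypothetical helper: keeps final utterances and the latest partial one."""

    def __init__(self):
        self.final = []
        self.partial = ""

    def on_result(self, text, is_final, instance):
        decoded = text.decode("utf-8", errors="replace") if text else ""
        if is_final:
            # The model signalled the end of an utterance.
            self.final.append(decoded)
            self.partial = ""
        else:
            # Partial hypotheses replace each other until a final one arrives.
            self.partial = decoded

    def flush(self):
        """Promote a dangling partial result if isFinal never arrived."""
        if self.partial:
            self.final.append(self.partial)
            self.partial = ""
        return " ".join(self.final)


transcript = Transcript()
# Keep a reference to the wrapped callback so it is not garbage collected.
handler = RESULT_HANDLER(transcript.on_result)

# Simulated callbacks standing in for what a recognizer would deliver.
handler(b"hello", False, None)
handler(b"hello world", True, None)
handler(b"still talking", False, None)
print(transcript.flush())
```

Calling `flush` at the end of the audio stream means the last hypothesis is not lost when the model never reports isFinal.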

goddade commented 3 years ago

Does "gboard models" mean downloading the models through the Android gboard app? I got the downloaded model from the Google quick search box data directory, and found that its file structure is quite different from SODAModels. I use gtts as input, does this matter?

biemster commented 3 years ago

gtts gives very good and clean results for me, so that's a good test case indeed. By gboard models I do mean the models gboard uses. They work with the soda library without any modification, even though the file structure is different. Just point gasr to the model directory. The gboard apk has a link to a superpacks manifest json in it; I mention this on the project page. I'm not sure about the quick search box models, just try and if it works it works :+1:

goddade commented 3 years ago

The version of gboard I use is 10.1.04.342850159-release-arm64-v8a, which has no offline options for voice input. I updated the model to version 1000 and it works, but "isFinal" is still false. The new model is very unstable and often crashes with "Segmentation fault".

biemster commented 3 years ago

The gboard version is not what is important here (and how did you update to "version 1000", and what exactly did you update?). What's the date string in the link to the superpacks manifest json?

goddade commented 3 years ago

The version and size will be displayed when downloading the offline model in the voice input method. Version 1000 is the latest version of en_us. The result of grep -r superpacks_manifest.json:

`
smali/cjo.smali: const-string v2, "https://www.gstatic.com/android/keyboard/modelpack/expressive_concepts/2020031023/superpacks_manifest.json"
smali/cjo.smali: const-string v2, "https://www.gstatic.com/android/keyboard/modelpack/expressive_concepts_triggering/2020032611/superpacks_manifest.json"
smali/cjo.smali: const-string v2, "https://www.gstatic.com/android/keyboard/modelpack/transformer_concept/2020032617/superpacks_manifest.json"
smali/cjo.smali: const-string v1, "https://www.gstatic.com/android/keyboard/modelpack/lite_emoji_predictor/2020091814/superpacks_manifest.json"
smali/cpk.smali: const-string v2, "https://www.gstatic.com/android/keyboard/langid/20191018/superpacks_manifest.json"
smali/ddm.smali: const-string v1, "https://www.gstatic.com/android/keyboard/modelpack/contentcache/202010191648/superpacks_manifest.json"
smali/gdq.smali: const-string v1, "https://www.gstatic.com/android/keyboard/modelpack/federatedc2q/superpacks_manifest.json"
smali/hhr.smali: const-string v1, "https://www.gstatic.com/android/keyboard/modelpack/theme_indices/201903111437/superpacks_manifest.json"
`
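Since the question was about the date string in each manifest link, here is a small sketch that pulls it out of grep output like the above. The `manifest_dates` function and the regular expression are illustrative assumptions based only on the URL shapes shown in this thread. None of these URLs is the recognizer manifest.

```python
import re

# Assumption: URL layout inferred from the grep results in this thread.
# The date segment is optional (the federatedc2q link has none).
MANIFEST_URL = re.compile(
    r"https://www\.gstatic\.com/android/keyboard/"
    r"(?P<pack>.+?)(?:/(?P<date>\d{8,12}))?/superpacks_manifest\.json"
)


def manifest_dates(grep_output):
    """Map each pack path found in the grep output to its date string or None."""
    return {
        match.group("pack"): match.group("date")
        for match in MANIFEST_URL.finditer(grep_output)
    }


# Two lines copied from the grep results above.
sample = (
    'smali/cpk.smali: const-string v2, "https://www.gstatic.com/android/'
    'keyboard/langid/20191018/superpacks_manifest.json"\n'
    'smali/gdq.smali: const-string v1, "https://www.gstatic.com/android/'
    'keyboard/modelpack/federatedc2q/superpacks_manifest.json"\n'
)

for pack, date in manifest_dates(sample).items():
    print(pack, date)
```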

biemster commented 3 years ago

Try grepping for superpacks-manifest (mind the dash instead of the underscore, and don't append .json). Once you have found the recognizer manifest, replace the date string in its link to find newer ones. I'm closing this issue now since it's not about the result handler anymore, feel free to continue discussing on the HaD project page.
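The date-replacement step could be sketched roughly as follows. This assumes the link contains an eight-digit YYYYMMDD path segment, as the langid link earlier in the thread does. The recognizer manifest link itself is not shown in this thread, so its exact format is an assumption, and `candidate_urls` is a hypothetical helper. The sketch only builds candidate links; checking which of them exist is left out.

```python
import re
from datetime import datetime, timedelta

# Assumption: the date is an eight-digit YYYYMMDD path segment.
DATE_SEGMENT = re.compile(r"/(\d{8})/")


def candidate_urls(url, days_ahead):
    """Return copies of url with the date segment moved 1..days_ahead days on."""
    match = DATE_SEGMENT.search(url)
    if match is None:
        raise ValueError("no YYYYMMDD path segment found in URL")
    start = datetime.strptime(match.group(1), "%Y%m%d")
    candidates = []
    for offset in range(1, days_ahead + 1):
        stamp = (start + timedelta(days=offset)).strftime("%Y%m%d")
        # Splice the new date string in place of the old one.
        candidates.append(url[:match.start(1)] + stamp + url[match.end(1):])
    return candidates


# Example link taken from the grep results earlier in the thread
# (a language-id pack, used here only to show the substitution).
example = (
    "https://www.gstatic.com/android/keyboard/langid/"
    "20191018/superpacks_manifest.json"
)

for candidate in candidate_urls(example, 3):
    print(candidate)
```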