csdcorp / speech_to_text

A Flutter plugin that exposes device specific text to speech recognition capability.
BSD 3-Clause "New" or "Revised" License

[plugin] INVOKE FLUTTER SOUND LEVEL CHANGE ? #266

Closed sylvainjack closed 2 years ago

sylvainjack commented 2 years ago

I am using your package in a vocabulary learning app. I would like to make sure that I use it right as what I get in the console seems a bit strange :

11 [plugin] invokeFlutter soundLevelChange
[plugin] HypothesizeTranscription
[plugin] Encoded JSON result: {"alternates":[{"recognizedWords":"La","confidence":0}],"finalResult":false}
[plugin] invokeFlutter textRecognition
flutter: Voici les résultats de generateList : [SpeechRecognitionWords words: La, confidence: 0.0]
3 [plugin] invokeFlutter soundLevelChange
[plugin] HypothesizeTranscription
[plugin] Encoded JSON result: {"alternates":[{"recognizedWords":"La salle de","confidence":0}],"finalResult":false}
[plugin] invokeFlutter textRecognition
flutter: Voici les résultats de generateList : [SpeechRecognitionWords words: La salle de, confidence: 0.0]
4 [plugin] invokeFlutter soundLevelChange
[plugin] HypothesizeTranscription
[plugin] Encoded JSON result: {"alternates":[{"recognizedWords":"La salle de bain","confidence":0}],"finalResult":false}
[plugin] invokeFlutter textRecognition
flutter: Voici les résultats de generateList : [SpeechRecognitionWords words: La salle de bain, confidence: 0.0]
12 [plugin] invokeFlutter soundLevelChange
[plugin] invokeFlutter notifyStatus
[plugin] invokeFlutter soundLevelChange
16 [plugin] invokeFlutter soundLevelChange
[plugin] FinishRecognition true
[plugin] Encoded JSON result: {"alternates":[{"recognizedWords":"La salle de bain","confidence":0.898},{"recognizedWords":"La salle de bains","confidence":0.728}],"finalResult":true}
[plugin] invokeFlutter textRecognition
[plugin] FinishSuccessfully
flutter: Voici les résultats de generateList : [SpeechRecognitionWords words: La salle de bain, confidence: 0.898, SpeechRecognitionWords words: La salle de bains, confidence: 0.728]
[plugin] invokeFlutter notifyStatus

I keep getting a large number of "invokeFlutter soundLevelChange" messages. Sometimes the count keeps increasing and never stops. Here it went up to 2172 before I had to terminate the app.

3 [plugin] invokeFlutter soundLevelChange
[plugin] HypothesizeTranscription
[plugin] Encoded JSON result: {"alternates":[{"recognizedWords":"La salle de bain","confidence":0}],"finalResult":false}
[plugin] invokeFlutter textRecognition
flutter: Voici les résultats de generateList : [SpeechRecognitionWords words: La salle de bain, confidence: 0.0]
15 [plugin] invokeFlutter soundLevelChange
[plugin] invokeFlutter notifyStatus
[plugin] invokeFlutter soundLevelChange
2172 [plugin] invokeFlutter soundLevelChange
Application finished.

What does this mean exactly ? I set the listen duration to 4 seconds.

Here is the code:

    Future generateRecognizedWordList() async {
      final SpeechToText speech = SpeechToText();

      await speech.listen(
        onResult: createList,
        listenFor: Duration(seconds: 5),
        pauseFor: Duration(seconds: 5),
        partialResults: true,
        localeId: 'fr_FR',
        cancelOnError: true,
        listenMode: ListenMode.confirmation,
      );
    }

    void createList(SpeechRecognitionResult result) {
      print('Voici les résultats de generateList : ${result.alternates}');
    }

The idea is that this function will return a list of the words that were recognized. Apparently, it never stops the listening session. It should stop after 3 seconds. I tried adding a call to speech.stop(), but it changed nothing. Where should this call go?

I changed listenFor duration to 5 and pauseFor (Duration : 5) and now it seems to work...

Can someone help me ? Thank you very much, Sylvain

sowens-csd commented 2 years ago

Yes that log looks right, that's the debug logging on iOS. You can shut it off in the console if you turn off debug level logging.

There's no need to set a pauseFor value if you want it to be the same as the listenFor value, so you can omit one or the other. I thought I had fixed the bug that occurred when the two were set to the same value, but perhaps something is still wrong there. I'll have a look.

sylvainjack commented 2 years ago

Thanks for your answer :) I am trying to understand how the listen function works: does it call the onResult function several times, or just once when the result is final?

What I want to do is this: when the listening session is finished (5 seconds), I want to create a list that contains the words that were recognized. For example, if I say something that sounds like "see", I would like a list of Strings containing ["see","sea"]. If the onResult function is called many times, I thought I would include the following in it:

  1. A test (if result.finalResult == true).
  2. Then a for loop over result.alternates that fills a set with only the words contained in result.alternates.

Is this the way to go about it ?

Thanks for your help, Sylvain

sowens-csd commented 2 years ago

You should be able to do that by setting listenFor to 5 seconds and leaving pauseFor unset. Then set partialResults to false in the listen call. With that set you'll only get a callback with the final results. Then iterate through result.alternates and create the list of strings that you want. Let me know if you still have questions.
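As a rough illustration of that last step, here is a minimal sketch in plain Dart. The two classes are hypothetical stand-ins that mirror only the fields seen in this thread (`alternates`, `recognizedWords`, `confidence`, `finalResult`); in an app those types would come from the speech_to_text package. The helper name `collectAlternates` is also made up for this example. The guard on `finalResult` is redundant when partialResults is false, but it keeps the helper safe if partial results are turned back on.

```dart
// Hypothetical stand-ins mirroring only the fields used in this thread.
// In a real app these types come from the speech_to_text package.
class SpeechRecognitionWords {
  const SpeechRecognitionWords(this.recognizedWords, this.confidence);
  final String recognizedWords;
  final double confidence;
}

class SpeechRecognitionResult {
  const SpeechRecognitionResult(this.alternates, this.finalResult);
  final List<SpeechRecognitionWords> alternates;
  final bool finalResult;
}

/// Returns the distinct recognized phrases of a final result,
/// or an empty list when the result is still partial.
List<String> collectAlternates(SpeechRecognitionResult result) {
  if (!result.finalResult) return const [];
  // A set literal is a LinkedHashSet, so insertion order is preserved.
  final phrases = <String>{};
  for (final alt in result.alternates) {
    phrases.add(alt.recognizedWords);
  }
  return phrases.toList();
}

void main() {
  // Values taken from the final result in the log earlier in this thread.
  const result = SpeechRecognitionResult([
    SpeechRecognitionWords('La salle de bain', 0.898),
    SpeechRecognitionWords('La salle de bains', 0.728),
  ], true);
  print(collectAlternates(result)); // [La salle de bain, La salle de bains]
}
```

A function with this shape could be called from the onResult callback passed to listen.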

sylvainjack commented 2 years ago

Thanks a lot for your help :) I think I managed to reach my goal. Here's the code:

    void listenToUser() {
      speech.listen(
        onResult: generateRecognizedWordList,
        listenFor: Duration(seconds: 3),
        localeId: _currentLocaleId,
        cancelOnError: false,
        partialResults: false,
      );
    }

    void generateRecognizedWordList(SpeechRecognitionResult result) {
      _wordList.clear();
      for (SpeechRecognitionWords words in result.alternates) {
        _wordList.add(words.recognizedWords);
      }
    }

I still have some questions :)

  1. Is there another solution than my for loop to get the recognized words in a list ?
  2. I don't seem to encounter the problem that the listening session never ends, have you changed something ?
  3. What does the "cancelOnError" parameter do ? What about "listenMode" ?
  4. Should my function "listenToUser" be asynchronous ? Should I add "async / await" ?

Thanks again :) Sylvain

sowens-csd commented 2 years ago

  1. Yes, you could use something like: result.alternates.map((alt) => alt.recognizedWords).toList(); but there's nothing wrong with your for loop.
  2. No, I haven't changed anything. I think it might be an issue when pauseFor and listenFor have the same value.
  3. cancelOnError automatically cancels the listen session if any error occurs. It is a convenience so that you don't have to manually call the cancel method in the error handler.
  4. Unless you want to wait for that method to complete, there's no need for async.
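
On point 4, here is a minimal sketch of the difference in plain Dart. `fakeListen` is a hypothetical stand-in for any call that returns a Future (it is not the plugin's listen method), and the delays are arbitrary values chosen for the demonstration.

```dart
import 'dart:async';

final log = <String>[];

// Hypothetical stand-in for a call that returns a Future.
Future<void> fakeListen() async {
  await Future<void>.delayed(const Duration(milliseconds: 10));
  log.add('listen started');
}

// Not async: the call is started and the function returns immediately.
void fireAndForget() {
  fakeListen();
  log.add('returned');
}

// Async: execution pauses until the Future completes.
Future<void> awaited() async {
  await fakeListen();
  log.add('returned');
}

Future<void> main() async {
  fireAndForget();
  print(log); // [returned]

  // Let the pending call finish before the second demonstration.
  await Future<void>.delayed(const Duration(milliseconds: 50));
  log.clear();

  await awaited();
  print(log); // [listen started, returned]
}
```

If nothing in listenToUser depends on the Future completing, the non-async form is enough.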

sylvainjack commented 2 years ago

Thank you very much for your answers. Maybe one last thing:

  1. What does the "onSoundLevelChange" callback do ? When is it useful ?
  2. Same question for listenMode: what does ListenMode.confirmation do ?

Sylvain

sowens-csd commented 2 years ago

  1. It is called with the current sound level (loudness) each time there is a change to that. It can be used to detect if someone might be speaking and then do an animation or other UX interaction to show that the device is listening.
  2. listenMode gives a hint to the OS speech recognizer about the type of content that is expected. confirmation is used for short confirmation style requests of usually just a few words.
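
To illustrate point 1, a minimal sketch of turning raw sound levels into a "someone is probably speaking" flag that could drive an animation. The class name `SpeechActivityMeter`, the threshold, and the smoothing factor are all hypothetical. The range of values passed to the callback depends on the platform, so the threshold would need tuning per device.

```dart
/// Smooths incoming sound levels and flags likely speech.
/// Threshold and smoothing are illustrative guesses, not plugin defaults.
class SpeechActivityMeter {
  SpeechActivityMeter({this.threshold = 3.0, this.smoothing = 0.5});

  final double threshold; // hypothetical cut-off for "speaking"
  final double smoothing; // weight given to the newest sample, 0..1
  double? _level;

  /// Feed each reported level; returns true when the smoothed
  /// level is above the threshold.
  bool update(double level) {
    final previous = _level;
    final next = previous == null
        ? level
        : smoothing * level + (1 - smoothing) * previous;
    _level = next;
    return next > threshold;
  }
}

void main() {
  final meter = SpeechActivityMeter();
  // Made-up sample levels, only to exercise the meter.
  for (final level in [0.0, 1.0, 6.0, 8.0, 1.0]) {
    print('$level -> ${meter.update(level)}');
  }
}
```

In an app, `update` could be called from the onSoundLevelChange callback and its result used to toggle a listening indicator.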

sylvainjack commented 2 years ago

Do you think I should add the listenMode parameter in my case ? Would it bring anything ?

sowens-csd commented 2 years ago

Probably yes. I'd suggest you use ListenMode.dictation.

sylvainjack commented 2 years ago

Sorry I didn't respond sooner. Thank you very much for all your help. Is there documentation somewhere that describes what the parameters do (like the listenMode values, etc.) ? I found examples on pub.dev, but no real documentation. Maybe I do not know where to look.

sowens-csd commented 2 years ago

The parameters are all documented in the method documentation. You would see the docs in VSCode if you have the Flutter extension or from pub.dev in the API reference at the method level. Here's an example:

https://pub.dev/documentation/speech_to_text/latest/speech_to_text/SpeechToText/listen.html