CommunityToolkit / dotnet

.NET Community Toolkit is a collection of helpers and APIs that work for all .NET developers and are agnostic of any specific UI platform. The toolkit is maintained and published by Microsoft, and part of the .NET Foundation.
https://docs.microsoft.com/dotnet/communitytoolkit/?WT.mc_id=dotnet-0000-bramin

SpeechToText.ListenAsync reports twice #844

Open sej69 opened 4 months ago

sej69 commented 4 months ago

Describe the bug

iOS (iPad 10)

There could be two issues here.

First, I don't know if it's a typo by Vladislav Antonyuk but on this page: https://devblogs.microsoft.com/dotnet/speech-recognition-in-dotnet-maui-with-community-toolkit/

It shows ListenAsync being used like this:

            var recognitionResult = await SpeechToText.Default.ListenAsync(
                CultureInfo.GetCultureInfo("uk-ua"),
                new Progress<string>(partialText =>
                {
                    RecognitionText += partialText + " ";
                }), cancellationToken);

The line RecognitionText += partialText + " "; duplicates the sentence or phrase being spoken. The code reads as if each callback is meant to deliver one new word, but in practice each callback repeats words that were already reported.
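
The duplication can be reproduced without the toolkit. The sketch below assumes that each progress callback carries the whole phrase recognized so far rather than only the newest word. That assumption is inferred from the output described here and is not confirmed toolkit behavior. The values in partials are made up for illustration.

```csharp
using System;

// Hypothetical callback sequence, assuming each callback carries the full phrase so far.
string[] partials = { "hello", "hello world", "hello world again" };

string appended = "";
string replaced = "";

foreach (var partialText in partials)
{
    appended += partialText + " ";  // pattern from the blog post: accumulates repeats
    replaced = partialText + " ";   // keeps only the most recent callback
}

Console.WriteLine(appended); // "hello hello world hello world again "
Console.WriteLine(replaced); // "hello world again "
```

If the callbacks instead delivered one word at a time, the appending form would be the correct one, so which form to use depends on what the platform actually reports.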

Since the recognitionComplete event is not firing, I've resorted to using timers. But I noticed another weird issue, which I have a separate open case for: I can get recognitionComplete to fire, but only if I have a breakpoint set on the recognitionUpdated event. That belongs to the other bug report, though.

If I say a single word like "test" or "one", SpeechToText reports that word twice, which causes problems for my events handling the incoming text.

Regression

No response

Steps to reproduce

.NET 8, MAUI

This is my code:

            var recognitionResult = await speechToText.ListenAsync(CultureInfo.GetCultureInfo("en-us"),
                new Progress<string>(partialText =>
                {
                    Debug.WriteLine("--- " + partialText + "---");

                    RecognitionText = partialText + " ";

                }), cancellationToken);

I say "one" and in the debug window I'll see:

--- One---
--- One---

There is about a 1-2 second pause between the two displays.

If there were a way to clear out the buffer, that would fix it, but I have a feeling that this behavior isn't supposed to be occurring.
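
Until the cause is known, one possible workaround is to drop a callback whose text is identical to the previous one. This is a minimal sketch, not a fix for the underlying behavior. The names onPartial, lastPartial and handled are hypothetical, and the callbacks are simulated here rather than coming from the toolkit.

```csharp
using System;
using System.Collections.Generic;

var handled = new List<string>();  // stand-in for the real handling code
string? lastPartial = null;

// Wrap the handler so a callback identical to the previous one is ignored.
Action<string> onPartial = partialText =>
{
    if (partialText == lastPartial)
    {
        return; // same text reported again, skip it
    }

    lastPartial = partialText;
    handled.Add(partialText);
};

// Simulated callbacks. With the toolkit, the delegate would instead be passed
// to ListenAsync wrapped as new Progress<string>(onPartial).
onPartial("One");
onPartial("One");
onPartial("Two");

Console.WriteLine(string.Join(", ", handled)); // One, Two
```

The trade-off is that a word genuinely spoken twice in a row would also be dropped, so lastPartial would need to be reset (for example after a pause) if repeated words matter.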

Incidentally, await speechToText.ListenAsync listens continuously and never returns, so I cannot use recognitionResult. That is probably because it never hits the OnRecognitionTextComplete event...?

Expected behavior

It should only report once. I also think it should report each word as it is recognized, not the full sentence each time. If it is supposed to report the full sentence, then it should report it only once.

Screenshots

No response

IDE and version

VS 2022

IDE version

17.9

Nuget packages

Nuget package version(s)

mvvm - 8.2.2, maui - 7.0.1

Additional context

No response

Help us help you

No, just wanted to report this

sej69 commented 4 months ago

I moved off of the interface and went to directly accessing the class instead. And now, it appears to be reading one word at a time. I wonder if this was my issue with the events as well. I won't have a chance to look at that until next week now.