Sorry, I don't understand why you want to do this.
Alternates, as received from the underlying speech recognition engine, are usually alternate possible interpretations of the received speech. For example [ 'four', 'for', 'fir', 'fiord']. I don't see why you'd want them merged into a single string. Could you explain what you were trying to achieve with this?
That is sadly not true for the Web version, at least. From my testing, the alternates contain other pieces of the transcription that arrive after a break.
So if I say: "Let me think about it.
I'm sure this isn't the most appropriate solution, but it fixes the issue I was facing with dictation over a long period. Maybe a solution at the platform interface level makes sense here.
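To make the difference concrete, here is roughly what I mean (hand-written example data, not actual engine output; the second phrase is a made-up continuation of the example above):

// Alternate interpretations of a single utterance, as on the native platforms:
const expectedAlternates = ['four', 'for', 'fir', 'fiord'];
// On Web, after a pause mid-dictation, the text spoken after the pause shows up
// as an extra entry instead of extending the recognized words:
const observedWebAlternates = [
  'Let me think about it',
  "I'll get back to you tomorrow", // hypothetical continuation, for illustration only
];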
Oh! Good find, thank you. I'll have a look at that.
I've found another issue that causes early timeouts. This section does not make sense to me. Since we stop listening based on the variables _elapsedListenMillis and _elapsedSinceSpeechEvent, why do we need to update our reference values? _elapsedListenMillis and _elapsedSinceSpeechEvent are already being updated.
Tested on web, and removing this section solves an issue with early timeouts.
if (null != pauseFor) {
  // Shrink pauseFor to the time remaining after what has already elapsed since the last speech event
  var remainingMillis = pauseFor.inMilliseconds -
      (ignoreElapsedPause ? 0 : _elapsedSinceSpeechEvent);
  pauseFor = Duration(milliseconds: max(remainingMillis, 0));
}
if (listenFor != null) {
  // Shrink listenFor to the time remaining in the overall listen session
  var remainingMillis = listenFor.inMilliseconds - _elapsedListenMillis;
  listenFor = Duration(milliseconds: max(remainingMillis, 0));
}
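For context for anyone reading along: pauseFor and listenFor are the Durations passed to listen(). A minimal usage sketch with arbitrary example values, not the settings from my app:

import 'package:speech_to_text/speech_to_text.dart';

Future<void> startDictation() async {
  final speech = SpeechToText();
  if (await speech.initialize()) {
    await speech.listen(
      onResult: (result) => print(result.recognizedWords),
      listenFor: const Duration(seconds: 60), // end the whole session after a minute
      pauseFor: const Duration(seconds: 5), // end after 5 seconds of silence
    );
  }
}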
Just getting back to this, and I tested it on Chrome, and, of course, you're right! I had completely misunderstood the spec for the web version because I was trying to fit my experience from Android and iOS onto the web framework. The web structure is more complicated: it provides a series of utterances in the first-level results, then a series of alternates in a second-level set of results under each of the first-level results.
However, the fix for this isn't in the right place. The actual change should be in speech_to_text_web.dart. I'll make that change now and you can try it out from the repo.
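Conceptually the web handler has to stitch those first-level utterances back into a single phrase. A minimal sketch of the idea, not the actual code that will land in the repo:

// The outer list is one entry per first-level utterance, the inner list is
// that utterance's alternates. Joining the top alternate of each utterance
// rebuilds the phrase spoken across pauses.
String combineUtterances(List<List<String>> utterances) => utterances
    .where((alternates) => alternates.isNotEmpty)
    .map((alternates) => alternates.first)
    .join(' ');

// e.g. combineUtterances([['Let me think about it'], ["I'll get back to you"]])
// returns 'Let me think about it I'll get back to you'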
The update of the pauseFor and listenFor values in that section fixed an issue identified in #191.
There's a new version in the repo now that has my changes to improve web handling of multiple phrases. Let me know if you have a chance to try it.
These changes are ready for 6.4.0. I'm going to close this PR as it is now obsolete after those code changes.
If you have a chance, please test the changes. If you want to have a look, they are in speech_to_text_web.dart and balanced_alternates.dart.
This is the simple fix that solves the issue where the user takes a small break in speaking that is shorter than the configured pause. In that case the speech recognition would effectively stop, because after a break any further detection becomes part of the next element in the alternates. Right now I can't find any other reason why the alternates were defined like this.
This has been tested on Web.