Closed Jankaz2 closed 2 days ago
I think this is an API problem rather than my code, since I copied the example from this repo and the problem is the same. Could you somehow increase the allowed silence length on iOS (I don't know Swift at all), or maybe suggest a workaround? I would be really grateful.
Hmm, that's quite an unusual issue. The continuous mode on iOS doesn't have any special config around speech silence, and under the hood we're just setting some flags (such as requiresOnDeviceRecognition, addsPunctuation, taskHint, etc.) for the underlying speech recognizer prior to starting.
I haven't got an iOS device to test this out on right now but will get back to you in the next day or two. If I were to guess though, this may just have something to do with network-based recognition and you may want to enable requiresOnDeviceRecognition on iOS.
Could you provide me the following:
Thanks!
I will send the logs in the next few hours, but for now I think the problem may be related to the iOS version. The first phone where the problem occurs has iOS 18.0 installed. But I have now also tested the feature on a phone with iOS 15 and it works fine. Also, I previously used the react-native-voice library with iOS 17.x.x and it also worked fine. Maybe there are some changes in version 18.0?
@Jankaz2 Looks like you're right! https://forums.getdrafts.com/t/ios-18-macos-15-beta-warning-for-dictation-users/15334
Apple has introduced a bug in their speech recognition frameworks that renders it impossible to do long-form dictation when running on iOS 18 or macOS 15. This will appear in Drafts as data being discarded as you dictate and only the most recent utterance being retained.
Looks like it's on iOS 18.1 beta too. Here's a screenshot of one of the tickets that resembles your issue:
Ok, so I guess we just have to wait for bug fixes from Apple.
But I have another question: I'm now testing this on various Android emulators and a real device, and the speech recognition stops right after I stop talking for a millisecond, even though I have an additional Android configuration of EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: 10000. Am I missing something?
edit: OK, I think the problem is also related to Android versions. It does not work on Android 12, but it works well on Android 14.
Hey @Jankaz2, unfortunately the only configs available for Android 12 and below are the following:
EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS
EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS
EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS
I should document this somewhere in the README, but it's quite well known that any kind of "continuous mode" using the Android SpeechRecognizer (at least for Android 12 and below) doesn't really work, and online solutions propose just stopping and starting again. For Android 13+ I went a different route thanks to a newer API (EXTRA_AUDIO_SOURCE), which allowed me to hook up a custom audio recorder to the speech recognizer to avoid these limitations. I think this is the only public repo that actually does something like this.
On Android 12 and below, I've set each of these settings to 600 seconds for continuous mode (as we intend for it to run indefinitely); however, they don't seem to have any effect, at least on Android 12. On an Android 12 (API 31) emulator you'll likely see the logcat message: EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS can't be used when EXTRA_PREFER_OFFLINE is false. However, this doesn't seem to apply to Android 14. I also can't verify whether offline speech recognition works at all on Android 12.
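For reference, the 600-second setting described above could be sketched roughly like this. The helper name and the type are hypothetical, made up for illustration; only the three extra names come from this thread, and how the map is passed on to the recognizer is left out.

```typescript
// Hypothetical helper: builds the three silence-related extras mentioned above.
// 600 seconds expressed in milliseconds, as the extras are *_MILLIS values.
const CONTINUOUS_SILENCE_MS = 600 * 1000;

type SilenceExtras = {
  EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: number;
  EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS: number;
  EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS: number;
};

function continuousSilenceExtras(
  ms: number = CONTINUOUS_SILENCE_MS
): SilenceExtras {
  return {
    EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: ms,
    EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS: ms,
    EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS: ms,
  };
}
```

As noted, these values may simply be ignored on Android 12, so this alone is not expected to keep recognition running there.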
So I think the best route you'll have is to just force start the speech recognizer again after it stops as a "hack". Unfortunately this is probably the best you'll get with the current API limitations for Android 12.
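A rough sketch of that restart "hack", assuming a recognizer that exposes start() and stop() and tells you somehow when it has ended. The Recognizer interface and the RestartingSession class are hypothetical names for illustration, not this library's types; you would call handleEnd() from whatever "end" callback your setup provides.

```typescript
// Minimal recognizer shape assumed for this sketch (not the library's real types).
interface Recognizer {
  start(): void;
  stop(): void;
}

class RestartingSession {
  private recognizer: Recognizer;
  private maxRestarts: number;
  private wantsToListen = false;
  restarts = 0;

  constructor(recognizer: Recognizer, maxRestarts: number = 50) {
    this.recognizer = recognizer;
    this.maxRestarts = maxRestarts;
  }

  // The user asked to start listening.
  begin(): void {
    this.wantsToListen = true;
    this.restarts = 0;
    this.recognizer.start();
  }

  // The user asked to stop; later "end" events must not restart.
  end(): void {
    this.wantsToListen = false;
    this.recognizer.stop();
  }

  // Call from the recognizer's "end" handler.
  // Returns true if recognition was started again.
  handleEnd(): boolean {
    if (!this.wantsToListen || this.restarts >= this.maxRestarts) {
      return false;
    }
    this.restarts += 1;
    this.recognizer.start();
    return true;
  }
}
```

The restart cap is only there so a recognizer that fails immediately on every start does not loop forever; the value 50 is arbitrary.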
There's a hacky workaround that I'll be exploring to fix the iOS 18 issue that involves checking whether speechDuration is a positive number. It seems like the Apple engineers intended this to indicate a final result, so I'll be emitting a result event with isFinal: true here. That way, there shouldn't be any need for further changes on your end.
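To illustrate why a final result helps here: if the consumer appends final results and only replaces the interim one, text spoken before a pause is kept instead of being overwritten. A minimal sketch of that consumer-side pattern follows. The ResultEvent shape and TranscriptAccumulator class are simplified, hypothetical stand-ins, not the library's actual event types.

```typescript
// Simplified event shape assumed for this sketch.
interface ResultEvent {
  transcript: string;
  isFinal: boolean;
}

class TranscriptAccumulator {
  private finalized: string[] = [];
  private interim = "";

  push(event: ResultEvent): void {
    const text = event.transcript.trim();
    if (event.isFinal) {
      // Final results are appended, so earlier utterances survive a pause.
      if (text !== "") {
        this.finalized.push(text);
      }
      this.interim = "";
    } else {
      // Interim results only replace the current in-progress utterance.
      this.interim = text;
    }
  }

  get text(): string {
    return [...this.finalized, this.interim]
      .filter((part) => part !== "")
      .join(" ");
  }
}
```

Without an isFinal: true event after each utterance, everything stays "interim" in a pattern like this, and each new utterance replaces the previous one, which matches the behaviour reported in this issue.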
Cool, let me know when you will release a new version :)
@Jankaz2 I've just released a new version at expo-speech-recognition@0.2.20. Let me know if that solves the issue for you.
yes, it works perfectly. thanks a lot :)
Sweet! Closing this issue. I might open up an issue for the Android <=12 continuous recording issue, but for the time being I've added a note to the README.
Hi, first of all thanks for this library, because it works much better than react-native-voice. Congratulations <3
I have a small problem: when I test it on the iOS simulator, it works perfectly. I can pause while speaking and everything is recognized correctly. However, on a real device, when I pause while speaking, the results after the pause overwrite the results from before the pause.
Here is my config
And the videos presenting the problem
https://github.com/user-attachments/assets/c2876b61-6904-4b0b-a66c-5949e8530f5b
https://github.com/user-attachments/assets/1adcd7a2-2cc4-41d4-927e-0951225d8ba6