csdcorp / speech_to_text

A Flutter plugin that exposes device specific text to speech recognition capability.
BSD 3-Clause "New" or "Revised" License
351 stars, 218 forks

Unexpected crash on iPhone when speech initializes #100

Closed tamoyal closed 3 years ago

tamoyal commented 3 years ago

Hi! I'm getting crashes which seem to happen on the first invocation of the speech recognizer for an unpredictable set of users (iOS is up to date for them). I attached a crash log and was wondering if you have any hunch as to the reason for the crash or pointers on where to look. Thanks!

crashlog.txt

sowens-csd commented 3 years ago

Thanks for the log. Which version of speech_to_text are you using? Are they all iPhone 7 Plus that are having the issue?

I can tell where it is crashing from that log, but that code would be the same for all devices and OS versions. If you have a device where you can reproduce the crash then we could try other versions of the plugin to see if it is a recently introduced problem or has always been there but is only triggered by some devices.

sowens-csd commented 3 years ago

If you have a reproducible case I'd love to have the log from a device that's crashing. There might be some log entries that would shed some light on the issue.

You're confident that it is always on the very first invocation of the speech recognizer? That's important because one possible cause is failing to remove the tap from a previous session. Are you using anything else in your app that would be listening on the microphone? There could be a conflict if so.

tamoyal commented 3 years ago

@sowens-csd Thanks for the fast response. I can confirm it's not just iPhone 7's, it also happened on an iPhone 8. We don't have any crashes reported on very new iPhones although I can't say conclusively that won't happen. I can also confirm the crash is happening on first invocation but I cannot yet confirm it would not happen on a later invocation.

I initialize the speech recognizer when the view that uses it appears. However, given it crashes on first invocation, we can probably rule that out as something to think about. Our app could technically have audio playing when the speech recognizer initializes, but I know of no other competition for the microphone. I have also tested tapping the mic button (which initializes speech) while audio is playing and it is not an issue.

Unfortunately this is working great on my phone but I will look at obtaining a device log from the next person who experiences this. Please let me know if you have any ideas. I'm happy to dig in deeper. Thanks!

tamoyal commented 3 years ago

I've been trying to gather more information on this issue. It seems device and iOS version actually don't matter. We see the crash spread out. It also occurs on the first invocation of the speech recognizer (after permissions are granted). The latest crash report looks exactly like the one previously attached. Any hints would be awesome. I'm happy to dig into this. I don't yet have a device in hand I can reproduce the issue on (and I've tried on 6 devices) but our users are getting the crashes.

sowens-csd commented 3 years ago

Thanks @tamoyal. For a user that is seeing the crash, do they see it reliably every time they run the app? Or does it happen occasionally, while other times speech recognition works properly?

tamoyal commented 3 years ago

The one we just tested with saw it reliably. Other times we have had users upgrade iOS and then they can't reproduce, which I'm not convinced has anything to do with the iOS upgrade (our last crash user is on 13.5.1 and reproduced it 3 times)

tamoyal commented 3 years ago

Ok I found the issue. The tests we are running involve a voice call and users are trying to use the app while on the call. If I make a WhatsApp call and then hit our mic button, which initializes speech, we get a crash. iMessage handles that with dictation by just sorta silently failing (not ideal but better than a crash). So the good news is I can reproduce this now and it's not a major bug because if someone is trying to use our app while on a call (in real life), they are doing it (life) wrong lol. But I would like to figure out a more graceful way to deal with this. Let me know if you have any ideas. Happy to contribute to the codebase. This SO post prompted me to consider this: https://stackoverflow.com/questions/44438649/ios-speech-to-text-avaudioinputnode-random-crash so maybe their suggestion would prevent the crash on your end. I would actually like to be able to tell if the audio unit is busy and, if so, pop up a friendly message to the user saying they can't use STT while on a voice call.

sowens-csd commented 3 years ago

I'm pretty sure I know what's happening, based on the stack trace you provided, I just don't know why.

Does your app stop/start the speech recognition quickly? I'm wondering if there's some cool down period where the audio tap hasn't been successfully removed yet before the next one is added. Unfortunately, even if true, that wouldn't explain it failing on the initial install. That's a very strange symptom. I'm just putting these thoughts here in case they give you an idea.

I'm back looking at the code to see what I can do. One thing would be to see if I can at least handle the failure so that even if users don't have speech recognition the app wouldn't crash.

sowens-csd commented 3 years ago

Thank you! That's awesome. I'm pretty confident that with that information I'll be able to stop the crashing and come up with a useful failure behaviour. Probably an error on the callback, something like 'audio busy'. I'll try to reproduce with other audio apps. Thanks for digging into this, it will definitely make the plugin better.

That SO post you linked to does look useful. I'll try to reproduce and try out that fix.

sowens-csd commented 3 years ago

Could you try to reproduce the failure with the version in the new try-tap-experiment branch? The behaviour I'm hoping for is that you get a failure from the call to listen but the app no longer crashes.

If you see an error you might have to run

pod install

in the iOS directory of your app, not positive. Flutter should handle it for you but let me know if you do have to.

tamoyal commented 3 years ago

Thanks for the fast improvement! It doesn't crash now but it does seem to fail silently. I still get the speech status as "listening" and no error callback even though the speech recognizer is not working (results don't come in). Let me know if you want me to test again

sowens-csd commented 3 years ago

There's an update available now on that same branch. The behaviour should be that in the failure case the status doesn't switch to 'listening' and the listen method throws a ListenFailedException. If you want to give the user feedback you can catch that exception and provide help.

Unfortunately I still can't reproduce the problem as I don't have WhatsApp. Neither FaceTime nor Signal voice seems to show the same behaviour. Let me know if it works in the failure case.

sowens-csd commented 3 years ago

Just wondering if you'd had a chance to try this version?

tamoyal commented 3 years ago

@sowens-csd Sorry for the delay. Busy week! Just tested it and I get an exception here, in your listen function: throw ListenFailedException(e.details); The error details are "The operation couldn’t be completed. (Try error 1.)". Better than failing silently for sure, but maybe it violates the callback paradigm you have set for the library?

I should respond faster here if you want me to keep testing. Also just FYI - the way I test this is I put Skype on my phone and my computer which only takes a few minutes (you need 2 accounts). Then I just initiate a call and use my app. Happy to keep testing for you though! Thanks again

sowens-csd commented 3 years ago

You're right it is a different flow than other interactions. Because this error meant that the listen couldn't start I'd thought perhaps it was better to deal with it inline rather than through a callback. Pretty easy to switch it to the error callback if that would support you in handling the issue. The good news is that the exception happened as planned so I now have control of the error condition.

Your preference then would be to return false from the listen method and generate an event for the onError handler? 'error_listen_failed' for now. I can add an enum or other details to a future release.

tamoyal commented 3 years ago

I'm cool with either actually. Maybe just documentation on how to handle it would be ideal. I could also help with that.

I wonder what happens if a call interrupts the STT though. Maybe in that case we want the onError handler saying we had to cut it off or something. Want me to test that?

sowens-csd commented 3 years ago

I've just committed an update that switches the response to a false return on the listen method and an onError callback with 'error_listen_failed'. Could you try that out?
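A minimal sketch of how an app might turn that callback into the friendly message requested earlier in the thread. It is plain Dart with no plugin import: `SpeechRecognitionError` here is a local stand-in whose field names are assumptions modelled on the logged output, and `friendlyMessage` and `SpeechErrorBanner` are hypothetical names, not part of speech_to_text.

```dart
/// Local stand-in for the error object the plugin passes to onError.
/// The field names are assumptions modelled on the logged output
/// "msg: error_listen_failed, permanent: true".
class SpeechRecognitionError {
  final String errorMsg;
  final bool permanent;
  const SpeechRecognitionError(this.errorMsg, this.permanent);
}

/// Returns text to show the user, or null when nothing should be shown.
String? friendlyMessage(SpeechRecognitionError error) {
  if (error.errorMsg == 'error_listen_failed') {
    // Reported in this thread when listen cannot start, for example
    // because a voice call is holding the microphone.
    return 'Speech recognition could not start. '
        'If you are on a voice call, end it and try again.';
  }
  if (error.permanent) {
    return 'Speech recognition stopped. Please try again.';
  }
  return null; // Non-permanent errors: stay quiet.
}

/// Hypothetical holder for the message; a real app would show a dialog
/// or snack bar instead of storing a string.
class SpeechErrorBanner {
  String? message;

  /// Intended to be registered as the app's onError handler.
  void onSpeechError(SpeechRecognitionError error) {
    final text = friendlyMessage(error);
    if (text != null) message = text;
  }
}
```

In an app, `onSpeechError` would be the function handed to the plugin as its error handler, and the false return from listen would only be used to reset the record button.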

Thanks for the details on a repro with Skype. The downside is that I have to reinstall the Skype virus on a system here, but at least it's a way to reproduce.

tamoyal commented 3 years ago

Well interestingly enough, what's happening now is it's actually working during the Skype call but then after I end the Skype call, it stops working and I can't get it to start again. It is disabled until I restart the app. Also the level callback is now working which is pretty awesome because it wasn't before. Thanks!

sowens-csd commented 3 years ago

When the Skype call ended, was the app actively listening? Or did you try to start a new listen session after it ended, and that failed? Can you give me the sequence that's happening? It sounds like maybe the Skype issue is different from the previous issue you were seeing with WhatsApp? At least I can't yet see how a failure in the installTap method that was causing the crash previously could cause the symptoms you describe.

When you say 'disabled' do you mean that calls to listen don't work anymore or something else? Does listen return false or throw an exception? Is the state of the 'isListening' property true or false? Any error callbacks?

tamoyal commented 3 years ago

Ok this time it did not work when the Skype call was on. Maybe that depends on when it is started. Here are the sequences:

Sequence 1:
1) Open my app and use it normally - speech.initialize and speech.listen called. Works great.
2) Initiate Skype voice call.
3) Press my record button, which calls speech.listen - does NOT work, and these are my logs:

flutter: _startListening
flutter: Speech status: notListening
flutter: SpeechRecognitionError msg: error_listen_failed, permanent: true

4) End the Skype call.
5) Enter the screen that calls speech.initialize and speech.listen.
6) Press the record button, and these are my logs:

flutter: _startListening
flutter: Speech status: listening
flutter: Speech status: notListening

(notListening is me releasing the mic) HOWEVER while these logs make it sound like the program is listening, I never get the call to the result listener, so it's not actually working.

Sequence 2:
1) Start Skype call.
2) Open my app and hit the record button - speech.initialize and speech.listen called. It still works. My logs:

flutter: _startListening
flutter: Speech status: listening
flutter: level: -64.01325988769531
...
flutter: level: -40.16642761230469
flutter: lastWords = Gracias (final: false)
flutter: _cancelListening
flutter: Speech status: notListening

3) End the Skype call.
4) Hit the mic button (ONLY speech.listen called) and it DOES NOT work. Here are the logs:

flutter: _startListening
flutter: Speech status: notListening
flutter: SpeechRecognitionError msg: error_listen_failed, permanent: true
flutter: Speech status: notListening

Does this help?

sowens-csd commented 3 years ago

That helps, yes, thanks. One thing about point 5 of your sequence 1 above: the initialize call should only ever be made once per application; it is ignored after that. It's not a big deal except that your error handling and status listeners won't get updated if they changed since the first invocation. It's one of the reasons I created the new SpeechToTextProvider to help with that lifecycle issue. I don't understand why your sequence 2 doesn't work. Is it true that no listen call works from then on until the app is restarted? Or is it only the one call right after ending the Skype call that is not working?
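One way an app could live with a single initialize call is to register one long-lived pair of handlers and let each screen attach its own listeners to them. The sketch below is plain Dart and is not the SpeechToTextProvider mentioned above; `SpeechCallbackHub` and its methods are hypothetical names, and both listener types are simplified to take a `String`, which is an assumption made to keep the example self-contained.

```dart
typedef StatusListener = void Function(String status);
typedef ErrorListener = void Function(String errorMsg);

/// Hypothetical app-level hub. Its forwardStatus and forwardError methods
/// would be handed to the plugin once, on the single initialize call.
/// Screens then attach and detach their own listeners as they come and go.
class SpeechCallbackHub {
  StatusListener? _onStatus;
  ErrorListener? _onError;

  /// Replaces whatever listeners the previous screen attached.
  void attach({StatusListener? onStatus, ErrorListener? onError}) {
    _onStatus = onStatus;
    _onError = onError;
  }

  /// Drops the current listeners, for example when a screen is disposed.
  void detach() {
    _onStatus = null;
    _onError = null;
  }

  /// Forwards a status change to the current listener, if any.
  void forwardStatus(String status) => _onStatus?.call(status);

  /// Forwards an error to the current listener, if any.
  void forwardError(String errorMsg) => _onError?.call(errorMsg);
}
```

With this arrangement a new widget instance never needs a second initialize; it calls `attach` when it appears and `detach` when it goes away.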

I'm going to add some more logging to the iOS version and have you run through the test again.

sowens-csd commented 3 years ago

Actually there could already be some interesting logging there. Could you open a Console session for the device you're using and see what logs are happening during the failure case?

sowens-csd commented 3 years ago

I got over my aversion to having Skype on my devices and have reproduced the error. Not sure what the problem is yet but at least I can reproduce locally, that should help.

sowens-csd commented 3 years ago

I've just committed a version that seems to address the problem with Skype. I'm not quite sure why, other than that it's checking for available input channels before installing the tap. It may well be that there's some side-effect of doing that check that is helping. Try this version out and let me know what you think.

Skype calls seem to cause a problem whether or not you use SpeechToText while the call is active. If you initialize SpeechToText, then start and end a Skype call, the next call to listen will fail. That is true even if listen has not been called before. Subsequent calls work properly, implying that the listen/stop sequence corrects whatever is wrong.

I believe at this point that the original problem of crashing is resolved. My feeling is that we should close this issue as long as there are no crashes. It may be necessary to open another issue for handling post Skype cleanup if this fix doesn't resolve that for you but I'd rather not have single issues with multiple problems in them. It makes them harder to keep track of.

tamoyal commented 3 years ago

I just tested the following sequence:
1) Initiate a Skype call.
2) Run my app and successfully use speech recognition.
3) End the Skype call from my computer (staying on the same screen in my app).
4) Speech recognition fails.

This is basically what you're referring to above right?

Would it make sense to allow init to run again in the event of a permanent error?

Also worth mentioning that if you ignore init calls, might be good to log a warning so the user knows. I still get callbacks after calling init multiple times and those callbacks are in new instantiations of a widget so I'm not seeing the ignore behavior you are describing where I won't get invocations of new callbacks on a second init.

Makes sense about the multiple issues if you want to go that route. Want me to open a new one?

sowens-csd commented 3 years ago

Interesting, that scenario worked for me. Not all that surprising though, pretty sure the last fix only worked because of a side effect in the Apple SDK and that could easily vary by OS release.

Try this last push and let me know if it changes the behaviour. It's a bit of a hammer because it does a reset of the audio engine before starting the listen. That certainly should clear anything that is hanging on to a channel.

sowens-csd commented 3 years ago

If this doesn't resolve it then I think we should open a new issue and I'll do a release of 2.3.0 with the previous fix. Others are waiting for the release and I don't want to hold it much longer.

sowens-csd commented 3 years ago

2.3.0 is out now with the previous version. If you're okay with it let's close this issue and create a new one for the Skype interactions since they're not well described by the title of this issue.

tamoyal commented 3 years ago

Yep, I'll test on 2.3.0 and then open a new issue. Thanks again!