yasirkula / UnitySpeechToText

A native Unity plugin to convert speech to text on Android & iOS
MIT License

Plugin works on Samsung Android devices but gets errors on Google Pixel Android devices #5

Open Aumarka opened 1 week ago

Aumarka commented 1 week ago

Description of the bug

I have mostly been testing a game that uses this package on Samsung Android devices, where the speech-to-text (STT) system works fine. On a Google Pixel 6a, however, it does not work as intended: users testing the app report that pressing the speak button works the first time, but every time after that they have to press the button twice.

On the first of those two presses, speech to text is cancelled instantly and they get either error 8 or error 11 (it alternates between the two). Pressing the button a second time then works as intended. In summary: after the game starts, the first press of the speak button works, but every later attempt needs two presses, with the first one producing error code 8 or 11.

The errors seem to indicate that the server has disconnected and/or the recognizer from the previous session is still active. However, I am confused as to why this doesn't happen on my Samsung devices and only happens on the Pixel devices.
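For reference, the two codes can be mapped to the names that Android documents for `android.speech.SpeechRecognizer`. The sketch below is only a lookup helper for reading logs; the class name `SpeechErrorNames` is hypothetical and is not part of the plugin, and only the two codes seen in this report are covered.

```java
// Hypothetical helper, not part of the plugin: maps the two error codes from
// this report to the constant names documented for android.speech.SpeechRecognizer.
final class SpeechErrorNames {
    private SpeechErrorNames() {}

    static String describe(int code) {
        switch (code) {
            case 8:
                // The recognition service is still busy, e.g. with a previous session.
                return "ERROR_RECOGNIZER_BUSY";
            case 11:
                // The connection to the recognition server was lost.
                return "ERROR_SERVER_DISCONNECTED";
            default:
                // Other codes are intentionally left unnamed in this sketch.
                return "UNKNOWN_ERROR_" + code;
        }
    }
}
```

Logging `SpeechErrorNames.describe(errorCode)` next to the raw number would make reports like this one easier to read.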

Reproduction steps

I unfortunately don't have access to the Google Pixel 6a devices that this error occurred on, as it was found by users testing the game remotely. Again, I have been testing the game personally on Samsung devices and the error has not occurred once on any of them. I will be getting a Google Pixel 6a to do further testing myself and will update with more findings when I can.

Platform specs

Unity version: 2022.3.21f1
Platform: Android
Device: Google Pixel 6a (device where this error has appeared); Samsung Galaxy A21s and Samsung Galaxy S23 (main devices I have been using for testing and where the issue hasn't occurred)
How did you download the plugin: GitHub

yasirkula commented 1 week ago

I suspect that SpeechToTextRecognitionListener.onError is invoked immediately but is then followed by actual speech data (logically, if an error occurs, it should be the end of the session).

I've added logs to the else clauses of the if( !isResultSent ) conditions to check this. You can replace SpeechToText.aar with the one inside this zip archive to get those logs and see if that's indeed the case.
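To illustrate what such a guard does, here is a minimal sketch of a "send only the first result" check with a log in the else branch. It is not the plugin's actual source: the class `SingleResultSession`, its fields, and the log wording are stand-ins, and the log is collected in a list instead of being written to logcat so the sketch runs anywhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one recognition session; NOT the plugin's real code.
final class SingleResultSession {
    private boolean isResultSent = false;

    // Results actually forwarded to the game, and every log line produced.
    final List<String> delivered = new ArrayList<>();
    final List<String> log = new ArrayList<>();

    // errorCode < 0 means "no error" in this sketch.
    void sendResult(String text, int errorCode) {
        if (!isResultSent) {
            // First callback of the session: forward it and close the gate.
            isResultSent = true;
            delivered.add(text);
            log.add("SendResult (" + errorCode + "): " + text);
        } else {
            // Any later callback is dropped, but logged so it can be diagnosed.
            log.add("SendResult is called but the result is already sent ("
                    + errorCode + "): " + text);
        }
    }
}
```

With this shape, an error followed by late speech data (or by a second error) shows up as an "already sent" line instead of disappearing silently.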

Aumarka commented 1 week ago

Hi, sorry for the delay, I have only just been able to get my hands on a Google Pixel 6a. I loaded my game and played through it. The first press of the speak button works fine, but from the second attempt on I have to press the button twice for it to work. Here are the logs from the second attempt.

Here it is in plain text:

2024/06/27 15:50:31.539 5172 5172 Warn Unity SpeechToText.SendResult is called but the result is already sent (8):
2024/06/27 15:50:31.542 5172 5172 Error Unity Speech recognition error code: 11
2024/06/27 15:50:31.559 5172 5172 Debug Unity SpeechToText.SendResult (11):
2024/06/27 15:50:31.573 5172 5190 Info Unity OnResultReceived: --- Error: 11

Any ideas on why this is happening?

yasirkula commented 1 week ago

I'm guessing there was an "Error Unity Speech recognition error code: 8" log before the "Warn Unity SpeechToText.SendResult is called but the result is already sent (8):" log. It looks like the speech recognizer first sent error code 8 and then sent error code 11. If there are no other logs, then my theory was incorrect: there is no actual speech data after the errors.

I'll try to think of other possible causes. In the meantime, if you get the chance, could you try using other Unity speech recognition plugins on GitHub? Maybe this issue is plugin-independent.

Aumarka commented 6 days ago

Hi, sorry I forgot to update. I think I was able to fix my issue, but I'll do some more thorough testing to make sure. It turns out that in OnResultReceived I had added a StopTextToSpeech call that runs once all the processing is done; it includes some extra code of mine along with SpeechToText.ForceStop. After changing ForceStop to Cancel, the STT functionality works as intended on both my Samsung and Pixel devices, and those errors no longer appear in logcat.

I'm not sure if that's the best solution, so I'll do more testing over the coming weeks. At the very least, I know the issue was at least partly caused by my improper use of the ForceStop method.

yasirkula commented 6 days ago

Ah, glad to hear it. I'd recommend calling neither ForceStop nor Cancel in OnResultReceived. You could add a parameter to StopTextToSpeech specifying whether the source is OnResultReceived and if it's true, you can skip ForceStop/Cancel.
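A minimal sketch of that suggestion follows. In the Unity project this logic would live in a C# script and call the plugin's ForceStop/Cancel; Java is used here only to keep the example self-contained, and every name in it (`SpeechController`, `stopTextToSpeech`, `cancelCalls`) is a hypothetical stand-in rather than the plugin's API.

```java
// Hypothetical stand-in for the game's own speech controller; not plugin code.
final class SpeechController {
    // Counts how often the stand-in for Cancel/ForceStop was invoked.
    int cancelCalls = 0;
    boolean listening = false;

    void start() {
        listening = true;
    }

    // Stand-in for the result callback: the session has already ended here,
    // so the recognizer must not be stopped or cancelled again.
    void onResultReceived(String spokenText) {
        listening = false;
        stopTextToSpeech(true);
    }

    // Shared cleanup. The flag tells it whether the caller is the result callback.
    void stopTextToSpeech(boolean calledFromResultCallback) {
        // ... the game's own cleanup (UI state, timers, etc.) would go here ...
        if (!calledFromResultCallback) {
            // Only a manual stop (e.g. the user pressing a button) reaches this
            // branch; this is where Cancel or ForceStop would be called.
            cancelCalls++;
            listening = false;
        }
    }
}
```

Called as `stopTextToSpeech(false)` from a stop button and as `stopTextToSpeech(true)` from the result callback, the shared cleanup still runs in both cases while the recognizer is only touched during an active session.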