jamsch / expo-speech-recognition

Speech Recognition for React Native Expo projects
MIT License

How does this compare to react-native-voice? #35

Closed by TowhidKashem 1 month ago

TowhidKashem commented 1 month ago

I'm wondering if it's worth switching. react-native-voice also supports speech recognition, but one shortcoming is that you can't get audio levels as you speak, which would come in handy if you wanted to show a visualization that changes with the speech (a common pattern).

Another shortcoming of the other lib is that it doesn't detect when speech ends, so you have to use a timer that resets itself each time speech is detected. If no speech is detected for a few seconds, you can treat that as a proper end of speech and run any follow-up tasks, like sending the speech to a server. Does this lib have some built-in support for an "onSpeechEnd"-like prop so you don't have to implement it yourself?

Does this lib support that and is it more performant?
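
For readers unfamiliar with the timer workaround described above, here is a minimal sketch of the idea in TypeScript. `SilenceTimer`, `speechDetected`, and the injectable `schedule`/`unschedule` parameters are hypothetical names made up for illustration; they are not part of react-native-voice or expo-speech-recognition.

```typescript
// Hypothetical helper illustrating the "resetting timer" workaround.
// None of these names come from react-native-voice or expo-speech-recognition.
type Schedule = (fn: () => void, ms: number) => unknown;
type Unschedule = (handle: unknown) => void;

class SilenceTimer {
  private handle: unknown = undefined;

  constructor(
    private readonly silenceMs: number,
    private readonly onSpeechEnd: () => void,
    // The scheduler is injectable so the logic can be exercised without real timers.
    private readonly schedule: Schedule = (fn, ms) => setTimeout(fn, ms),
    private readonly unschedule: Unschedule = (h) =>
      clearTimeout(h as ReturnType<typeof setTimeout>),
  ) {}

  // Call this from whatever "speech detected" or partial-result callback
  // the recognizer exposes. Each call restarts the countdown.
  speechDetected(): void {
    this.cancel();
    this.handle = this.schedule(() => {
      this.handle = undefined;
      this.onSpeechEnd(); // no speech for silenceMs: treat it as the end of speech
    }, this.silenceMs);
  }

  // Stop the countdown, e.g. when recognition is stopped manually.
  cancel(): void {
    if (this.handle !== undefined) {
      this.unschedule(this.handle);
      this.handle = undefined;
    }
  }
}
```

With the default scheduler, `new SilenceTimer(2000, sendToServer)` would call the (assumed) `sendToServer` function two seconds after the last `speechDetected()` call.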

jamsch commented 1 month ago

Hey @TowhidKashem, regarding audio levels: it's not implemented in this library either. However, I think it's quite simple to implement, and it should be in the library soon, once I figure out what its API should be.

Regarding the detection of speech ending: this library has a continuous mode setting (to match the Web Speech API spec). If you don't enable that setting when starting, the speech recognizer will automatically stop after around 1-3 seconds of no speech. For iOS, this is set to 3 seconds, although I don't yet have an API to configure this. For Android, you can configure the androidIntent options when you call start():

  • EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS
  • EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS
  • EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS

TowhidKashem commented 1 month ago

Awesome, that sounds great, thank you! Really looking forward to the sound levels during speech. There's literally no other lib that supports this (I looked lol). The closest is an audio recording lib that gives you the sound levels during recording, but not when speaking without recording or when audio is being played from the other end (such as during a simulated phone call experience). So if your lib offers that, it would be a big deal!
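
For anyone landing on this thread later, here is a rough sketch of how the Android silence options listed earlier might be assembled before calling start(). The three `EXTRA_...` names are taken from the thread. The `SketchStartOptions` type, the `buildStartOptions` helper, and the `lang`, `continuous`, and `androidIntentOptions` keys are assumptions made for illustration, so check the library's own types for the real option names.

```typescript
// Hypothetical option shape, for illustration only. The real library's
// start() options may be named or typed differently.
interface SketchStartOptions {
  lang: string;
  continuous: boolean; // assumed key for the "continuous mode setting"
  androidIntentOptions?: Record<string, number>; // assumed key for the Android intent options
}

// Builds options that ask Android to wait `silenceMs` of silence before
// treating the input as complete, with a minimum listening time of `minimumMs`.
function buildStartOptions(silenceMs: number, minimumMs: number): SketchStartOptions {
  return {
    lang: "en-US",
    continuous: false, // leave continuous mode off so recognition stops on silence
    androidIntentOptions: {
      EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: silenceMs,
      EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS: silenceMs,
      EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS: minimumMs,
    },
  };
}

// The resulting object would then be passed to the library's start() call.
const exampleOptions = buildStartOptions(3000, 1000);
```

These values only affect Android. Per the reply above, the iOS timeout is fixed at 3 seconds for now.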