riderodd / react-native-vosk

Speech recognition module for React Native using the Vosk library
MIT License
36 stars, 9 forks

Better implementation #38

Closed kingdcreations closed 6 months ago

kingdcreations commented 7 months ago
> @riderodd I am facing a problem where the `onFinalResult` event is never triggered, and I believe this change is the reason since everything is cleaned up on the first result. Does it make sense?

_Originally posted by @joazvsoares in https://github.com/riderodd/react-native-vosk/pull/32#discussion_r1411547254_

kingdcreations commented 7 months ago

Hi @joazvsoares!

From what I've tried, the onFinalResult() event fires when you call stop() on your Vosk instance while it is still recognizing. It will not fire if you have already received a recognition result (the onResult() event), as demonstrated in https://github.com/riderodd/react-native-vosk/blob/test/onFinalResult/example/src/App.tsx
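To make the two cases concrete, here is a minimal sketch that models the behaviour described above with a stand-in class. It is not the library's code: `MockRecognizer`, `feed` and `silence` are hypothetical names used only for illustration, and the real module emits these events from the native side.

```typescript
// Hypothetical stand-in for the native recognizer; NOT the react-native-vosk API.
type Listener = (text: string) => void;

class MockRecognizer {
  private recognizing = false;
  private partial = "";
  private onResult: Listener;
  private onFinalResult: Listener;

  constructor(onResult: Listener, onFinalResult: Listener) {
    this.onResult = onResult;
    this.onFinalResult = onFinalResult;
  }

  start(): void {
    this.recognizing = true;
    this.partial = "";
  }

  // Simulates audio being decoded into words (assumed helper).
  feed(words: string): void {
    if (this.recognizing) {
      this.partial = (this.partial + " " + words).trim();
    }
  }

  // Simulates the silence that ends an utterance:
  // emits onResult, then cleans everything up.
  silence(): void {
    if (!this.recognizing) return;
    this.recognizing = false;
    this.onResult(this.partial);
  }

  // Emits onFinalResult only if recognition is still running.
  stop(): void {
    if (!this.recognizing) return;
    this.recognizing = false;
    this.onFinalResult(this.partial);
  }
}

const events: string[] = [];
const makeRecognizer = (): MockRecognizer =>
  new MockRecognizer(
    (text) => { events.push("result:" + text); },
    (text) => { events.push("final:" + text); },
  );

// Case 1: stop() while still recognizing, so onFinalResult fires.
const first = makeRecognizer();
first.start();
first.feed("hello");
first.stop();

// Case 2: a result was already delivered, so stop() emits nothing.
const second = makeRecognizer();
second.start();
second.feed("world");
second.silence();
second.stop();

console.log(events);
```

In this model the second recognizer never reports a final result, which matches the symptom reported in the original comment.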

I hope it helps you

kingdcreations commented 7 months ago

@riderodd I cleaned up and linted the code in this PR and added a stop button to test the onFinalResult() event.

Maybe we could also send an onResult() instead of an onFinalResult() there: https://github.com/riderodd/react-native-vosk/blob/test/onFinalResult/android/src/main/java/com/vosk/VoskModule.kt

What do you think?

joazvsoares commented 7 months ago

Hi @kingdcreations!

Let's imagine the following use case from the vox-browser-demo project:

[image]

When you click on "Speak", recognition starts and you can receive onResult events.

This event fires after a silence occurs, but that does not mean we want to stop listening or recognizing.

As you can see here, we can still receive events from the same recognizer instance:

    recognizer.on("result", (message: any) => {
      const result: VoskResult = message.result;
      setUtterances((utt: VoskResult[]) => [...utt, result]);
    });

The issues I see with the current implementation are:

1. The current onResult function deliberately closes the recognition instance after the first result, which removes the ability to keep receiving results.

2. The role of the start function can be confusing, as it is atypical for a "start" function to also return results.

My proposal is to define a start function that purely initiates the process without returning results, introduce a separate function responsible for returning the first result if needed, and leave it up to the client to decide when to close the recognition instance.
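A rough sketch of what such an API shape could look like is below. Everything in it is an assumption made for illustration: `ContinuousRecognizer`, `onResult`, `onceResult` and `emit` are hypothetical names, and `emit` only stands in for the native side delivering a result after a silence. It is not the current or planned react-native-vosk API.

```typescript
// Hypothetical API shape for the proposal; names are illustrative only.
type ResultListener = (text: string) => void;

class ContinuousRecognizer {
  private listeners: ResultListener[] = [];
  private running = false;

  // Only begins recognition; it does not return any result.
  start(): void {
    this.running = true;
  }

  // Subscribes to every result; returns a function that unsubscribes.
  onResult(listener: ResultListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Separate helper for callers that only want the first result.
  // Recognition keeps running afterwards.
  onceResult(listener: ResultListener): void {
    const unsubscribe = this.onResult((text) => {
      unsubscribe();
      listener(text);
    });
  }

  // The client decides when recognition ends.
  stop(): void {
    this.running = false;
  }

  // Stand-in for the native side emitting a result after a silence.
  emit(text: string): void {
    if (!this.running) return;
    // Iterate over a copy so listeners may unsubscribe during dispatch.
    for (const listener of this.listeners.slice()) {
      listener(text);
    }
  }
}

const recognizer = new ContinuousRecognizer();
const utterances: string[] = [];
const firstOnly: string[] = [];

recognizer.onResult((text) => { utterances.push(text); });
recognizer.onceResult((text) => { firstOnly.push(text); });

recognizer.start();
recognizer.emit("one");
recognizer.emit("two");   // still delivered: nothing was closed after "one"
recognizer.stop();        // explicit decision by the client
recognizer.emit("three"); // ignored, recognition has been stopped

console.log(utterances, firstOnly);
```

Separating `start` from result delivery keeps the continuous case and the "first result only" case independent, so neither forces the instance to close.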

kingdcreations commented 7 months ago

Hi @joazvsoares,

Yeah, that seems to be a better implementation. I'll work on a pull request for a potential v2.