k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0

【flutter】The UI process will stall #1140

Open sunjiaming opened 2 months ago

sunjiaming commented 2 months ago

When I run the flutter-examples project and click "start asr" for the first time, the UI freezes. How can this be optimized?

In addition, when I implemented the model loading myself, the UI also froze, which makes for a poor experience.

while (_recognizer!.isReady(_stream!)) { _recognizer!.decode(_stream!); }

This loop also blocks the UI.

csukuangfj commented 2 months ago

Would you like to add an isolate to load the model and to do the recognition?
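As a rough illustration of the isolate suggestion above, here is a minimal sketch in plain Dart. It assumes the expensive step can be wrapped in an ordinary function; `heavyWork` is a hypothetical stand-in for model loading or decoding and is not part of the sherpa-onnx API. `Isolate.run` requires Dart 2.19 or later.

```dart
import 'dart:isolate';

// Hypothetical stand-in for expensive work such as loading a model.
// It is NOT a sherpa-onnx function; it only burns CPU time.
int heavyWork(int n) {
  var sum = 0;
  for (var i = 0; i < n; i++) {
    sum += i % 7;
  }
  return sum;
}

Future<void> main() async {
  // Isolate.run spawns a short-lived isolate, runs the closure there,
  // and completes the returned future with the closure's result.
  // The calling isolate keeps processing events while it waits.
  final result = await Isolate.run(() => heavyWork(50000000));
  print('heavy work finished: $result');
}
```

In a Flutter app the `await` would sit in an event handler, so the UI isolate stays free to render frames while the work runs elsewhere. Whether the recognizer object itself can be handed back to the UI isolate is not covered here; creating and using it inside the same worker isolate is the more conservative assumption.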

sunjiaming commented 2 months ago

@csukuangfj That might work. I suggest the maintainers provide an official example.

csukuangfj commented 2 months ago

We don't have much experience with Flutter, and it would take us a very long time to add an isolate for that.

sunjiaming commented 2 months ago

Looking forward to an official follow-up.

BrutalCoding commented 1 month ago

@sunjiaming The "official" follow-up has a lot of TODOs, so I think this specific issue will be a low priority for them (which is fair IMO).

I will implement this feature sometime this month in sherpa-onnx. Also, @csukuangfj is correct to point to isolates. I will have to check which methods are heavy. I think it will be loading the assets the first time (due to copying from Flutter to the native filesystem) and, of course, any computationally heavy methods.

From my experience, VAD and TTS seem fast, although that depends on how much data you give them. STT with something like Whisper is heavier, which can cause a CPU spike and thus freeze the UI (main thread). Nonetheless, I will spend some time on this issue and improve the overall experience.
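For the streaming STT case described above, one possible shape is a long-lived worker isolate that receives audio chunks and sends results back. This is only a sketch under assumptions: `workerMain` and the `'stop'`/`'stopped'` messages are hypothetical names, and the energy computation is a placeholder for where the recognizer calls (such as the `isReady`/`decode` loop from the original report) would go.

```dart
import 'dart:async';
import 'dart:isolate';
import 'dart:typed_data';

// Entry point of the worker isolate (hypothetical name). In a real app the
// recognizer and stream would be created here, so every heavy call stays
// off the UI isolate.
void workerMain(SendPort toMain) {
  final fromMain = ReceivePort();
  toMain.send(fromMain.sendPort); // give the main isolate a way to reach us

  fromMain.listen((message) {
    if (message is Float32List) {
      // Placeholder for feeding samples to a stream and decoding them.
      var energy = 0.0;
      for (final s in message) {
        energy += s * s;
      }
      toMain.send('chunk of ${message.length} samples, energy $energy');
    } else if (message == 'stop') {
      toMain.send('stopped');
      fromMain.close(); // no open ports remain, so the worker can exit
    }
  });
}

Future<void> main() async {
  final fromWorker = ReceivePort();
  await Isolate.spawn(workerMain, fromWorker.sendPort);

  await for (final message in fromWorker) {
    if (message is SendPort) {
      // First message is the worker's port: send one audio chunk, then stop.
      message.send(Float32List.fromList([0.5, -0.5, 0.25]));
      message.send('stop');
    } else if (message == 'stopped') {
      fromWorker.close(); // ends the await-for loop
    } else {
      print(message); // in Flutter, this is where the UI would be updated
    }
  }
}
```

Compared with a one-shot `Isolate.run`, a persistent worker avoids respawning an isolate for every audio chunk, which matters when chunks arrive continuously from a microphone.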

Here's the official isolates documentation: https://dart.dev/language/isolates

Here's how I use isolates to solve a similar issue in a Flutter plugin I'm still working on: https://github.com/BrutalCoding/aub.ai/blob/b16f7e5f9d317d83b38a5814afb1556c8de8a3d3/lib/aub_ai.dart#L503C1-L504C1

@csukuangfj can you assign this issue to me?

csukuangfj commented 1 month ago

That would be great! Thank you!