Closed iwangjian closed 1 year ago
Hi. For real-time processing, ASR on each 1-second interval of audio must finish in less than 1 second. Real-time processing may be difficult because Whisper is currently slow in general.
Processing time is mainly determined by GPU performance and model size.
Therefore, specifying a small model like --model tiny is one way.
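To check whether a given model size keeps up on your hardware, you can measure the real-time factor: processing time divided by audio duration. The sketch below is only an illustration; `transcribe` is a hypothetical callable standing in for whatever runs the model on one chunk, not a function from this project.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """Return compute time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe, chunk, audio_seconds):
    """Time one call to transcribe(chunk) and return its real-time factor.

    `transcribe` is a placeholder (assumption) for the actual ASR call.
    """
    start = time.perf_counter()
    transcribe(chunk)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)


def keeps_up(rtf):
    """A stream only keeps up if each chunk is processed faster than it plays."""
    return rtf < 1.0
```

If the measured factor stays above 1.0 for 1-second chunks, a smaller model or a faster GPU is needed before tuning other parameters will help.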
Another way is to use VAD, which is much lighter than Whisper's processing. If the VAD determines that a section is silent, Whisper processing is skipped for that section.
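The idea of gating Whisper behind a cheap silence check can be sketched as follows. This uses a plain RMS energy threshold as a stand-in for a real VAD, so it is not how this project's --vad option is implemented; the `threshold` value and the `transcribe` callable are assumptions for illustration.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a chunk of float samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples, threshold=0.01):
    """Treat a chunk as silent when its energy is below the (assumed) threshold."""
    return rms(samples) < threshold


def process_chunk(samples, transcribe, threshold=0.01):
    """Run the expensive ASR call only on chunks that are not silent.

    `transcribe` is a hypothetical callable wrapping the Whisper call.
    Returns None for skipped (silent) chunks.
    """
    if is_silent(samples, threshold):
        return None
    return transcribe(samples)
```

A trained VAD model is usually more robust to background noise than an energy threshold, but the control flow is the same: the cheap check runs on every chunk and the expensive model runs only when speech is likely.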
Got it, thank you very much!
Hi, I appreciate your excellent project! I ran the server and the client successfully. However, I found that ASR responds slowly, even though I set --frame to a smaller value (i.e., 100), --num_block to 80, and --vad to 0. Is it possible to use your project for real-time streaming ASR? If so, could you tell me how to set the parameters properly? Thank you.