k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi
https://k2-fsa.github.io/sherpa
Apache License 2.0

Is there VAD in Triton whisper? #601

Closed: evanxqs closed this issue 1 month ago

evanxqs commented 1 month ago

I see hallucinations when using Triton Whisper with TensorRT-LLM in sherpa. In my experience, a VAD removes most of them. Is there a VAD in Triton Whisper that I can enable directly, or do I have to integrate one myself?

yuekaizhang commented 1 month ago

You can integrate one yourself by following this example: https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/vad-with-non-streaming-asr.py. Contributions are welcome if you do.

yuekaizhang commented 1 month ago

Or you could do VAD at the client side.
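For illustration only, here is a minimal sketch of what client-side voice activity detection could look like before audio is sent to the server. It uses a simple RMS energy threshold as a stand-in, which is not silero-vad and is much less robust than a neural VAD. The function name `detect_speech_segments` and all parameter defaults are hypothetical, and the step that sends each segment to the Triton server is deliberately left out.

```python
import math
from typing import List, Sequence, Tuple


def detect_speech_segments(
    samples: Sequence[float],
    sample_rate: int = 16000,
    frame_ms: int = 30,
    rms_threshold: float = 0.02,   # assumed value; tune for your audio
    min_silence_frames: int = 10,  # silence frames needed to close a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of regions whose frame RMS exceeds the threshold.

    Hypothetical helper: an energy-based stand-in for a real VAD.
    """
    frame_len = sample_rate * frame_ms // 1000
    segments: List[Tuple[int, int]] = []
    start = None          # start sample of the segment being built
    last_voiced_end = 0   # end sample of the most recent voiced frame
    silence_run = 0       # consecutive silent frames inside a segment

    for begin in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[begin:begin + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        if rms >= rms_threshold:
            if start is None:
                start = begin
            last_voiced_end = begin + frame_len
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                segments.append((start, last_voiced_end))
                start = None
                silence_run = 0

    # Close a segment that is still open at the end of the audio.
    if start is not None:
        segments.append((start, last_voiced_end))
    return segments
```

The client would then slice `samples[start:end]` for each returned pair and send only those slices for recognition, so silent regions never reach Whisper.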

evanxqs commented 1 month ago

Thank you! Do you know which VAD is better, the one in that example or silero-vad (https://github.com/snakers4/silero-vad)?

yuekaizhang commented 1 month ago

They are exactly the same model, so it's up to you.