[Ready] Whisper large triton support

k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi

https://k2-fsa.github.io/sherpa

Apache License 2.0


Closed by yuekaizhang 1 year ago

yuekaizhang commented 1 year ago

Add support for Whisper via ONNX FP16, served with Triton.

Some perf results attached here:

Decoding on a single V100 GPU using the AISHELL-1 test set files; each audio clip is padded to 30 s.

Model	Backend	Concurrency	RTF
Large-v2	ONNX FP16	4	0.14
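
For readers unfamiliar with the metric, here is a minimal sketch of how padding to 30 s and a real-time factor (RTF) could be computed. It is not taken from this PR: the helper names `pad_or_trim` and `real_time_factor` are hypothetical, the 16 kHz sample rate is Whisper's usual input rate, and whether the reported RTF divides by the padded or the original audio duration is an assumption left to the caller.

```python
SAMPLE_RATE = 16000   # Whisper models expect 16 kHz mono audio
CHUNK_SECONDS = 30    # padding length mentioned in the description above


def pad_or_trim(samples, sample_rate=SAMPLE_RATE, seconds=CHUNK_SECONDS):
    """Return a copy of `samples` zero-padded or cut to exactly `seconds` of audio.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    target = sample_rate * seconds
    if len(samples) >= target:
        return list(samples[:target])
    return list(samples) + [0.0] * (target - len(samples))


def real_time_factor(wall_clock_seconds, audio_durations_seconds):
    """RTF = total processing time / total audio duration (lower is faster).

    With concurrent requests, `wall_clock_seconds` is assumed to be the
    elapsed time for the whole run, not the sum of per-request latencies.
    """
    total_audio = sum(audio_durations_seconds)
    if total_audio <= 0:
        raise ValueError("total audio duration must be positive")
    return wall_clock_seconds / total_audio


if __name__ == "__main__":
    one_second_clip = [0.1] * SAMPLE_RATE
    padded = pad_or_trim(one_second_clip)
    print(len(padded) / SAMPLE_RATE)  # duration in seconds after padding

    # Made-up numbers, only to show the arithmetic.
    print(real_time_factor(42.0, [100.0, 200.0]))
```

Under this definition, an RTF of 0.14 means that one second of audio takes roughly 0.14 s of wall-clock time to decode at the stated concurrency.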

yuekaizhang commented 1 year ago

@csukuangfj Would you mind checking this PR when you are free? Many thanks!

yuekaizhang commented 1 year ago

Thanks! Left some minor comments.

Thanks, done!