Feature Request: Whisper TensorRT-LLM backend support

yuekaizhang commented 11 months ago

Hi WhisperX Team, I was wondering whether you would consider supporting the TensorRT-LLM backend for Whisper: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/whisper. I ran several benchmark tests using https://huggingface.co/datasets/hf-internal-testing/librispeech_asr_dummy and the large-v3 model. The results are below:

V100 GPU	faster-whisper	TRT-LLM
batch size 1	38 secs Decoding Time, 2.74% Word Error Rate	22 secs Decoding Time, 2.40% Word Error Rate
batch size 4	Batch decoding not supported (could try WhisperX)	15 secs Decoding Time, 2.40% Word Error Rate
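
For anyone who wants to reproduce the Word Error Rate column, here is a minimal sketch of the metric in Python: the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. This is not the script used for the numbers above. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions for illustration, and real benchmarks usually apply a fuller text normalizer and sum errors over the whole dataset rather than per utterance.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); normalization here is only
    lowercasing and whitespace splitting, which is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on a mat"
    print(f"WER: {word_error_rate(ref, hyp):.2%}")
```

Note that the value can exceed 100% when the hypothesis contains many inserted words, since the denominator is the reference length only.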

shashikg commented 10 months ago

Dropping this link here in case anyone else is interested in Whisper's TensorRT-LLM integration with a speech-to-text pipeline: https://github.com/shashikg/WhisperS2T/releases/tag/v1.3.0

@yuekaizhang ^^

haiderasad commented 1 month ago

@shashikg @yuekaizhang Is there any TensorRT-LLM integration for transcription + diarization?
