NVIDIA-AI-IOT / whisper_trt

A project that optimizes Whisper for low latency inference using NVIDIA TensorRT

Add faster_whisper_trt #3

Closed: yuvraj108c closed this issue 1 month ago

yuvraj108c commented 1 month ago

Hi, there is a roughly 4x faster implementation of Whisper, faster-whisper (https://github.com/SYSTRAN/faster-whisper).

Is it possible to optimise it using TensorRT, the way this repository does for Whisper?

jaybdub commented 1 month ago

Hi @yuvraj108c,

Thanks for reaching out!

AFAIK, faster-whisper gets its speedup by using CTranslate2, which is its own runtime, separate from TensorRT. A model runs on one runtime or the other, so the benefits can't be stacked.

However, according to our profiling, WhisperTRT is already 30% faster than faster-whisper on Jetson Orin Nano. You can find this and other benchmarks in the README.
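
For anyone who wants to reproduce a comparison like this on their own device, here is a minimal timing sketch. It is not the benchmark script behind the README numbers: the helper names `median_latency` and `percent_faster` are made up for illustration, and it assumes "X% faster" means X% higher throughput on the same input, which is only one possible reading of the figure.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> float:
    """Return the median wall-clock time, in seconds, of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (engine load, CUDA init, caches)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def percent_faster(baseline_s: float, candidate_s: float) -> float:
    """Throughput gain of the candidate over the baseline, in percent.

    Assumption: "faster" is measured as baseline_time / candidate_time - 1.
    """
    return (baseline_s / candidate_s - 1.0) * 100.0
```

To use it, wrap each library's own transcribe call on the same audio file in a zero-argument callable and pass both to `median_latency`, then feed the two results to `percent_faster`. One thing to watch for: faster-whisper returns its segments lazily, so the wrapper needs to consume them (for example with `list(segments)`), otherwise the timing won't include the actual decoding.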

Hope this helps, let me know if you have any other questions.

Best, John

yuvraj108c commented 1 month ago

Thanks, I'll check it out!