from faster_whisper import WhisperModel, BatchedInferencePipeline
model = WhisperModel("medium", device="cuda", compute_type="float16")
batched_model = BatchedInferencePipeline(model=model)
The batched version improves speed by up to 10-12x compared to the OpenAI implementation and 3-4x compared to the sequential faster_whisper version. It works by transcribing semantically meaningful audio chunks as batches, leading to faster inference.
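Continuing the snippet above, the batched call would look roughly like this (the audio path and batch_size value are illustrative placeholders, following faster_whisper's BatchedInferencePipeline API):
segments, info = batched_model.transcribe("audio.mp3", batch_size=16)  # batch_size controls how many chunks run per forward pass
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))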
Will RealtimeSTT support the batched version?
Thanks!