jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License

Parallelized stable-ts Implementation + Live Demo #264

Open mvoodarla opened 11 months ago

mvoodarla commented 11 months ago

Hey folks - I'm sure this is a common idea, but we implemented a version of stable-ts that first scans the entire audio file, splits it at silences, and then passes the resulting chunks through stable-ts in parallel. Here is the implementation. You only benefit from this if you have multiple machines running stable-ts in the cloud, or some sort of queuing mechanism.
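
To make the idea concrete, here is a minimal sketch of the split-then-fan-out approach. It is not the linked implementation. The names `split_on_silence`, `transcribe_parallel`, and `transcribe_chunk` are hypothetical, the silence detector is a simple amplitude threshold, and a thread pool stands in for the multiple machines or queue you would use in practice. `transcribe_chunk` is whatever callable runs stable-ts on one chunk and returns its text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple


def split_on_silence(samples: Sequence[float], sample_rate: int,
                     threshold: float = 0.01,
                     min_silence_s: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges of non-silent audio.

    A run of at least `min_silence_s` seconds with |amplitude| below
    `threshold` counts as a silence and separates two chunks.
    """
    min_silence = max(1, int(min_silence_s * sample_rate))
    chunks: List[Tuple[int, int]] = []
    start = None      # first sample of the chunk being built
    last_loud = None  # most recent non-silent sample
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_silence:
            # Silence is long enough: close the current chunk.
            chunks.append((start, last_loud + 1))
            start = None
    if start is not None:
        chunks.append((start, last_loud + 1))
    return chunks


def transcribe_parallel(samples: Sequence[float], sample_rate: int,
                        transcribe_chunk: Callable[[Sequence[float], int], str],
                        max_workers: int = 4) -> List[Dict]:
    """Transcribe each non-silent chunk concurrently, keeping audio order.

    `transcribe_chunk` is a placeholder (assumption) for a function that
    runs stable-ts on one chunk, possibly on a remote worker.
    """
    ranges = split_on_silence(samples, sample_rate)

    def work(r: Tuple[int, int]) -> Dict:
        start, end = r
        text = transcribe_chunk(samples[start:end], sample_rate)
        # Keep the chunk's offset in seconds so per-chunk timestamps
        # can be shifted back onto the original timeline.
        return {"start": start / sample_rate,
                "end": end / sample_rate,
                "text": text}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(work, ranges))  # map() preserves input order
```

Any timestamps stable-ts returns for a chunk are relative to that chunk, so they need the chunk's `start` offset added before the results are merged.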

I work at Sieve, which builds a related infrastructure product that makes this kind of thing easy to do. Here's a demo you can run with your own data.

None of this requires Sieve, but hopefully it's an interesting idea the community can benefit from if you're trying to squeeze out every last bit of speed and have the compute setup to do so!