This pull request contains a quick proof of concept to showcase how easy it is to add a new pipeline. In this example I added a speech-to-text pipeline using openai/whisper-large-v3. The beautiful thing about this pipeline is that it can be cold served, since model load times are < 3 s. Furthermore, it only requires 6.5 GB of VRAM and can therefore run on lower-VRAM cards.

You can test it out by grabbing audio samples from https://audio-samples.github.io/ and starting up a pipeline as described in the Runner documentation. You can then execute the pipeline via https://localhost:8000/docs.