ahmetoner / whisper-asr-webservice

OpenAI Whisper ASR Webservice API
https://ahmetoner.github.io/whisper-asr-webservice
MIT License
1.86k stars, 332 forks

Support for Whisper JAX #175

Open 0xT3chN0 opened 7 months ago

0xT3chN0 commented 7 months ago

Hi, thanks for this software. It works perfectly with Bazarr for subtitle translation.

However, even with Faster-Whisper and a Tesla T4 GPU, it is a bit slow. There is a new Whisper implementation that can transcribe 1 hour of audio in approx. 15 seconds.

The new Whisper implementation is called "Whisper JAX" (https://github.com/sanchit-gandhi/whisper-jax). It has support for CPU, GPU and even TPU, though there is already a big speed gain just by using a GPU.

Would it be possible to add this Whisper implementation to the ASR webservice, so that you can select between OpenAI Whisper, Faster Whisper and Whisper JAX?
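To make the request concrete, here is a rough sketch of what selectable engines could look like. It is not how the webservice is actually structured. The registry, the `register` and `select_engine` helpers, and the `whisper_jax` engine name are all hypothetical, and reading the choice from an `ASR_ENGINE` environment variable is an assumption. Only `FlaxWhisperPipline` (spelled that way upstream) and its `["text"]` output come from the whisper-jax README.

```python
import os
from functools import lru_cache
from typing import Callable, Dict, Optional

# Hypothetical registry: engine name -> function that turns an audio path into text.
ENGINES: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a transcription function to the registry."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        ENGINES[name] = fn
        return fn
    return wrap


@lru_cache(maxsize=1)
def _load_whisper_jax_pipeline():
    # Imported lazily so other engines keep working without whisper-jax installed.
    from whisper_jax import FlaxWhisperPipline  # class name as spelled upstream
    return FlaxWhisperPipline("openai/whisper-large-v2")


@register("whisper_jax")  # hypothetical engine name
def transcribe_whisper_jax(audio_path: str) -> str:
    pipeline = _load_whisper_jax_pipeline()
    return pipeline(audio_path, task="transcribe")["text"]


def select_engine(name: Optional[str] = None) -> Callable[[str], str]:
    """Pick an engine by explicit name, else by ASR_ENGINE (assumed variable)."""
    name = name or os.environ.get("ASR_ENGINE", "whisper_jax")
    if name not in ENGINES:
        raise ValueError(f"Unknown ASR engine {name!r}; available: {sorted(ENGINES)}")
    return ENGINES[name]
```

The existing OpenAI Whisper and Faster Whisper backends would be registered the same way, so something like `select_engine()("episode.mp3")` would run whichever engine is configured.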

Thanks!

Jessomadic commented 7 months ago

This seems very interesting, if it's easy to implement.

Jessomadic commented 7 months ago

@nicholaskoerfer How accurate is it? It's not worth gaining speed if the subtitles are worse.

Jessomadic commented 7 months ago

Hugging Face has a web version up and holy crap it's fast. It did a 21-minute YouTube video in about 8 seconds.

0xT3chN0 commented 7 months ago

Yes, it's incredibly fast. In my opinion, the subtitles are accurate.

Jessomadic commented 5 months ago

Is this going to be worked on? @ahmetoner

blundercode commented 5 months ago

Hmm, this is pretty interesting, great find! It would be cool to have a near-instant service even if the quality is slightly worse.