JuergenFleiss / aTrain

A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.

Feature request: Support for large-v3-turbo model #35

Open nhan000 opened 3 days ago

nhan000 commented 3 days ago

The turbo model is an optimized version of large-v3 that offers faster transcription with minimal degradation in accuracy.

openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

large-v3-turbo model by jongwook · Pull Request #2361 · openai/whisper

Thanks a lot for making Whisper so easy to use!

JuergenFleiss commented 3 days ago

It will be included as the default model in the upcoming aTrain 1.2 release, specifically the even faster CTranslate2-based version (https://github.com/SYSTRAN/faster-whisper). We are in the final testing phase.

You can also use it if you build it yourself from https://github.com/JuergenFleiss/aTrain/tree/v1.2

Transcription times will be reduced by around 30% compared to large-v3.
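To make the "around 30%" figure concrete, here is a minimal arithmetic sketch. The function names and the 600-second baseline are hypothetical and chosen only for illustration; they are not measurements from aTrain. The sketch also shows that a 30% reduction in time corresponds to a speedup of roughly 1.4x, not 1.3x.

```python
def estimated_time(baseline_seconds: float, reduction: float = 0.30) -> float:
    """Estimate the transcription time after a fractional time reduction.

    `reduction` is the fraction of time saved (0.30 means 30% less time).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return baseline_seconds * (1.0 - reduction)


def speedup_factor(reduction: float = 0.30) -> float:
    """Convert a fractional time reduction into a throughput speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical baseline: a recording that takes 10 minutes with large-v3.
    baseline = 600.0
    print(f"Estimated time: {estimated_time(baseline):.0f} s")
    print(f"Speedup factor: {speedup_factor():.2f}x")
```

Actual savings will depend on hardware, audio length, and settings, so treat the 30% value as an approximate input rather than a guarantee.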

nhan000 commented 2 hours ago

Would you also consider adding support for large-v3?