Kytsuine / whisper-llama-subgen

Project to use WhisperX to automatically transcribe subtitles, then revise the subtitles using LLaMa to predict misheard words
GNU Affero General Public License v3.0

Access and parse confidence scores and alternate transcriptions from WhisperX output #1

Open Kytsuine opened 1 year ago

Kytsuine commented 1 year ago

Going off of https://github.com/openai/whisper/discussions/284, it seems possible to alter Whisper so that confidence scores are exposed. If this is done, we could greatly reduce the number of places where we need to call LLaMa, which would cut the script's execution time. By sending only phrases that contain low-scored tokens to LLaMa, we would refine only the text that is likely to need refining. Alternate transcriptions are discussed in https://github.com/openai/whisper/discussions/478.
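
A rough sketch of the filtering step, to make the idea concrete. It assumes the transcription has already been loaded as a list of segment dicts, each with a `text` string and a `words` list whose entries carry a `word` and a per-word `score` between 0 and 1 (roughly the shape of WhisperX output after alignment, but treat that as an assumption to verify against real output). The function name `low_confidence_segments`, the `threshold` default of 0.5, and the sample data are all made up for illustration; words with no score are skipped here, which may or may not be the right call.

```python
from typing import Any


def low_confidence_segments(
    segments: list[dict[str, Any]], threshold: float = 0.5
) -> list[dict[str, Any]]:
    """Return only the segments that contain at least one low-scored word.

    Assumed input shape (hypothetical, check against real WhisperX output):
    each segment is {"text": str, "words": [{"word": str, "score": float}, ...]}.
    """
    flagged = []
    for seg in segments:
        low_words = [
            w["word"]
            for w in seg.get("words", [])
            # Words without a score are ignored rather than flagged.
            if w.get("score") is not None and w["score"] < threshold
        ]
        if low_words:
            flagged.append({"text": seg.get("text", ""), "low_words": low_words})
    return flagged


# Hypothetical sample data, not real transcription output.
sample = [
    {
        "text": "The weather is nice today.",
        "words": [
            {"word": "The", "score": 0.98},
            {"word": "weather", "score": 0.91},
            {"word": "is", "score": 0.99},
            {"word": "nice", "score": 0.87},
            {"word": "today.", "score": 0.93},
        ],
    },
    {
        "text": "I scream for ice cream.",
        "words": [
            {"word": "I", "score": 0.95},
            {"word": "scream", "score": 0.31},
            {"word": "for", "score": 0.90},
            {"word": "ice", "score": 0.42},
            {"word": "cream."},  # no score available for this word
        ],
    },
]

flagged = low_confidence_segments(sample, threshold=0.5)
for item in flagged:
    print(item["text"], "->", item["low_words"])
```

Only the flagged segments would then be passed on to LLaMa, so the number of LLaMa calls scales with the number of uncertain phrases rather than with the length of the transcript.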