Closed by Cate9021 1 year ago
Yup! That should be doable! I'll be releasing a big update in the next few days and will add this in :)
Any update on this? I didn't see the option available (but then again, I'm a noob and my "Instructions" page is broken due to some operator error, lol). I'd love to be able to export SRT captions straight from this awesome app!
I was looking for exactly the same feature, and since it doesn't seem to be included yet, I added it to transcribe.py myself
as follows (just elaborating on the answer by Cate9021 so that it works with the current version).
At the top of transcribe.py, add the following two imports:
import os
from whisper.utils import write_srt, write_vtt, write_txt
Then, in the method def transcribe, add the block between the "new code" markers, right after the existing assignments:
self.text = self.raw_output["text"]
self.language = self.raw_output["language"]
self.segments = self.raw_output["segments"]

## -----------new code start
transcript_basename = self.name + '__' + whisper_model

# save TXT
with open(os.path.join(self.save_dir, transcript_basename + ".txt"), "w", encoding="utf-8") as txt:
    write_txt(self.segments, file=txt)

# save VTT
with open(os.path.join(self.save_dir, transcript_basename + ".vtt"), "w", encoding="utf-8") as vtt:
    write_vtt(self.segments, file=vtt)

# save SRT
with open(os.path.join(self.save_dir, transcript_basename + ".srt"), "w", encoding="utf-8") as srt:
    write_srt(self.segments, file=srt)
## -----------new code end

# Remove token ids from the output
for segment in self.segments:
    del segment["tokens"]
This creates the transcript files (.txt, .vtt and .srt) in the folder local/[Audio/Video Name]
whenever you transcribe an audio or video file. The variable transcript_basename
sets the file name, which you might want to change to suit your preferences (I added the model name because I am evaluating the different models right now).
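For anyone whose installed whisper version does not expose write_srt, here is a minimal, dependency-free sketch of what the SRT step does. It is not whisper's own implementation. The names format_timestamp, write_srt_sketch and transcript_path are hypothetical helpers made up for illustration, and the sketch assumes each segment is a dict with "start", "end" (in seconds) and "text" keys.

```python
import io
import os
from typing import Iterable, TextIO


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    milliseconds = round(seconds * 1000.0)
    hours, milliseconds = divmod(milliseconds, 3_600_000)
    minutes, milliseconds = divmod(milliseconds, 60_000)
    secs, milliseconds = divmod(milliseconds, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{milliseconds:03d}"


def write_srt_sketch(segments: Iterable[dict], file: TextIO) -> None:
    """Write segments as numbered SRT cues (hypothetical helper, not whisper's API)."""
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        # "-->" separates the two timestamps in SRT, so keep it out of the cue text.
        text = segment["text"].strip().replace("-->", "->")
        file.write(f"{index}\n{start} --> {end}\n{text}\n\n")


def transcript_path(save_dir: str, name: str, model_name: str, extension: str) -> str:
    """Build a path like <save_dir>/<name>__<model_name>.<extension>."""
    return os.path.join(save_dir, f"{name}__{model_name}.{extension}")


if __name__ == "__main__":
    # Made-up segments, only to show the output shape.
    demo_segments = [
        {"start": 0.0, "end": 2.5, "text": " Hello world."},
        {"start": 2.5, "end": 5.0, "text": " Second line."},
    ]
    buffer = io.StringIO()
    write_srt_sketch(demo_segments, buffer)
    print(buffer.getvalue())
```

To write to disk instead of a buffer, open transcript_path(...) with encoding="utf-8" and pass the file object in, the same way as the snippet above does with write_srt.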
Apologies for the delay. This is now implemented. Each audio file is saved locally, and transcriptions in all formats are saved to the same directory (#13).
Thanks for developing this cool Whisper UI project. Would it be possible to output the result as .txt, .vtt and .srt?
It's already in whisper/transcribe.py:
for audio_path in args.pop("audio"):
    result = transcribe(model, audio_path, temperature=temperature, **args)