Yes, the models are exactly the same. (whisper_timestamped simply imports the load_audio and load_model functions from whisper, so they behave identically.)
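For illustration, here is a minimal sketch showing that the same loaders are available under both packages (the import alias follows the whisper_timestamped README; the model name and file path are placeholders):

import whisper_timestamped as whisper

# load_model and load_audio are the same functions as in openai-whisper,
# re-exported by whisper_timestamped
model = whisper.load_model("tiny")
audio = whisper.load_audio("file.wav")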
To write an SRT file, you can do the following (if you are using the latest version of whisper_timestamped):
import whisper_timestamped as whisper
from whisper_timestamped.make_subtitles import write_srt

result = whisper.transcribe_timestamped(...)
with open("file.srt", "w", encoding="utf8") as f:
    write_srt(result["segments"], f)
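As a rough illustration (the timing and text below are made up), each segment becomes one numbered SRT cue:

1
00:00:00,000 --> 00:00:02,500
Hello world.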
Well, my previous answer concerning SRT was about writing an SRT file for segments. If you want to do it for words (not segments), you can do:
def flatten(list_of_lists, key=None):
    # Yield items from each sublist; if key is given, each element is a dict
    # and the items are taken from the list stored under that key (e.g. "words")
    for sublist in list_of_lists:
        for item in (sublist.get(key, []) if key else sublist):
            yield item

with open("file.words.srt", "w", encoding="utf8") as f:
    write_srt(flatten(result["segments"], "words"), f)
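For clarity, here is a toy example of what flatten yields (the segment data is made up for illustration; real word dicts in the output carry additional keys):

segments = [
    {"words": [{"text": "Hello", "start": 0.0, "end": 0.4}]},
    {"words": [{"text": "world", "start": 0.5, "end": 0.9}]},
]
print(list(flatten(segments, "words")))
# [{'text': 'Hello', 'start': 0.0, 'end': 0.4}, {'text': 'world', 'start': 0.5, 'end': 0.9}]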
And note that you can write both SRT files using the CLI:
python whisper_timestamped/transcribe.py file.wav --model tiny --output_dir . --output_format srt
(the files will be named ./file.wav.srt and ./file.wav.words.srt)
Thank you for the detailed information.
Thank you for the code. I want to read the word-level timestamp results in SRT format. How can I do that? Also, are the model files (tiny, large-v2, large, etc.) you are using the same as the model files in the original code at https://github.com/openai/whisper?