Open pyrotank41 opened 4 months ago
Timestamps as in Chapters? Definitely worth adding - I'll add a v1 version soon.
To use Whisper's built-in SRT writing capability, you can use the following Python code. This example loads a Whisper model, transcribes an audio file, and then uses the get_writer function from whisper.utils to write the transcription results in SRT format to a specified output directory.
import os
import whisper
from whisper.utils import get_writer

def generate_transcript(audio_file):
    # Adjust the model size as needed
    model = whisper.load_model("small.en")
    result = model.transcribe(audio=audio_file, language='en', word_timestamps=True, task="transcribe")
    # The writer opens its file inside output_dir, so make sure the directory exists first
    os.makedirs('./output', exist_ok=True)
    srt_writer = get_writer(output_format='srt', output_dir='./output')
    srt_writer(result, audio_file)
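If you only need the timestamps in memory rather than an .srt file on disk, the dictionary returned by transcribe also carries a "segments" list with "start", "end", and "text" for each segment. Here is a rough sketch of turning that into timestamped lines. The helper names srt_timestamp and segments_to_lines are made up for this example and are not part of Whisper, and the sample dictionary below is hand-written to mimic the shape of a real result.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SRT style)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_lines(result):
    """Build one 'timestamp text' line per segment of a transcribe() result."""
    return [
        f"{srt_timestamp(seg['start'])} {seg['text'].strip()}"
        for seg in result["segments"]
    ]


# Hand-written stand-in for a transcribe() result (assumption: only the
# "start", "end", and "text" keys of each segment are used here).
sample_result = {
    "segments": [
        {"start": 0.0, "end": 4.52, "text": " Dive into the freezing world of cuteness."},
        {"start": 5.86, "end": 9.34, "text": " First, penguins propose with pebbles."},
    ]
}

for line in segments_to_lines(sample_result):
    print(line)
```

With a real model you would pass the return value of model.transcribe(...) instead of sample_result.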
Below is an example output of the SRT writer showing the transcribed text with timestamps and sequence numbers:
1
00:00:00,000 --> 00:00:04,520
Dive into the freezing world of cuteness with these five heartwarming penguin facts.

2
00:00:05,860 --> 00:00:09,340
First, penguins propose with pebbles, truly nature's romantics.

3
00:00:10,120 --> 00:00:12,780
Second, they have knees inside their bodies, hidden by feathers.

4
00:00:13,800 --> 00:00:16,880
Third, penguins can drink seawater thanks to a special gland.

5
00:00:17,920 --> 00:00:20,640
Fourth, baby penguins are called chicks or fluffballs.

6
00:00:21,540 --> 00:00:25,640
Fifth, some species form lifelong partnerships, proving love is universal.

7
00:00:25,640 --> 00:00:32,560
Intrigued by these adorable facts? Smash that like button, share, and subscribe for more delightful discoveries.
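If you want to reuse these timestamps afterwards (for example as rough chapter markers), the SRT text is easy to read back in. Below is a minimal sketch using only the standard library. parse_srt and to_seconds are hypothetical helper names, not part of Whisper or yt, and the parser only handles the simple cue layout shown above (a sequence number, a timing line, then text).

```python
import re

# Matches a timing line such as "00:00:05,860 --> 00:00:09,340".
TIME_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text):
    """Return a list of {'start', 'end', 'text'} dicts, one per cue."""
    lines = [line.strip() for line in text.splitlines()]
    cues = []
    for i, line in enumerate(lines):
        match = TIME_RE.fullmatch(line)
        if not match:
            continue
        fields = match.groups()
        body = []
        for j in range(i + 1, len(lines)):
            nxt = lines[j]
            # A blank line ends the cue.
            if not nxt:
                break
            # So does a sequence number that is followed by a timing line,
            # which covers text where the blank separators were lost.
            if nxt.isdigit() and j + 1 < len(lines) and TIME_RE.fullmatch(lines[j + 1]):
                break
            body.append(nxt)
        cues.append({
            "start": to_seconds(*fields[:4]),
            "end": to_seconds(*fields[4:]),
            "text": " ".join(body),
        })
    return cues


sample = """1
00:00:00,000 --> 00:00:04,520
Dive into the freezing world of cuteness.

2
00:00:05,860 --> 00:00:09,340
First, penguins propose with pebbles.
"""

for cue in parse_srt(sample):
    print(cue["start"], cue["end"], cue["text"])
```

Each cue's start time could then serve as a chapter offset, though how chapters should be grouped is a separate decision.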
Timestamps are important metadata to generate, so I assume you may have already considered this. I checked the --help for yt but didn't find a way to do it. Is it part of the library, or is it planned for future development?