does the YT tool help in creating timestamps?

pyrotank41 commented 4 months ago

Timestamps are important metadata to generate, so I assume you have considered them. I checked the --help for yt but didn't find a way to do it. Is it part of the library, or planned for future dev?

disler commented 3 months ago

Timestamps as in Chapters? Definitely worth adding - I'll add a v1 version soon.

joshyorko commented 3 months ago

You can use Whisper's built-in SRT writing capability with the following Python code. This example loads a Whisper model, transcribes an audio file, and then uses the get_writer function from whisper.utils to write the transcription results in SRT format to a specified output directory.

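import os

import whisper
from whisper.utils import get_writer

def generate_transcript(audio_file, output_dir="./output"):
    # Load the model; adjust the model size as needed.
    model = whisper.load_model("small.en")
    # word_timestamps=True adds per-word timing to each transcribed segment.
    result = model.transcribe(audio=audio_file, language="en", word_timestamps=True, task="transcribe")

    # Make sure the output directory exists before writing.
    os.makedirs(output_dir, exist_ok=True)
    srt_writer = get_writer(output_format="srt", output_dir=output_dir)
    # The output file is named after the audio file, with an .srt extension.
    srt_writer(result, audio_file)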

Below is an example output of the SRT writer showing the transcribed text with timestamps and sequence numbers:

1
00:00:00,000 --> 00:00:04,520
Dive into the freezing world of cuteness with these five heartwarming penguin facts.

2
00:00:05,860 --> 00:00:09,340
First, penguins propose with pebbles, truly nature's romantics.

3
00:00:10,120 --> 00:00:12,780
Second, they have knees inside their bodies, hidden by feathers.

4
00:00:13,800 --> 00:00:16,880
Third, penguins can drink seawater thanks to a special gland.

5
00:00:17,920 --> 00:00:20,640
Fourth, baby penguins are called chicks or fluffballs.

6
00:00:21,540 --> 00:00:25,640
Fifth, some species form lifelong partnerships, proving love is universal.

7
00:00:25,640 --> 00:00:32,560
Intrigued by these adorable facts? Smash that like button, share, and subscribe for more delightful discoveries.
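
If what you want is closer to the "0:00 Some text" lines people paste into a video description, the SRT output above can be post-processed. Here is a minimal sketch, assuming the SRT layout shown above (sequence number, timing line, then text). The helper name srt_to_timestamp_lines is made up for this example and is not part of Whisper or of the yt tool. It emits one line per SRT cue, which is much finer-grained than real chapters, so grouping cues into chapters is left out.

```python
import re

# Matches the start time of an SRT timing line such as
# "00:00:05,860 --> 00:00:09,340".
TIMING = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> ")


def srt_to_timestamp_lines(srt_text):
    """Return one 'M:SS text' line per SRT cue (hypothetical helper)."""
    lines = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        rows = block.strip().splitlines()
        # Expect: sequence number, timing line, then one or more text rows.
        if len(rows) < 3:
            continue
        match = TIMING.match(rows[1])
        if match is None:
            continue
        # Milliseconds are parsed but dropped (the start time is truncated).
        hours, minutes, seconds, _millis = (int(g) for g in match.groups())
        if hours:
            stamp = f"{hours}:{minutes:02d}:{seconds:02d}"
        else:
            stamp = f"{minutes}:{seconds:02d}"
        text = " ".join(row.strip() for row in rows[2:])
        lines.append(f"{stamp} {text}")
    return lines


sample = """1
00:00:00,000 --> 00:00:04,520
Dive into the freezing world of cuteness with these five heartwarming penguin facts.

2
00:00:05,860 --> 00:00:09,340
First, penguins propose with pebbles, truly nature's romantics.
"""

for line in srt_to_timestamp_lines(sample):
    print(line)
```

For the two sample cues this prints lines starting with 0:00 and 0:05. In practice you would read the .srt file written by the Whisper code above and pass its contents in as srt_text.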

disler / indydevtools

does the YT tool help in creating timestamps? #2