Open fralapo opened 1 day ago
The raw YouTube transcript comes in chunks with timestamps. I suggest using those to align the punctuated version (the OpenAI output) with the original chunks. That's the approach I'm using in https://www.appblit.com/scribe
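For illustration, here is a minimal sketch of that alignment idea. It is not the code behind the linked tool. It assumes the raw transcript is available as a list of chunks with a start time in seconds, and that the punctuated text keeps most of the original words. The `Chunk` class and `align_words` function are hypothetical names introduced for this example.

```python
import difflib
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    start: float  # chunk start time in seconds (assumed field)
    text: str     # raw, unpunctuated caption text (assumed field)


def _norm(word):
    # Compare words case-insensitively and without punctuation.
    return re.sub(r"[^\w']", "", word).lower()


def align_words(chunks, punctuated):
    """Return (word, start_time) pairs for every word in `punctuated`."""
    raw_words, raw_times = [], []
    for chunk in chunks:
        for w in chunk.text.split():
            raw_words.append(_norm(w))
            raw_times.append(chunk.start)

    new_words = punctuated.split()
    matcher = difflib.SequenceMatcher(
        a=raw_words, b=[_norm(w) for w in new_words], autojunk=False
    )

    # Copy the chunk start time onto every word that matched exactly.
    times = [None] * len(new_words)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            times[block.b + k] = raw_times[block.a + k]

    # Words the model changed or inserted inherit the previous known time.
    last = chunks[0].start if chunks else 0.0
    for i, t in enumerate(times):
        if t is None:
            times[i] = last
        else:
            last = t
    return list(zip(new_words, times))
```

Each word only gets the start time of the chunk it came from, so the resolution is limited to the chunk length. Finer timing would need word-level timestamps from another source.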
Thanks, I will try this method. However, I ran into a problem while transcribing a 30-minute video with the OpenAI transcription API. The request fails because the upload exceeds the maximum content size, with the following error message:
413: Maximum content size limit (26214400) exceeded (26265614 bytes read)
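The limit in that message, 26214400 bytes, is 25 MiB (25 × 1024 × 1024), and the upload was 51214 bytes over it. One possible workaround is to split the audio into parts that each fit under the limit and transcribe them separately. The sketch below only plans the time ranges. It assumes a roughly constant bitrate, so that size scales with duration, and `plan_parts` and the `safety` margin are hypothetical names chosen for this example. The actual cutting would be done with an audio tool of your choice.

```python
import math

MAX_UPLOAD_BYTES = 26_214_400  # limit quoted in the 413 error (25 MiB)


def plan_parts(file_bytes, duration_s, safety=0.9):
    """Return (start_s, end_s) ranges whose estimated size fits the limit.

    Assumes a roughly constant bitrate; `safety` leaves headroom for
    container overhead and bitrate variation (an assumed margin).
    """
    budget = MAX_UPLOAD_BYTES * safety
    n_parts = max(1, math.ceil(file_bytes / budget))
    part_len = duration_s / n_parts
    return [
        (i * part_len, min((i + 1) * part_len, duration_s))
        for i in range(n_parts)
    ]


# The failing upload from the error above: 26265614 bytes, 30 minutes.
parts = plan_parts(26_265_614, 30 * 60.0)
```

When the parts are transcribed separately, each part's start offset has to be added back to any timestamps returned for it. Re-encoding the audio at a lower bitrate before uploading may also bring a file of this size under the limit without splitting.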
I am currently working on improving video transcriptions using the OpenAI API and have successfully integrated a solution that enhances transcription accuracy. I believe that extending this functionality to include subtitle generation would be very useful.
Suggested Enhancement:
Potential Implementation:
Automatic Segmentation: Once the transcription is corrected, a subtitle generation feature could automatically segment the text based on pauses or logical sentence boundaries. Each segment should be assigned a start and end timestamp to align with the spoken dialogue.
Formatting Options: Provide options for exporting the transcription in various subtitle formats, such as:
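As a rough illustration of the segmentation step, the sketch below groups timestamped words into cues, breaking at sentence-ending punctuation, at long pauses, or when a cue gets too long. It assumes word-level start and end times are already available. The `Word` class, the thresholds, and the function names are hypothetical. SRT is used for the export only as one familiar example of a subtitle format, which is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def segment(words, max_pause=0.6, max_chars=80):
    """Group words into (start, end, text) cues.

    A cue ends at sentence punctuation, before a pause longer than
    `max_pause` seconds, or once it reaches `max_chars` characters.
    Both thresholds are assumed defaults, not recommended values.
    """
    cues, current = [], []
    for i, word in enumerate(words):
        current.append(word)
        nxt = words[i + 1] if i + 1 < len(words) else None
        text_len = sum(len(w.text) + 1 for w in current)
        ends_sentence = word.text.endswith((".", "?", "!"))
        long_pause = nxt is not None and nxt.start - word.end > max_pause
        if nxt is None or ends_sentence or long_pause or text_len >= max_chars:
            text = " ".join(w.text for w in current)
            cues.append((current[0].start, current[-1].end, text))
            current = []
    return cues


def srt_time(seconds):
    # SRT timestamps use the form HH:MM:SS,mmm.
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues):
    # Each SRT entry: index, time range, text, then a blank line.
    blocks = []
    for n, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{n}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)
```

Calling `to_srt(segment(words))` on a list of `Word` objects yields text that can be saved as a `.srt` file. Other subtitle formats would only need a different timestamp and block layout in the last two functions.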
Use Case:
The generated subtitles would be helpful for content creators looking to add captions to their videos without requiring additional manual editing. This could significantly reduce the effort involved in captioning, making videos more accessible and enhancing SEO.
Why This Feature Matters:
This feature could leverage existing transcription models with additional logic to generate time-synced captions, perhaps by integrating models that can perform audio segmentation and alignment.
Thank you for considering this suggestion!