chenxwh / cog-whisper

MIT License
81 stars 28 forks

No option to get output transcriptions in any popular subtitle format #1

Closed kowalcj0 closed 1 year ago

kowalcj0 commented 1 year ago

Hi,

Thank you for making whisper available. It's a great tool. After reading this article https://simonwillison.net/2022/Sep/30/action-transcription/ I decided to give whisper a go. It worked remarkably well. However, there's no way to get the output transcription in srt or vtt format.

whisper does support both formats -> https://github.com/openai/whisper/blob/0b1ba3d46ebf7fe6f953acfd8cad62a4f851b49f/whisper/transcribe.py#L306-L312

It'd be great to have it, so the users could generate timed transcriptions (subtitles) for the videos they'd like to watch.

Here's an example scenario describing the use case:

Given I have an audio track file in any language supported by whisper
When I run the model with an argument to generate timed transcription in "(srt|vtt)" format
And I GET the prediction via API
Then I should see the transcription and additional timed transcription in "(srt|vtt)" format

Thanks

zeke commented 1 year ago

cc @kevin01881 @ronyfadel who have asked about this functionality.

@chenxwh may be able to add support for this, but this is an open-source project so any of you should feel free to fork this repo, implement the changes to add support for subtitle output, and open a PR here.

chenxwh commented 1 year ago

The subtitle downloading option was included in my initial implementation, but Andreas and Ben suggested leaving it out of the openai/whisper demo. I have pushed it to another demo here: https://replicate.com/cjwbw/whisper-downloadable-subtitles

However, .vtt files are not supported by Replicate as a download option, so they are not included.
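Since only .srt is downloadable from that demo, a .vtt file can be produced locally from the .srt. The following is a minimal sketch, assuming well-formed SRT input with `HH:MM:SS,mmm` timings; `srt_to_vtt` is a hypothetical helper name, not part of whisper or Replicate. It relies on two facts about WebVTT: the file starts with a `WEBVTT` header line, and timings use a period rather than a comma before the milliseconds.

```python
import re

# Matches an SRT cue timing line, e.g. "00:02:16,612 --> 00:02:19,376".
_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to WebVTT (hypothetical helper, for illustration only)."""
    # WebVTT files must begin with the "WEBVTT" header followed by a blank line.
    out = ["WEBVTT", ""]
    for line in srt_text.strip().splitlines():
        # Only timing lines are rewritten: the comma before the milliseconds
        # becomes a period. Cue numbers and subtitle text pass through as-is.
        out.append(_TIMING.sub(r"\1.\2 --> \3.\4", line))
    return "\n".join(out) + "\n"
```

Commas inside the subtitle text itself are left untouched, because the pattern only matches whole timing lines.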

kevin01881 commented 1 year ago

Awesome! Any chance there'll be an update sometime that would timestamp the subtitles?

chenxwh commented 1 year ago

The .srt file already has timestamps?

kevin01881 commented 1 year ago

@chenxwh you're correct, I downloaded the txt instead of the SRT, D'OH! Hahah. Thanks, awesome!!

kowalcj0 commented 1 year ago

Thanks @chenxwh for the new demo. I just gave it a go and it worked nicely. However, the timestamps are in the wrong format: 0:02:16.612 --> 0:02:19.376 instead of 00:02:16,612 --> 00:02:19,376. Some media players will simply refuse to use such subtitles.
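For reference, the expected SRT timing format (two-digit hours, comma before the milliseconds) can be sketched as below. This is an illustration only, not the code used by the demo or by whisper; `srt_timestamp` and `segments_to_srt` are hypothetical names, and the segment shape (dicts with `start`, `end` in seconds, and `text`) is an assumption.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    # Work in integer milliseconds to avoid float formatting surprises.
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    # Zero-pad every field; SRT uses a comma before the milliseconds.
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (assumed dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        # Each cue is: sequence number, timing line, text, then a blank line.
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

For example, `srt_timestamp(136.612)` gives `00:02:16,612`, matching the correct form quoted above.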

kowalcj0 commented 1 year ago

It's been fixed in https://github.com/openai/whisper/pull/197 :muscle: :muscle: :muscle: