SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

Alternative way to generate time codes! #8547

Closed. ihanulusoy closed this issue 2 months ago

ihanulusoy commented 3 months ago

Hi,

Whisper's timecoding isn't perfect. Could SE combine a plain-text transcript with Whisper's timecodes?

For example, this is my plain-text sentence: (hello first title, and second title, and third title)

Suppose the auto-timecode option and Whisper's audio-to-text give me subtitles like this:

00:01:10 - 00:01:20 hello first title
00:01:20 - 00:01:30 hello second title
00:01:30 - 00:01:40 hello third title

So, after generating audio-to-text with Whisper, could SE take my sentence (which I provide as plain text, using the one-sentence-per-line option), match it against the Whisper timecodes, and work out that my sentence should start at 00:01:10 and end at 00:01:40?

Please help, and ask me for more explanation if you need it; I'm not sure I explained what I have in mind correctly.

darnn commented 3 months ago

Yes, you should indeed explain more thoroughly. What exactly does your input look like? Is it a text file? An srt file with accurate timing? An srt file with blank timecodes? I understand you're combining that with an srt file generated by Whisper, but what do you expect the result to be? What kind of file and what should be in it?

My only guess is that you have an accurate transcript without timecodes, so for example a txt file that looks like this:

Hello first title.
Hello second title.
Hello third title.

And then a transcript generated by Whisper, which has timecodes but is itself less accurate, so something like this:

1
00:00:00,000 --> 00:00:01,000
Hallow first title.

2
00:00:01,024 --> 00:00:02,024
Hello second tight tale.

3
00:00:02,048 --> 00:00:03,048
Hell oat herd title.

If this is the case, what you need is this: https://github.com/EtienneAb3d/WhisperTimeSync/blob/main/distrib/WhisperTimeSync.jar

Download that file from there. Instructions on how to use it are here: https://github.com/EtienneAb3d/WhisperTimeSync/

Note that you only need the Synchronize step, but you should read through them all to understand what the other parts are.
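
For anyone who wants to experiment with Whisper output like the SRT sample above outside SE, here is a minimal Python sketch that reads SRT text into cues with millisecond timestamps. It is not part of SE or WhisperTimeSync; `Cue` and `parse_srt` are hypothetical names, and the parser assumes well-formed SRT blocks separated by blank lines.

```python
import re
from dataclasses import dataclass

# Matches a timing line such as "00:00:01,024 --> 00:00:02,024".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to milliseconds."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(srt_text):
    """Parse SRT text into a list of Cue objects (hypothetical helper)."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cue blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                fields = [int(g) for g in match.groups()]
                start = to_ms(*fields[:4])
                end = to_ms(*fields[4:])
                # Everything after the timing line is the cue text.
                text = " ".join(lines[i + 1:]).strip()
                cues.append(Cue(start, end, text))
                break
    return cues
```

Timestamps are kept as integer milliseconds so that comparisons and later arithmetic stay exact.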

ihanulusoy commented 3 months ago

OK, sorry for my poor English. I want to translate some movies and TV series, and the hardest part of translating is that if the sentences in the source SRT are not complete sentences, the translation comes out badly in some languages. With some apps I can very quickly extract an SRT file to plain text and edit it so that each sentence is on a single line. As you can see in the screenshot I'm sending, the Whisper-generated subtitles don't have a full sentence on each line, which is very important for translating correctly. I even tried Whisper advanced (sentences), but the result is the same. So I have to watch the whole video to split each text or merge it with the previous or next one, and that takes a long time.

I mean, after the Whisper generation your software already knows the timecode for "german industry is now so short of aluminium that they start to", and it also knows the timecode for "salvaging it from crashed american bombers"!

So after the Whisper auto-generation, I want to give the edited plain text to SE to generate the timecodes. When SE reaches a line in my plain text with the content "german industry is now so short of aluminium that they start to salvaging it from crashed american bombers", it should work out the start and end points of the timecode like this: 00:30:42.590 - 00:30:49.970
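
The behaviour requested here can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the edited plain text contains exactly the same words, in the same order, as the Whisper cues, just re-segmented into full sentences. `Cue`, `retime_sentences` and `fmt` are hypothetical names, not SE features, and the boundary between the two cues (00:30:46.500) is made up for the example because the thread does not give it.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def retime_sentences(cues, sentences):
    """Give each sentence the start of the first cue it touches and the
    end of the last cue it touches. Assumes identical words in both inputs."""
    # One entry per word, recording the index of the cue that word came from.
    word_owner = [i for i, cue in enumerate(cues) for _ in cue.text.split()]
    result = []
    pos = 0
    for sentence in sentences:
        count = len(sentence.split())
        if count == 0:
            continue  # skip empty lines in the plain text
        if pos + count > len(word_owner):
            raise ValueError("plain text has more words than the cues")
        first = cues[word_owner[pos]]
        last = cues[word_owner[pos + count - 1]]
        result.append(Cue(first.start_ms, last.end_ms, sentence))
        pos += count
    return result


def fmt(ms):
    """Format milliseconds as HH:MM:SS.mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


whisper_cues = [
    # The 00:30:46.500 boundary is a hypothetical value.
    Cue(1_842_590, 1_846_500,
        "german industry is now so short of aluminium that they start to"),
    Cue(1_846_500, 1_849_970, "salvaging it from crashed american bombers"),
]
plain_text = [
    "german industry is now so short of aluminium that they start to "
    "salvaging it from crashed american bombers",
]
for cue in retime_sentences(whisper_cues, plain_text):
    print(f"{fmt(cue.start_ms)} - {fmt(cue.end_ms)} {cue.text}")
```

If a sentence begins or ends in the middle of a cue, this sketch still uses that cue's full start or end time, so the result is only as precise as the original cue boundaries. If the text has been edited beyond re-segmentation, the word counts no longer line up and a fuzzy matcher would be needed instead.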

[Image: Screenshot 2024-06-18 at 15 36 52]

darnn commented 3 months ago

Ah, I think I understand now. Sadly, I don't know of a way to do this in SE (or at all). What I do in these situations is this:

I take my timed subtitle file and create a text file made up of complete sentences (you can do this using File->Export->Plain text). I translate that, and then import the text into SE again (using File->Import->Plain text). During the import process, you can choose how long to make your lines and so on, so as to match the general length of the original file with the timecodes. Then I open the original file again, and use Tools->Make new empty translation from current subtitle. Now I copy the contents of the imported translated file, and paste them into the empty translation file using Column paste.

You can use Column paste either by setting up a keyboard shortcut, or by right-clicking the first subtitle, then Column and then Paste from clipboard.

At this point you'll still have to go and correct things by hand, but it's easier than any other method I know of. I recommend setting up keyboard shortcuts for the following, and learning to use them:

List view->Column, delete text and shift up
List view->Column, insert text
Text box->Move last word to next subtitle
Text box->Fetch first word from next subtitle
Text box->Move last word from first line down (current subtitle)
Text box->Move first word from next line up (current subtitle)
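
As a rough automated first pass at the "match the general length of the original file" step above, one could split a translated sentence across the original cues in proportion to the length of each cue's text. The Python sketch below shows the idea; `distribute_words` is a hypothetical helper, not something SE provides, and the output would still need the manual word-by-word corrections described above.

```python
def distribute_words(translated, original_cue_texts):
    """Split `translated` into one chunk per original cue, with word counts
    roughly proportional to the character length of each original cue."""
    words = translated.split()
    lengths = [len(text) for text in original_cue_texts]
    total = sum(lengths)
    if total == 0:
        raise ValueError("original cues contain no text")
    chunks = []
    taken = 0        # number of translated words already assigned
    cumulative = 0   # characters of original text covered so far
    for length in lengths:
        cumulative += length
        # Rounded share of the words that should be used up by this point.
        target = (len(words) * cumulative + total // 2) // total
        chunks.append(" ".join(words[taken:target]))
        taken = target
    return chunks
```

For example, `distribute_words("a b c d e f", ["xxxx", "xx"])` puts four words in the first cue and two in the second. Splitting by proportion ignores grammar and line-break rules, so it only reduces the amount of shifting left to do by hand.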

ihanulusoy commented 3 months ago

Thank you for your help, but it still needs manual work to fix. If you find an automatic way, please share it here. Actually, it should be easy for the SE developers to add this feature.

JDTR75 commented 3 months ago

Have you tried Purfview's Whisper version? It has a --sentence option that should do what you're needing.

ihanulusoy commented 3 months ago

> Have you tried Purfview's Whisper version? It has a --sentence option that should do what you're needing.

Yes, but the result is not perfect.