SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

Automatic timing like Youtube? #2718

Closed: tuanbs closed this issue 6 months ago

tuanbs commented 6 years ago

Hello there, I'd like to know if SE has an "Automatic timing" feature like YouTube's: if we provide a transcript of the words spoken in the video, can SE automatically match them to when they are spoken? Thank you.

LeonCheung commented 6 years ago

I think this matches the last item on my old wish list (#1968), too. I bet it's an extremely difficult one. :)

gabriellluz commented 6 years ago

There's actually an open-source engine for that, but it doesn't generate subtitles: it only extracts the text, and not all languages are recognized.

gabriellluz commented 6 years ago

Found it: CMU Sphinx https://cmusphinx.github.io/wiki/download/

Basically, SE would have to transcribe the audio with CMU Sphinx in one of its supported languages (the so-called "models") and then generate the subtitles.

Available models, as of 01/29/2018: Greek, Indian English, German, French, Dutch, US English, Spanish, Italian, Mandarin, Hindi, Kazakh and Russian.
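
The "transcribe, then generate the subtitles" step could look roughly like the sketch below. It assumes the recognizer can report a start and end time (in seconds) for each word; the `Word` tuple, the `words_to_srt` function, and the `max_chars` / `max_gap` limits are illustrative names and values, not part of CMU Sphinx or SE.

```python
from typing import List, Tuple

# Assumed input shape: (text, start_seconds, end_seconds) for each recognized word.
Word = Tuple[str, float, float]


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_chars: int = 42, max_gap: float = 0.8) -> str:
    """Group timed words into cues and render them as SRT text.

    A new cue starts when the line would exceed max_chars or when the
    pause before the next word is longer than max_gap seconds.
    """
    cues: List[List[Word]] = []
    for word in words:
        if cues:
            current = cues[-1]
            text_len = len(" ".join(w[0] for w in current)) + 1 + len(word[0])
            gap = word[1] - current[-1][2]
            if text_len <= max_chars and gap <= max_gap:
                current.append(word)
                continue
        cues.append([word])

    blocks = []
    for number, cue in enumerate(cues, start=1):
        text = " ".join(w[0] for w in cue)
        blocks.append(f"{number}\n{srt_time(cue[0][1])} --> {srt_time(cue[-1][2])}\n{text}\n")
    return "\n".join(blocks)
```

For example, `words_to_srt([("hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 3.0, 3.5)])` yields two cues, because the pause before "again" exceeds `max_gap`.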

niksedk commented 6 years ago

PocketSphinx looks interesting. Has anyone tried it? How do you do command-line recognition from a .wav file?
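
One possible answer, as a hedged sketch: older PocketSphinx releases (the 5prealpha series) ship a `pocketsphinx_continuous` binary that can read a file via `-infile` and print word times via `-time yes`. Treat the binary name and both flags as assumptions to verify against the installed version; newer releases may differ. The audio is first converted with ffmpeg to 16 kHz mono 16-bit PCM, which the default English model expects. The helper names below are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_to_wav_cmd(media: str, wav: str) -> List[str]:
    """Build an ffmpeg command that extracts 16 kHz mono 16-bit PCM audio."""
    return ["ffmpeg", "-y", "-i", media, "-vn",
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wav]


def pocketsphinx_cmd(wav: str) -> List[str]:
    """Build the recognizer command.

    Assumption: `pocketsphinx_continuous` (5prealpha-era binary) is on PATH
    and supports `-infile` and `-time yes`.
    """
    return ["pocketsphinx_continuous", "-infile", wav, "-time", "yes"]


def recognize(media: str) -> str:
    """Convert the media file to WAV, run the recognizer, return its stdout."""
    # Use a distinct suffix so an input that is already .wav is not overwritten.
    wav = str(Path(media).with_suffix(".16k.wav"))
    subprocess.run(ffmpeg_to_wav_cmd(media, wav), check=True)
    result = subprocess.run(pocketsphinx_cmd(wav), check=True,
                            capture_output=True, text=True)
    return result.stdout
```

Calling `recognize("talk.mp4")` would return whatever the recognizer prints; the exact output layout depends on the PocketSphinx version, so it is not parsed here.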

chenlung commented 6 years ago

YouTube's automatic transcription is a nice feature (it saves time and effort, especially if you are resource-limited), but like its subtitle support section (tips, instructions, etc.), I feel it could easily be better, and for such a large company they are taking a long time. I wish they would apply some heuristics (typical conventions/guidelines), such as starting a new sentence on a new line (which would be great to have here, too), as that makes the text easier for the viewer to absorb (even if no one has verified them).

gabriellluz commented 6 years ago

I really like YouTube's method, and there's even a tool on GitHub that performs auto subtitling (with timing), but this is probably against their ToS.

I'll test PocketSphinx this weekend (I'll have to compile it) and tell you how it works. Basically, it takes an audio file (probably WAV) and outputs text, but I don't think there's timing.

gabriellluz commented 6 years ago

Wow. Found this tool: https://github.com/saurabhshri/CCAligner

Seems to align subtitles with audio. Check it out, guys. I only use my personal computer at weekends, I'm at work now.
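
To illustrate the general idea of matching a known transcript to spoken audio, here is a minimal sketch. It assumes a recognizer has already produced words with start and end times, and it copies those times onto the transcript words that match. This is not necessarily how CCAligner or YouTube work; the `transfer_timings` function and its input shapes are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

TimedWord = Tuple[str, Optional[float], Optional[float]]


def transfer_timings(
    transcript: List[str],
    recognized: List[Tuple[str, float, float]],
) -> List[TimedWord]:
    """Copy timings from recognized words onto matching transcript words.

    Words the recognizer got wrong (or missed) keep None for both times,
    so a later step can interpolate or flag them for manual correction.
    """
    ref_words = [w.lower() for w in transcript]
    rec_words = [w.lower() for w, _, _ in recognized]
    timed: List[TimedWord] = [(w, None, None) for w in transcript]

    matcher = SequenceMatcher(a=ref_words, b=rec_words, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start, end = recognized[block.b + k]
            timed[block.a + k] = (transcript[block.a + k], start, end)
    return timed
```

The transcript text is kept as the source of truth; the recognizer output is only used as a timing reference, so recognition errors affect timing coverage rather than the subtitle text.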

gabriellluz commented 6 years ago

Also, if you really want to use YouTube's feature in order to grab any video file (mkv, avi, mp4) and create automatic subtitles using YouTube's engine, here you go: https://github.com/agermanidis/autosub

I've been using it for a long time. You'll get exactly the same subtitles as if you uploaded your video to YouTube. The only difference is that this tool is 300% quicker. It works on Windows, Linux and Mac; I've tested it.

niksedk commented 6 years ago

I've tried to add raw pocketsphinx import... not sure if it will work for you, but please try: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.5.5/SubtitleEditBetaPocketSphinx.zip

You need to set a path to ffmpeg in options -> settings -> waveform first... The menu item should be in: [image]

Oh, and English only.

LeonCheung commented 6 years ago

Oh, that's really a great feature! It works as tested. However, the accuracy doesn't seem comparable with YouTube's so far, which I believe is down to PocketSphinx rather than SE?

MaximPro commented 6 years ago

Great update.

Also, regarding the YouTube method: please finally add this to the next update as well:

https://github.com/SubtitleEdit/subtitleedit/issues/2150

This could be implemented as well, to quickly correct YouTube's errors. It basically shows where the auto-recognition failed and displays those lines in grey, so you can correct them quickly.

gabriellluz commented 6 years ago

OH MY GOD. This is the f*****g best subtitle editor IN THE WORLD.

HassanAlgoz commented 4 years ago

You can use the auditok command-line tool for automatic timing. It would be nice if this were integrated into the project. Link: https://github.com/amsehili/auditok
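
For context, timing by audio activity detection can be sketched as follows: compute an energy value per short audio frame, then treat runs of frames above a threshold as speech regions. This is only an illustration of energy thresholding and does not use auditok's API; `active_regions` and its parameters are hypothetical names.

```python
from typing import List, Optional, Tuple


def active_regions(
    energies: List[float],
    frame_seconds: float,
    threshold: float,
    min_silence_frames: int = 3,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of regions with energy >= threshold.

    A region is closed only after min_silence_frames consecutive quiet
    frames, so short pauses inside a sentence do not split it.
    """
    regions: List[Tuple[float, float]] = []
    start: Optional[int] = None  # index of the first active frame in the open region
    silent_run = 0               # consecutive quiet frames seen since the last active one

    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                end = i - silent_run + 1  # index just past the last active frame
                regions.append((start * frame_seconds, end * frame_seconds))
                start = None
                silent_run = 0

    if start is not None:
        # Audio ended while a region was open; trim any trailing quiet frames.
        end = len(energies) - silent_run
        regions.append((start * frame_seconds, end * frame_seconds))
    return regions
```

Each returned region could become an empty subtitle cue with start and end times already set, leaving only the text to be typed in.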