Closed mardab closed 2 years ago
check file encoding
for some languages default format is most often txt
sounds like there are no timestamps in txt files?
Converting txt to srt is non-trivial: it requires speech recognition.
Even low-quality offline speech recognition (DeepSpeech or STT) will probably work, since you have both the audio and the text.
Similar: https://github.com/abhirooptalasila/AutoSub
Support for post-processing
Let me add: remove ads from subtitles. For example, this lists the ad lines in my collection:
find . -name '*.srt' -print0 | xargs -0 grep -F -h -e .com -e www. | sort | uniq
contact www.OpenSubtitles.org today
contact www.OpenSubtitles.org today
Downloaded From www.AllSubs.org
Download Movie Subtitles Searcher from www.OpenSubtitles.org
FilthyRichFutures.com
Find out @ saveanilluminati.com
Find out @ saveanilluminati.com
<font color="#00ff00"> www.addic7ed.com</font>
-- <font color="#138CE9">www.addic7ed.com</font> --
-- <font color="#138CE9">www.Addic7ed.com</font> --
... <font color="#138CE9">www.Addic7ed.com</font> ...
<font color="#ffff00" size=14>www.moviesubtitles.org</font>
<font color="#ffff00" size=14>www.opensubtitles.org</font>
<font color="#ffff00" size=14>www.tvsubtitles.net</font>
<font color=green>EMail - parminder222536@hotmail.com
Please rate this subtitle at www.osdb.link/xxxx
Preuzeto sa www.titlovi.com
Subtitle by Luis-Subs From subscene.com
to remove all ads from www.OpenSubtitles.org
to remove all ads from www.OpenSubtitles.org
To visit alt.lawndale.com
Trading can. FilthyRichFutures.com
WhoisPaulHaggis.com.
wodurch sämtliche Werbung von www.OpenSubtitles.org entfernt wird
- www.addic7ed.com -
www.addic7ed.com
-- www.Addic7ed.com --
www.addic7ed.com</font>
www.DeeJayAhmed.com
www. forom. com
www.forom.com
-== [ www.OpenSubtitles.com ] ==-
-== [ www.OpenSubtitles.com ] ==-
-= www.OpenSubtitles.org =-
-== [ www.OpenSubtitles.org ] ==-
-== [ www.OpenSubtitles.org ] ==-
www.OpenSubtitles.org
www.OpenSubtitles.org adresinden tüm reklamları kaldırmak için bizi destekleyin ve VIP üye olun.
www. outpost-daria. com
www.outpost-daria.com
www.RegieLive.ro
www.titlovi.com
www.whoisTomDevocht.com.
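A minimal sketch of how such ad removal could work as a post-processing step, assuming the subtitle is already decoded SRT text. The function name `strip_ad_cues` and the marker list `AD_PATTERN` are hypothetical and not part of subdl; the markers are simply the two grep patterns above plus a few site names from the output. A plain pattern match like this can also drop legitimate dialogue that mentions a URL (the `alt.lawndale.com` line above may be one), so a real list would need curating.

```python
import re

# Hypothetical marker list: the grep patterns above plus a few site names
# seen in the output. This is an assumption, not a maintained blocklist.
AD_PATTERN = re.compile(
    r"www\.|\.com\b|\.org\b|opensubtitles|addic7ed", re.IGNORECASE
)


def is_cue(lines):
    """A well-formed SRT cue is: index line, timing line, text line(s)."""
    return len(lines) >= 3 and "-->" in lines[1]


def strip_ad_cues(srt_text: str) -> str:
    """Drop every cue whose text matches AD_PATTERN and renumber the rest."""
    # Normalise line endings, then split on blank lines into cue blocks.
    normalised = srt_text.replace("\r\n", "\n").strip()
    blocks = [block.split("\n") for block in re.split(r"\n\s*\n", normalised)]

    kept = []
    for lines in blocks:
        # Anything that does not look like a cue is left untouched.
        if is_cue(lines) and AD_PATTERN.search(" ".join(lines[2:])):
            continue
        kept.append(lines)

    out = []
    for number, lines in enumerate(kept, start=1):
        if is_cue(lines):
            # Replace the old index so the numbering stays consecutive.
            lines = [str(number)] + lines[1:]
        out.append("\n".join(lines))
    return "\n\n".join(out) + "\n"
```

Dropping the whole cue, rather than only the matching line, avoids leaving behind empty cues that some players still display as a blank flash.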
Without such a basic feature as file encoding handling, this tool is useless for any subtitle language other than English...
@RDKRACZ please test #29
I have tried multiple other solutions, and this is the best one at the moment. My only gripe is that for some languages the default format is most often txt (which some players recognize but never auto-load) with a pre-UTF-8 encoding (which requires a separate tool to correct), and right now, unlike its competitors, subdl has no built-in option to correct these problems.
Also, being able to check the file encoding could help with (automatic) subtitle selection, since newer subtitles, which are more likely to be better, use nothing other than Unicode.
If possible, I'd like to contribute at least preliminary support for post-processing, but before I do, I'd like to know how I should approach it.
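To make the encoding part of the request concrete, here is a rough sketch using only the Python standard library. The names `decode_subtitle`, `convert_to_utf8`, and `CANDIDATE_ENCODINGS` are hypothetical and not existing subdl options. The UTF-8 check is the reliable part, because legacy-encoded text almost never happens to be valid UTF-8. The fallback order is an assumption: single-byte code pages accept nearly any byte sequence, so the first legacy candidate usually wins and the order would have to be chosen per subtitle language (or replaced by a statistical detector).

```python
from pathlib import Path

# Hypothetical fallback order. cp1250 suits Central European text and
# cp1252 Western European text; pick the order per subtitle language.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1250", "cp1252")


def decode_subtitle(raw: bytes, candidates=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            # "utf-8-sig" accepts UTF-8 with or without a byte order mark.
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never fails.
    return raw.decode("latin-1"), "latin-1"


def convert_to_utf8(path: Path) -> str:
    """Rewrite the file at `path` as UTF-8 and return the encoding it had."""
    text, encoding = decode_subtitle(path.read_bytes())
    # Writing bytes keeps the original line endings unchanged.
    path.write_bytes(text.encode("utf-8"))
    return encoding
```

The returned encoding name could also feed the selection idea: a candidate that reports `utf-8-sig` is already Unicode and could be ranked ahead of the others.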