mdcollins05 / srt-lang-detect

A tool to detect the language used and rename SRT subtitle files

Fails seemingly randomly "Couldn't open file" #7

Open StaffanBetner opened 2 years ago

StaffanBetner commented 2 years ago

For some files I get the error "Couldn't open file". I guess it has to do with the filename somehow.

Examples that fail:
'Veronica Mars S03E09 Spit & Eggs.srt'
'Veronica Mars S03E10 Show Me the Monkey.srt'
'Zodiac (2007).en.srt'
'Vi är bäst! (2013).sv.srt'
'Spring, Summer, Fall, Winter... and Spring (2003).zh.srt'
'Night Moves (2013).nl.srt'

Working:
'Veronica Mars S03E11 Poughkeepsie, Tramps & Thieves.srt'
'Veronica Mars S03E16 Un-American Graffiti.srt'
'Max Manus Man of War (2008).en.srt'
'Marvel One-Shot- A Funny Thing Happened on the Way to Thor's Hammer (2011).en.srt'
'Lord of War (2005).en.srt'

StaffanBetner commented 2 years ago

I checked the encodings and saw that they differ between the two sets. #5 might solve the problem, but with it I get some weird confidence values: every detection is reported at 100.0%, including languages that are clearly wrong.

Parsing 'Veronica Mars S03E09 Spit & Eggs.srt'...
Filename identified as: Unknown
Subtitles identified as:
English: 100.0%
English: 100.0%
English: 100.0%
English: 100.0%
English: 100.0%
English: 100.0%
English: 100.0%
English: 100.0%
Norwegian: 100.0%
English: 100.0%
English: 100.0%
Serbian: 100.0%
Serbian: 100.0%
Serbian: 100.0%
Serbian: 100.0%
Serbian: 100.0%
Chinese: 100.0%
Greek: 100.0%
Farsi: 100.0%
Greek: 100.0%
Japanese: 100.0%
Chinese: 100.0%
Thai: 100.0%
Greek: 100.0%
Japanese: 100.0%
Greek: 100.0%
Veronica Mars S03E09 Spit & Eggs.en.srt does not exist on disk
Confidence of 100 equal or higher than required value to rename (50)
Would rename 'Veronica Mars S03E09 Spit & Eggs.srt' to 'Veronica Mars S03E09 Spit & Eggs.en.srt'

Encodings for files that fail for the master branch:
'Veronica Mars S03E09 Spit & Eggs.srt': text/plain; charset=iso-8859-1
'Veronica Mars S03E10 Show Me the Monkey.srt': text/plain; charset=iso-8859-1
'Zodiac (2007).en.srt': text/plain; charset=iso-8859-1
'Vi är bäst! (2013).sv.srt': text/plain; charset=iso-8859-1
'Spring, Summer, Fall, Winter... and Spring (2003).zh.srt': text/plain; charset=iso-8859-1 (obviously incorrect though)
'Night Moves (2013).nl.srt': text/plain; charset=iso-8859-1

Working:
'Veronica Mars S03E11 Poughkeepsie, Tramps & Thieves.srt': text/plain; charset=utf-8
'Veronica Mars S03E16 Un-American Graffiti.srt': text/plain; charset=us-ascii
'Max Manus Man of War (2008).en.srt': text/plain; charset=utf-8
'Marvel One-Shot- A Funny Thing Happened on the Way to Thor's Hammer (2011).en.srt': text/plain; charset=utf-8
'Lord of War (2005).en.srt': text/plain; charset=utf-8
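
The split above (every failing file is reported as iso-8859-1, every working file as utf-8 or us-ascii) would be consistent with the file being decoded as UTF-8 when it is opened, although that is an assumption about the tool and not something confirmed in this thread. As a rough sketch of one possible workaround, the hypothetical helper below (`read_subtitle_text` is not part of srt-lang-detect) tries UTF-8 first and falls back to ISO-8859-1:

```python
from pathlib import Path

# Encodings to try, in order. This list is an assumption for illustration;
# "utf-8-sig" accepts UTF-8 with or without a byte order mark.
FALLBACK_ENCODINGS = ("utf-8-sig", "iso-8859-1")


def read_subtitle_text(path, encodings=FALLBACK_ENCODINGS):
    """Return (text, encoding) using the first encoding that decodes cleanly.

    Hypothetical helper, not taken from srt-lang-detect.
    """
    raw = Path(path).read_bytes()
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # Not valid in this encoding; try the next candidate.
            continue
    raise ValueError(f"Could not decode {path!r} with any of {encodings}")
```

Note that ISO-8859-1 maps every possible byte to a character, so the fallback never raises. A file in some other legacy encoding, such as the '.zh.srt' example above, would be read without an error but as the wrong characters, so a fallback like this only addresses the "Couldn't open file" failure, not the accuracy of the detection.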