ruediger / VobSub2SRT

Converts VobSub subtitles (.idx/.sub format) into .srt subtitles.
GNU General Public License v3.0

Add --min-width, --min-height to elide spurious subpictures #48

Closed abrasive closed 9 years ago

abrasive commented 9 years ago

A lot of my DVDs seem to have dodgy subpictures with visible size 8x7 or smaller. These have non-character forms and invariably fail OCR.

This patch introduces a configurable minimum width and height, replacing the previous mechanism, which implicitly enforced a 1x1 minimum. The default minimum width is 9 (= 2 characters in any legible font), while the default minimum height stays at 1 (to handle things like "..."). You may prefer a narrower default.
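As a rough illustration of the size check described above, here is a minimal sketch. The names `SizeLimits` and `is_spurious` are hypothetical and not taken from the VobSub2SRT source; only the defaults (width 9, height 1) come from this description.

```cpp
// Hypothetical names; a sketch of the idea, not the actual patch.
struct SizeLimits {
    unsigned min_width = 9;   // default proposed here (= 2 characters)
    unsigned min_height = 1;  // kept at 1 so thin subpictures like "..." pass
};

// Returns true when the visible area of a subpicture is below either
// configured minimum, meaning it should be skipped before OCR is attempted.
bool is_spurious(unsigned width, unsigned height, const SizeLimits &limits) {
    return width < limits.min_width || height < limits.min_height;
}
```

With these defaults, an 8x7 subpicture would be skipped, while a 9x1 one would still be passed to OCR.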

ruediger commented 9 years ago

Why not simply let the OCR fail?

abrasive commented 9 years ago

Fixed the commit to report the correct default and to show the values in the warning.

I think it's important to distinguish between the causes of OCR failures: either there's a subpicture that is meaningless (empty/too small), or there's actual text and something really went wrong. The former is nothing, the latter is data loss. Since VobSub2SRT is so useful in scripts that convert large numbers of titles (which is what I'm doing now), it's useful to know which of the two occurred. And in scripted usage, it's a fantastic tool. Thank you ever so much for publishing & maintaining it!
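To make the distinction concrete, here is a minimal sketch of how the two cases could be separated. The `Outcome` enum and `classify` function are hypothetical, and treating an empty OCR result as a failure is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical outcome categories; illustrative only.
enum class Outcome { Recognized, SkippedTooSmall, OcrFailed };

// Classifies one subpicture given its visible size, the configured minimums,
// and the text returned by OCR (assumed empty when OCR found nothing).
Outcome classify(unsigned width, unsigned height,
                 unsigned min_width, unsigned min_height,
                 const std::string &ocr_text) {
    if (width < min_width || height < min_height) {
        return Outcome::SkippedTooSmall;  // meaningless subpicture: nothing lost
    }
    if (ocr_text.empty()) {
        return Outcome::OcrFailed;        // text was likely present: data loss
    }
    return Outcome::Recognized;
}
```

A batch script could then ignore `SkippedTooSmall` and flag only `OcrFailed` for manual review.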

ruediger commented 9 years ago

Sounds reasonable.