Creating a threshold option to convert all colours to either full-black or full-white?

justyn commented 9 years ago

I've found VobSub2SRT to be a great tool, thank you.

However, with the subtitle streams I'm using, there are a lot of repetitive OCR mistakes that seem to be caused by the grey pixels outlining each of the (off-white) characters.

You can see an example image here: https://www.dropbox.com/s/dkbd9fh6hfr26fa/stvoy2x23-en-513-orig.png?dl=0

I've found that if I modify the palette line in the idx file to change the grey colours to black, leaving only one lighter colour, I get fantastically better results.

The images from the modified idx look like this: https://www.dropbox.com/s/p08mk4204717dnx/stvoy2x23-en-513-bw.png?dl=0

I think that in the modified version it is easier to distinguish individual characters.

I'm guessing that this could be a common problem, so I was thinking of a really simple option to add to VobSub2SRT that would perform the step automatically.

The simplest way I can think of would be to specify a threshold value as a parameter, and to convert every colour in the palette above the value to white and every colour below it to black, e.g. --bw-threshold 200.
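To make the idea concrete, here is a rough sketch in Python of what I mean, applied to the palette line of an idx file rather than inside VobSub2SRT itself. It assumes the palette line is a comma-separated list of six-digit hex RGB values, and it measures brightness with a Rec. 601 luma approximation; both the brightness measure and the function names are my own assumptions, not anything that exists in the tool. A colour exactly at the threshold is treated as white here, which is an arbitrary choice.

```python
def luma(rgb_hex: str) -> int:
    """Approximate brightness (0-255) of an 'rrggbb' hex colour (Rec. 601 weights)."""
    r, g, b = (int(rgb_hex[i:i + 2], 16) for i in (0, 2, 4))
    return (299 * r + 587 * g + 114 * b) // 1000


def threshold_palette_line(line: str, threshold: int) -> str:
    """Hypothetical helper: rewrite an idx 'palette:' line as pure black/white.

    Lines that are not palette lines are returned unchanged.
    """
    key, sep, values = line.partition(":")
    if not sep or key.strip() != "palette":
        return line
    colours = [c.strip() for c in values.split(",")]
    # At or above the threshold becomes white, everything else black.
    converted = ["ffffff" if luma(c) >= threshold else "000000" for c in colours]
    return "palette: " + ", ".join(converted)
```

With a threshold of 200, a mid-grey outline colour such as 828282 would become 000000 while an off-white such as f0f0f0 would become ffffff, which is the same change I made by hand.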

1) Does this sound sensible to you? 2) At what point in the code would this make the most sense?

ruediger commented 9 years ago

Sounds like a good idea. I think it would be best to implement this in the actual idx/vobsub loading code. Maybe it is enough to manipulate the palette field of the vobsub_t struct after loading the subtitles. But this is all hidden in the code imported from mplayer (https://github.com/ruediger/VobSub2SRT/blob/master/mplayer/vobsub.c#L595).

ruediger commented 9 years ago

Closed in #43

ruediger / VobSub2SRT

Creating a threshold option to convert all colours to either full-black or full-white? #41