I've had very good results with most of the files I've converted, but I have noticed the OCR seems to be particularly bad in some situations. I've narrowed this down to the specific way that the PGS subtitles are styled and how they are processed.
With files styled in one particular way, the OCR is really inaccurate. I used the `--keep-temp-files` option to inspect the intermediate images. It looks like the text is inverted and placed on a black background, but with the way these particular subtitles are formatted, the result is a mostly black image.
Here is a normal file:
and here is an example of the issue:
The second example has a border (outline) around the text, which seems to be the cause of the issue.
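
To illustrate the kind of preprocessing that might avoid this, here is a minimal sketch. It is not the tool's actual pipeline. It assumes the subtitle bitmap is available as RGBA pixels with a bright fill, a dark outline, and a transparent background, and the function name `binarize_for_ocr` and both thresholds are hypothetical. The idea is to keep only bright, opaque pixels as text so the outline is discarded instead of merging with the background.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (R, G, B, A), each 0-255


def luminance(r: int, g: int, b: int) -> float:
    """Approximate perceived brightness (ITU-R BT.601 weights)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize_for_ocr(
    image: List[List[Pixel]],
    luma_threshold: int = 128,   # assumed cutoff between fill and outline
    alpha_threshold: int = 128,  # assumed cutoff between opaque and transparent
) -> List[List[int]]:
    """Return a grayscale grid: 0 (black) for text fill, 255 (white) elsewhere.

    Dark outline pixels and transparent background pixels both become white,
    so only the glyph fill remains as black text on a white page.
    """
    result = []
    for row in image:
        result.append([
            0 if a >= alpha_threshold and luminance(r, g, b) >= luma_threshold
            else 255
            for (r, g, b, a) in row
        ])
    return result


# One row: transparent, black outline, white fill, black outline, transparent.
sample = [[
    (0, 0, 0, 0),
    (0, 0, 0, 255),
    (255, 255, 255, 255),
    (0, 0, 0, 255),
    (0, 0, 0, 0),
]]
binarized = binarize_for_ocr(sample)
```

In this toy row, only the white fill pixel ends up black. Whether this works on the real files depends on the fill actually being brighter than the outline, which is an assumption based on the description above.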