ratoaq2 / pgsrip

Rip your PGS subtitles
MIT License

Problems with stylized PGS subs on some files #72

Open outlyer opened 7 months ago

outlyer commented 7 months ago

I've had very good results with most of the files I've converted, but I have noticed that the OCR is particularly bad in some situations. I've narrowed this down to the way certain PGS subtitles are styled and how they are processed.

With one particular style, the OCR is really inaccurate. I used --keep-temp-files to look at the intermediate images, and it looks like the text is inverted and placed on a black background. With the way these particular subtitles are formatted, the result is a mostly black image.

Here is a normal file:

[Image: english srt-1539-psm6-NEURAL-65]

and here is an example of the issue:

[Image: Blue Collar example]

The second example has a border around the text, which seems to be the cause of the issue.
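
To make the idea concrete, here is a rough sketch of one way the border could be dropped before OCR. This is not taken from pgsrip's code: the function names, the RGBA pixel layout (a 2D list of `(r, g, b, a)` tuples), and the alpha cutoff are all assumptions for illustration. It guesses the border colour as the most common colour among opaque pixels that touch a transparent pixel, keeps only the remaining opaque pixels as text, and writes them as black on white. Anti-aliased edges and multi-colour styles are ignored, so it would likely need tuning on real files.

```python
from collections import Counter

# Assumption: alpha 0 means "no subtitle pixel"; anything above counts as opaque.
TRANSPARENT = 0


def is_opaque(px):
    """True if the RGBA pixel is part of the rendered subtitle."""
    return px[3] > TRANSPARENT


def guess_border_color(pixels):
    """Most common RGB among opaque pixels that touch a transparent pixel
    or the image edge (4-neighbourhood). Returns None for an empty image."""
    height, width = len(pixels), len(pixels[0])
    counts = Counter()
    for y in range(height):
        for x in range(width):
            if not is_opaque(pixels[y][x]):
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or not is_opaque(pixels[ny][nx]):
                    counts[pixels[y][x][:3]] += 1
                    break  # count each edge pixel once
    return counts.most_common(1)[0][0] if counts else None


def to_ocr_bitmap(pixels):
    """Return a 2D list of grey values: 0 for text, 255 for background."""
    border = guess_border_color(pixels)
    text = [[is_opaque(px) and px[:3] != border for px in row] for row in pixels]
    if not any(any(row) for row in text):
        # No separate border colour found: treat every opaque pixel as text.
        text = [[is_opaque(px) for px in row] for row in pixels]
    return [[0 if is_text else 255 for is_text in row] for row in text]
```

For a glyph with a black fill and a white border, this keeps only the black interior, so the output is a thin black glyph on white instead of a mostly black image.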

ratoaq2 commented 4 months ago

Subtitles can have so many different styles. I can't think of a solution that fits them all, including this case.