sumatrapdfreader / sumatrapdf

SumatraPDF reader
http://www.sumatrapdfreader.org
GNU General Public License v3.0

Random display (or not) of Polish characters #3259

Open MarekSzzK opened 1 year ago

MarekSzzK commented 1 year ago

Depending on the content of the text, the same characters of the Polish alphabet are either decoded and displayed correctly or not.

test-1.pdf test-2.pdf test-OK.pdf

GitHubRulesOK commented 1 year ago

It's flawed text that is not correctly embedded; complain to the author.


MarekSzzK commented 1 year ago

Why does Adobe Acrobat Reader work fine and show: PROTOKÓŁ ąśęćółń ĄŚĘĆÓŁŃ?

GitHubRulesOK commented 1 year ago

Possibly the author only wants it to work with Adobe and not with other apps, like 70% of the market, e.g. Edge, where it's worse than what SumatraPDF guesses.

MarekSzzK commented 1 year ago

I don't understand. Foxit Reader displays it correctly, and Sumatra v. 3.0 displays it correctly too! Some internet tools, e.g. https://www.i2pdf.com/, display it correctly as well. Could you write what exactly is wrong, and what the correct file should look like? It works in my Edge too!

GitHubRulesOK commented 1 year ago

Hmm, your Edge is Polish, so it understands badly defined local-language characters. That would be why your machine can show Polish and other regions cannot. But part of the issue will be that the current SumatraPDF does not handle all poor / loose font definitions.
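To illustrate the general idea of a locale-dependent fallback (not what Edge or any other viewer actually does internally, which this thread does not show), here is a minimal Python sketch. It assumes the text bytes are single-byte codes with no usable mapping, and that a viewer falls back to a regional Windows code page: Windows-1250 on a Polish system, Windows-1252 on a Western one. The sample string and variable names are only for illustration.

```python
# Sketch: the same raw bytes read differently under two fallback code pages.
# Assumption: the viewer guesses an encoding from the system locale when the
# font gives it nothing reliable to go on.

original = "PROTOKÓŁ ąśęćółń"

# Raw single-byte codes as a Polish authoring tool might have written them.
raw = original.encode("cp1250")

# A Polish-locale fallback recovers the text.
as_polish = raw.decode("cp1250")

# A Western-locale fallback maps the same bytes to other glyphs.
as_western = raw.decode("cp1252")

print("cp1250:", as_polish)
print("cp1252:", as_western)
print("same text?", as_polish == as_western)
```

If something like this is in play, the file only looks right where the viewer's guess happens to match the author's code page, which would fit the "works on my machine" reports above.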

@kjk reopening, as this is clearly a MuPDF regression: it is not able to correct a bad file, unlike other viewers.

From the log:

ignoring CMap range (251-251) that is outside of the codespace
ignoring CMap range (252-252) that is outside of the codespace
ignoring CMap range (224-224) that is outside of the codespace
ignoring CMap range (225-225) that is outside of the codespace
non-embedded font using identity encoding: Arial
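The "outside of the codespace" warnings can be pictured with a small sketch. This is not MuPDF's code. It only assumes that a CMap declares one or more codespace ranges and that any mapping range not fully contained in one of them is dropped. The codespace `(0, 127)` is a made-up value chosen so that the four codes from the log (224, 225, 251, 252) fall outside it; the real codespace of the attached files is not shown in this thread. The function names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (low, high), inclusive


def in_codespace(low: int, high: int, codespaces: List[Range]) -> bool:
    """Return True if [low, high] lies entirely inside some codespace range."""
    return any(cs_low <= low and high <= cs_high for cs_low, cs_high in codespaces)


def filter_ranges(ranges: List[Range], codespaces: List[Range]):
    """Split mapping ranges into those kept and those ignored."""
    kept, ignored = [], []
    for low, high in ranges:
        if in_codespace(low, high, codespaces):
            kept.append((low, high))
        else:
            ignored.append((low, high))
    return kept, ignored


# Assumed codespace (hypothetical): only codes 0..127 are declared valid.
codespaces = [(0, 127)]

# Mapping ranges: one ordinary Latin range plus the four codes from the log.
ranges = [(65, 90), (251, 251), (252, 252), (224, 224), (225, 225)]

kept, ignored = filter_ranges(ranges, codespaces)
for low, high in ignored:
    print(f"ignoring CMap range ({low}-{high}) that is outside of the codespace")
```

Under these assumptions the mappings for the four high codes are discarded, so the characters that depend on them are left without a usable mapping while plain ASCII text still decodes. That would be consistent with only some characters displaying wrongly.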