hongnk closed this issue 4 years ago
Wingdings is a non-standard font, and in order for such fonts to render (and copy) correctly they need to be embedded in the PDF file.
For reference: When opening an issue, please make sure that you provide all of the information requested in https://github.com/mozilla/pdf.js/blob/master/.github/ISSUE_TEMPLATE.md
@Snuffleupagus Thanks, I'm aware of the Wingdings font issue. But what I found here is that the character itself is wrong; it is not a font display problem.
In another test, I opened the file in other applications where the Wingdings font is not supported. The character is shown as ò [correct character] in Drawboard PDF, but as Ú [incorrect] in Firefox (which is based on pdf.js).
[Updated with more screenshots] Drawboard PDF/Chrome browser/Adobe Reader: rendered as ò
Firefox/pdf.js: rendered as Ú
How about first embedding the font, and then looking again whether text extraction works?
@THausherr Unfortunately I am unable to get the source file to try that (this file is the original with all other content deleted, leaving only the symbol). I tried creating a new file with the same Wingdings symbol in MS Word and exporting it to PDF, but pdf.js displays that symbol correctly even without font embedding. So I still wonder why other PDF viewers can read the symbol correctly for this particular file, but pdf.js cannot.
I found the cause in file pdf.js-dist/lib/core/evaluator.js line 1577:
Font 'Wingdings-Regular' is a symbolic font, and it is assigned encoding.MacRomanEncoding by default, while the correct encoding should be WinAnsiEncoding.
So character ò (charcode 242) becomes Ú (U acute, charcode 218).
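To illustrate the mismatch outside of pdf.js, here is a minimal sketch in Python. It assumes that Python's built-in `cp1252` and `mac_roman` codecs are acceptable stand-ins for the PDF WinAnsiEncoding and MacRomanEncoding tables; they are not identical to the PDF tables in general, but as far as I can tell they agree for code 242. This is not how pdf.js does the lookup, only a way to see the two readings of the same byte.

```python
# Byte value of the character as reported above (charcode 242).
CHAR_CODE = 242

raw = bytes([CHAR_CODE])

# Assumption: "cp1252" stands in for WinAnsiEncoding and "mac_roman" for
# MacRomanEncoding. These are Python codecs, not the pdf.js encoding tables.
as_win_ansi = raw.decode("cp1252")
as_mac_roman = raw.decode("mac_roman")

# Show each reading with its Unicode code point in hex and decimal.
print(f"WinAnsi:  {as_win_ansi!r} (U+{ord(as_win_ansi):04X}, {ord(as_win_ansi)})")
print(f"MacRoman: {as_mac_roman!r} (U+{ord(as_mac_roman):04X}, {ord(as_mac_roman)})")
```

The same byte gives ò under the Windows table and Ú (code point 218) under the Mac table, which matches what the other viewers and pdf.js show respectively.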
Any chance this will get fixed?
Thanks for your update. I tested it and noticed that the character has changed, as it now uses the ZapfDingbats font, although it still shows an incorrect symbol.
I'm not sure why the decision was made to map Wingdings to ZapfDingbats. Why not just leave the native encoding as it is, so that it either appears as garbage (raw character codes) or displays the correct symbol if the font exists (on Windows)? I believe that is how native PDF viewers (such as Acrobat Reader) display it, and it is satisfactory for users.
This file has a character (symbol) that pdf.js renders incorrectly, while most PDF readers render it correctly: error2.pdf
Configuration:
Web browser and its version: Chrome, latest version
Operating system and its version: Windows 10, latest version
PDF.js version: online
Is a browser extension: no
Steps to reproduce the problem:
Correct character: ò
pdf.js renders it as: Ú
(The screenshot below shows these as Wingdings symbols; the characters given here are what you get when copying and pasting into a text editor.)
What is the expected behavior? (add screenshot)
What went wrong? (add screenshot)