mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Barcode rendering (charCode missing) #10702

Open Dewep opened 5 years ago

Dewep commented 5 years ago

Attach (recommended) or Link to PDF file here: 10100934495.pdf

Configuration:

Steps to reproduce the problem:

  1. Open PDF with PDF.js (in Firefox or Chrome)
  2. Open PDF in another PDF reader (works with the PDF viewer of Chrome)

Expected behavior:

[image]

What went wrong?

[image]

Details:

Some characters from my PDF are not rendered by PDF.js (nor by the built-in PDF viewer in my Firefox), but they are displayed by the PDF viewer of Chrome. These characters are parts of a barcode, drawn with the Code128 font (which is embedded in the PDF).

After debugging fonts.js (from master), the problem seems to lie in the charCodeToGlyphId variable when MacRomanEncoding is used. I managed to fix the bug, but since I'm far from being a pro on glyphs or PDF.js, I'm not sure what impact the fix may have.

Here are my modifications: https://gist.github.com/Dewep/69ccb252b97909989e264a3413df7ef1/revisions

What do you think of this fix?

Snuffleupagus commented 5 years ago

> Open PDF in another PDF reader (works with the PDF viewer of Chrome)

Unfortunately that's not a good indication that the font is well-formed though, since this is how Adobe Reader (i.e. the PDF reference implementation) renders the file for me:

[image: reader]

> After debugging fonts.js (from master), the problem seems to lie in the charCodeToGlyphId variable when MacRomanEncoding is used. I managed to fix the bug, but since I'm far from being a pro on glyphs or PDF.js, I'm not sure what impact the fix may have.
>
> Here are my modifications: https://gist.github.com/Dewep/69ccb252b97909989e264a3413df7ef1/revisions

Could you please explain your proposed change, from a PDF/font specification perspective, since it honestly looks a bit "magical" as-is? How to generate reference images and run the test-suite locally is described in https://github.com/mozilla/pdf.js/wiki/Contributing; please keep in mind fonts are a very complex topic, and it's unfortunately easy to introduce regressions.

Dewep commented 5 years ago

> it honestly looks a bit "magical" as-is

As I implied, it is. I don't know much about this area; I debugged the library until I found a place where I could add a "hack". My basic question is therefore whether this hack is a viable one.

But... I thought that it was correct, because:

  1. If I use the font in Chrome (with CSS), it works
  2. It seems odd to me to "replace" the charCode with another one
  3. The replacement, made just before my hack, carries a comment from 5 years ago which says "TODO: the encoding needs to be updated with mac os table."

However...

  1. I don't know anything about fonts, TTF, or the PDF.js core
  2. You are right to consider Acrobat as the reference

I will try to find another TTF font, hoping that it works (ideally one that does not use platform 1 with encoding 0, so that it does not go through this MacRomanEncoding replacement).

Thanks for your quick answer.
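
To check whether a candidate font uses the platform 1 / encoding 0 combination mentioned above, the following minimal sketch lists the platform and encoding IDs declared in a TrueType font's cmap table. It is not code from PDF.js: the function names `listCmapEncodings` and `hasMacRomanCmap` are hypothetical. The byte offsets follow the TrueType table directory and cmap header layout. The sketch assumes a plain single-font file (not a font collection) and does no bounds checking.

```javascript
// Hypothetical helper, not part of PDF.js: returns the (platformId, encodingId)
// pairs declared in the cmap table of a TrueType font held in an ArrayBuffer.
function listCmapEncodings(buffer) {
  const view = new DataView(buffer); // DataView reads big-endian by default
  const numTables = view.getUint16(4); // table count in the 12-byte font header

  for (let i = 0; i < numTables; i++) {
    const record = 12 + i * 16; // each table record is 16 bytes
    const tag = String.fromCharCode(
      view.getUint8(record),
      view.getUint8(record + 1),
      view.getUint8(record + 2),
      view.getUint8(record + 3)
    );
    if (tag !== "cmap") {
      continue;
    }

    const cmapOffset = view.getUint32(record + 8); // table offset within the file
    const numSubtables = view.getUint16(cmapOffset + 2); // follows the version field
    const encodings = [];
    for (let j = 0; j < numSubtables; j++) {
      const entry = cmapOffset + 4 + j * 8; // each encoding record is 8 bytes
      encodings.push({
        platformId: view.getUint16(entry),
        encodingId: view.getUint16(entry + 2),
      });
    }
    return encodings;
  }
  return []; // no cmap table found
}

// True if the font declares a Macintosh (platform 1) Roman (encoding 0) cmap.
function hasMacRomanCmap(encodings) {
  return encodings.some((e) => e.platformId === 1 && e.encodingId === 0);
}
```

In Node.js the input could come from `fs.readFileSync` on a font file, converted to an `ArrayBuffer` first. A font for which `hasMacRomanCmap` returns true may still declare other subtables as well, so this only shows whether the Macintosh Roman case can arise, not which subtable a given renderer will pick.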