foliojs / fontkit

An advanced font engine for Node and the browser
1.46k stars 219 forks source link

Fix getting the fd index for OTF/CFF CID fonts #207

Closed blikblum closed 4 years ago

blikblum commented 4 years ago

There's a logic error in the binary search that looks for fd index in CFFFont#fdForGlyph leading to erroneous result when gid is exactly equal to range.first.

To see the bug load a CID font like NotoSansCJKkr-Regular.otf in https://fontkit-demo.now.sh/ and try to draw character ":" (gid 27)

Its possible to optimize the binary search a bit but will let this for later, just get the minimal change to fix the bug

Harbs commented 4 years ago

I think this fix is functionally equivalent to pull #168.

You can probably either use that fix, or close the PR. FWIW, I think my fix is slightly more efficient (bails slightly earlier).

blikblum commented 4 years ago

You can probably either use that fix, or close the PR. FWIW, I think my fix is slightly more efficient (bails slightly earlier).

Yes. I used the minimal change needed approach to fix it. One advantage of this PR is that contain tests but i agree that your fix is (probably) better performance wise.

I suggest you to rebase your PR on top of current master and run the tests to ensure all clear

ghislain-sc commented 4 years ago

I tried implementing this fix into the standalone version of PDF kit, the problem still appears with Noto Sans SC and Chinese punctuation marks.

The glyphs.js doesn't seem to be included in the standalone version, do I need to include it?

blikblum commented 4 years ago

I tried implementing this fix into the standalone version of PDF kit, the problem still appears with Noto Sans SC and Chinese punctuation marks.

Can you try https://www.npmjs.com/package/pdfkit-next ? It comes with this fix

ghislain-sc commented 4 years ago

I looked into that earlier today, but I wasn't able to compile the next package into a standalone version. (I tried locally on Mac).

I'll try again.

blikblum commented 4 years ago

Try https://unpkg.com/pdfkit-next@0.10.0/js/pdfkit.standalone.js

ghislain-sc commented 4 years ago

I just tried, using the mac terminal to install "pdfkit-next", then uploading the "pdfkit.standalone.js" located in the "pdfkit-next" folder. But no improvements.

ghislain-sc commented 4 years ago

Try https://unpkg.com/pdfkit-next@0.10.0/js/pdfkit.standalone.js

Thank you. Just tried it and no improvements.

blikblum commented 4 years ago

What font are you using? What text do you get an error?

blikblum commented 4 years ago

This repl it shows correct output with pdfkit-next: https://repl.it/@blikblum/pdfkit-notosans

When using published pdfkit package, there are errors

ghislain-sc commented 4 years ago

Hi, sorry for the long delay.

To answer your question: I was using Noto Sans SC. I tried all font weights, tried unhinted, hinted, tried converting it to TTF, use OTF. Also tried Adobe Source Han Sans (which is the exact same font) available in OTF and TTF...

The problem always appears when the text include punctuation, wether Chinese or English. The problem might occur before the punctuation or after. When rendering a paragraph without punctuation, there is no problem. Example of punctuation creating problems: ()" ' , ; : 。,:“ ‘ ()「」...

After several attempts I had settled for a half solution, back in January: I used Noto for all Chinese characters and switched to PingFang for Chinese punctuation. In order to do so, I checked each character with .match(/[\x20-\x7E]/) and .match(/[\u3400-\u9FBF]/).

This solution was less than ideal, it meant a lot of processing for every single paragraph. Even the font switch wasn't that noticeable since I used Pingfang only for special characters, it was still disturbing enough. Finally the PDFs generated were fine in the browser and most preview apps, but appeared broken when open with Adobe Acrobat. So we just stopped.

In the end we just gave up the idea of generating PDFs from our website. But today I noticed the comments and so I gave a look at the repl it. Actually the problem still shows with the Noto font PDF. As you can see the second line should read "产品名:", but it shows only "产 :"

The mistake appears far enough to be too noticeable, which is probably why you thought it was fixed. Then I ran the repl it with a complete paragraph, including more special characters. And the problem is still definitely here.

We don't have immediate use for this anymore, but I believe other people might. I'll try to check this thread more often, let me know if you need more details. Thank you.

ghislain-sc commented 4 years ago

Hi, for anyone looking to use this specific font Noto Sans SC / TC with PDF Kit, there is actually a workaround. Google's Noto Sans SC and Adobe's Source Han SC are actually the same font, made by the same designer. And there is a TTF version of Adobe's version here: https://github.com/Pal3love/Source-Han-TrueType

Once using this file, everything works well.