foliojs / fontkit

An advanced font engine for Node and the browser
1.46k stars 219 forks source link

Fix codePointsForGlyph() for values >= 0xffff #338

Open lehni opened 2 months ago

lehni commented 2 months ago

At Lineto we have a typeface where the following code does not retrieve all code-points correctly. Sadly, we cannot share the font here. I tried a few other open-source fonts, but did not encounter the same issue.

  for (const codePoint of font.characterSet) {
    const glyph = font.glyphForCodePoint(codePoint)
    console.log(
      glyph.name,
      codePoint,
      font._cmapProcessor.codePointsForGlyph(glyph.id)[0]
    )
  }

Wit our font, for most glyphs, I am getting correct results, such as:

.null 0 0
uni000D 13 13
space 32 32
exclam 33 33
quotedbl 34 34
numbersign 35 35
dollar 36 36
percent 37 37
ampersand 38 38
quotesingle 39 39
parenleft 40 40
…

But for quite a few, codePointsForGlyph() doesn't produce the inverse of glyphForCodePoint():

uniE001 57345 undefined
uniE002 57346 undefined
uniE100 57600 undefined
uniE101 57601 undefined
uniE102 57602 undefined
uniE103 57603 undefined
uniE104 57604 undefined
uniE105 57605 undefined
uniE106 57606 undefined
uniE107 57607 undefined
uniE108 57608 undefined
uniE109 57609 undefined
uniE10A 57610 undefined
uniE10B 57611 undefined
uniE10C 57612 undefined
uniE10D 57613 undefined
…

This simple PR fixes this bug:

uniE001 57345 57345
uniE002 57346 57346
uniE100 57600 57600
uniE101 57601 57601
uniE102 57602 57602
uniE103 57603 57603
uniE104 57604 57604
uniE105 57605 57605
uniE106 57606 57606
uniE107 57607 57607
uniE108 57608 57608
uniE109 57609 57609
uniE10A 57610 57610
uniE10B 57611 57611
uniE10C 57612 57612
uniE10D 57613 57613
…
fi 64257 64257
fl 64258 64258
uniFEFF 65279 65279
.notdef 65535 65535
blikblum commented 1 month ago

Creating a test case that fails in current code and is fixed with the PR would help

lehni commented 1 month ago

Like I said, I'm only encountering this issue with one font so far, and I cannot share this font for reasons of copyright.

Harbs commented 1 month ago

The change is correct IMO and seems obvious. case 4 in lookup earlier in the file has return gid & 0xffff; which is simply the inverse of this.