notofonts / notobuilder

Python module for building Noto fonts

Noto fonts missing PostScript glyph names #25

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I've noticed for a number of Noto fonts (e.g., Devanagari, Bengali, Javanese, 
Balinese) that they don't have PostScript glyph names in their post tables. 
This means that PDF readers can't reliably reconstruct the original text from 
PDF documents using glyphs from these fonts. Glyphs that are referenced by the 
cmap table can be reconstructed, but glyphs inserted only via the GSUB table 
cannot. This is particularly problematic for Brahmic scripts, where typically a 
large number of glyphs are ligatures and conjuncts that are only inserted via 
the GSUB table.

Adobe has published a Glyph List Specification that describes how glyph names 
are mapped back to Unicode characters – this can be used in reverse as 
guidance for constructing glyph names from the Unicode characters they 
represent.
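
The two directions described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: it handles the `uniXXXX` and `uXXXXX` forms and the `_` and `.` separators from the Glyph List Specification, but it does not consult the AGL name list itself (so a name such as `Eng` decodes to an empty string here). The function names `name_for_text` and `text_for_name` are hypothetical and not part of any existing tool.

```python
import re

# One "uniXXXX[XXXX...]" component: groups of four uppercase hex digits.
_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)\Z")
# One "uXXXX" .. "uXXXXXX" component: four to six uppercase hex digits.
_U = re.compile(r"u([0-9A-F]{4,6})\Z")


def _is_scalar(cp):
    """True for valid Unicode scalar values (surrogates are excluded)."""
    return 0 <= cp <= 0x10FFFF and not 0xD800 <= cp <= 0xDFFF


def name_for_text(text, suffix=""):
    """Build a glyph name for the character sequence a glyph represents."""
    if not text:
        raise ValueError("a glyph name needs at least one character")
    parts = []
    for ch in text:
        cp = ord(ch)
        if not _is_scalar(cp):
            raise ValueError("surrogate code points cannot be named")
        # BMP characters use uniXXXX, supplementary ones use uXXXXX.
        parts.append("uni%04X" % cp if cp <= 0xFFFF else "u%X" % cp)
    name = "_".join(parts)  # ligature components are joined by underscores
    return name + "." + suffix if suffix else name


def text_for_name(name):
    """Recover the character sequence from a glyph name (AGL list not used)."""
    base = name.split(".", 1)[0]  # drop everything from the first period
    out = []
    for comp in base.split("_"):
        m = _UNI.match(comp)
        if m:
            digits = m.group(1)
            cps = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
        else:
            m = _U.match(comp)
            cps = [int(m.group(1), 16)] if m else []
        # A component with any invalid code point maps to the empty string.
        if all(_is_scalar(cp) for cp in cps):
            out.extend(chr(cp) for cp in cps)
    return "".join(out)
```

For instance, a Devanagari conjunct built from U+0915 U+094D U+0937 would be named `uni0915_uni094D_uni0937` by `name_for_text`, and `text_for_name` maps that name back to the same three characters.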

The fonts should define PostScript names that are usable for the reconstruction 
of the original character sequence from PDF documents.

Original issue reported on code.google.com by googled...@lindenbergsoftware.com on 13 Feb 2015 at 2:53

GoogleCodeExporter commented 9 years ago
In case it helps, here is the official URL for the AGL Specification: 
http://sourceforge.net/adobe/aglfn/wiki/AGL%20Specification/

Original comment by ken.lu...@gmail.com on 13 Feb 2015 at 2:20

GoogleCodeExporter commented 9 years ago

Original comment by stua...@google.com on 28 Feb 2015 at 12:35

brawer commented 8 years ago

Curious, will this be fixed? It shouldn’t be hard to write a little Python script that assigns AGL-compliant glyph names. For example, here is a snippet from the GSUB table of NotoSans-Regular.ttf in TTX format. From this, it can be inferred that the name of glyph02366 should be uni03D0.alt.

<AlternateSubst>
    <AlternateSet glyph="uni03D0">
        <Alternate glyph="glyph02366"/>
    </AlternateSet>
</AlternateSubst>

brawer commented 8 years ago

Here is a code snippet (as a patch for fonttools) that walks the GSUB table, looks for alternates, and uses them to suggest AGL-compliant names for glyphs that have no name yet. https://github.com/brawer/fonttools/commit/7c5769db961a1f34a4853cf1c1ea8ecaf0804502

Example use:

$ python Snippets/assign-agl-glyph-names.py tmp/NotoSans-Regular.ttf 
Should rename glyph02363 to Eng.alt1
Should rename glyph02364 to Eng.alt2
Should rename glyph02365 to Eng.alt3
Should rename glyph02366 to uni03D0.alt
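
For readers who want to try the idea without the linked patch, here is a rough standalone sketch that works on a TTX (XML) dump instead of walking the binary GSUB table. It is not the code from the commit above. Assumptions: unnamed glyphs carry placeholder names of the form `glyphNNNNN`, a single alternate gets the suffix `.alt`, and several alternates are numbered `.alt1`, `.alt2`, ... by their position in the set. The `Eng` set in the sample is reconstructed from the example output, not copied from the font, and `suggest_names` is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: glyphs without a real name look like "glyph02366".
PLACEHOLDER = re.compile(r"glyph\d+\Z")

# Sample input. The uni03D0 set is the snippet quoted earlier in this thread;
# the Eng set is reconstructed from the example output (an assumption).
SAMPLE_TTX = """
<GSUB>
  <AlternateSubst>
    <AlternateSet glyph="Eng">
      <Alternate glyph="glyph02363"/>
      <Alternate glyph="glyph02364"/>
      <Alternate glyph="glyph02365"/>
    </AlternateSet>
  </AlternateSubst>
  <AlternateSubst>
    <AlternateSet glyph="uni03D0">
      <Alternate glyph="glyph02366"/>
    </AlternateSet>
  </AlternateSubst>
</GSUB>
"""


def suggest_names(ttx_xml):
    """Return {placeholder name: suggested name} from AlternateSet elements."""
    root = ET.fromstring(ttx_xml)
    suggestions = {}
    for alt_set in root.iter("AlternateSet"):
        base = alt_set.get("glyph")
        if base is None or PLACEHOLDER.match(base):
            continue  # the base glyph has no usable name to derive from
        alternates = [a.get("glyph") for a in alt_set.findall("Alternate")]
        for index, glyph in enumerate(alternates, start=1):
            if glyph is None or not PLACEHOLDER.match(glyph):
                continue  # already has a real name, leave it alone
            if glyph in suggestions:
                continue  # keep the first suggestion for a glyph
            suffix = ".alt" if len(alternates) == 1 else ".alt%d" % index
            suggestions[glyph] = base + suffix
    return suggestions


if __name__ == "__main__":
    for old, new in sorted(suggest_names(SAMPLE_TTX).items()):
        print("Should rename %s to %s" % (old, new))
```

Working on the XML dump keeps the sketch free of dependencies; a real tool would more likely walk the GSUB table through fonttools directly and would also need to handle ligature and other substitution types.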

@behdad, what do you think, should I make this a proper tool within the fonttools codebase? It actually would be a great way for me to figure out how to walk GSUB tables, and not that much work; so it would be a nice complement to my current feaLib work.

khaledhosny commented 8 years ago

Smart PDF creators can embed the original text in the PDF in various ways (cmap, ActualText, etc.), and since they have access to the actual text string, they don’t need glyph names.

The only case where glyph names are needed is when the original text is missing during PDF creation, e.g. when distilling PDF from PostScript. But this only works for simple scripts, and I doubt that glyph names help much with the complex scripts mentioned, especially when there is reordering and one-to-many glyph substitution.

brawer commented 8 years ago

Even if the impact isn’t huge, would there be any downside to complying with Adobe’s glyph naming conventions?

khaledhosny commented 8 years ago

File size, I guess.

roozbehp commented 8 years ago

Adobe's glyph naming conventions are not necessarily nice for understanding what a glyph is. I highly recommend shipping with "source" glyph names, i.e. whatever the font designer prefers to use so they would understand the glyph better. This also helps downstream modifications and contributions by helping people understand a glyph better.

Processes that care about file size can simply strip out the glyph names.

/cc @kenlunde /cc @norbertlindenberg

kenlunde commented 8 years ago

Note that the AGL Specification and the AGL & AGLFN project have been retired from SourceForge, and are now hosted on GitHub at agl-specification and agl-aglfn.

hrhatada commented 7 years ago

Hi Xiangye & Marek, it sounds like a production decision is needed for this issue.

ghost commented 6 years ago

Seeing as this issue hasn't been picked up for the past few years, shouldn't this be closed?