papandreou / subset-font

Create a subset of a TrueType/OpenType/WOFF/WOFF2 font using the wasm build of harfbuzz/hb-subset
BSD 3-Clause "New" or "Revised" License

Options to configure what to retain in the name table of a font? #7

Closed bjrn closed 3 years ago

bjrn commented 3 years ago

Hi, this might be more related to harfbuzzjs, but since the options would likely need to be set in subset-font, I reckoned I'd start here … I'm looking for a way to retain the Font License in the subsetted woff2 files.

Glyphhanger, which uses pyftsubset under the hood, has a config section regarding what to retain in the name table of a subsetted font:

...

Font naming options:
  These options control what is retained in the 'name' table. For numerical
  codes, see: http://www.microsoft.com/typography/otspec/name.htm
  --name-IDs[+|-]=<nameID>[,<nameID>...]
      Specify (=), add to (+=) or exclude from (-=) the set of 'name' table
      entry nameIDs that will be preserved. By default only nameID 1 (Family)
      and nameID 2 (Style) are preserved. Use '*' to keep all entries.
      Examples:
        --name-IDs+=0,4,6
            * Also keep Copyright, Full name and PostScript name entry.
        --name-IDs=''
            * Drop all 'name' table entries.
        --name-IDs='*'
            * keep all 'name' table entries

My use case, retaining the font license information, would mean adding a --name-IDs=13,14 flag according to the docs.

ID Description
13 License Description; description of how the font may be legally used, or different example scenarios for licensed use. This field should be written in plain language, not legalese.
14 License Info URL; URL where additional licensing information can be found.

I verified that parts of the name table, like "Copyright", are retained by dropping the file generated by subset-font onto wakamaifondue. The license info is not retained, and as far as I can tell, it's not enabled by default in pyftsubset/glyphhanger either; one has to specify the flag.

papandreou commented 3 years ago

Totally makes sense to expose options for this here, but yeah, someone will need to dive into harfbuzz(js) to figure out how to do it 😅. Happy to help with whatever I can, but I've always had to seek the assistance of @ebraminio when diving down there.

ebraminio commented 3 years ago

The related API, hb_subset_input_nameid_set, is exposed in harfbuzzjs's subset build. Hopefully it works, as long as it isn't disabled by HB_TINY; if it is, HB_NO_NAME should be put in https://github.com/harfbuzz/harfbuzzjs/blob/main/subset/config-override.h and I should make a new release for it. @papandreou, please have a look at these and confirm, thanks! :)

papandreou commented 3 years ago

Amazing! @bjrn, if you have a working wasm build tool chain you can maybe play around with it? 🤗

Otherwise I can try to help later.

papandreou commented 3 years ago

I've taken a quick look at it, and being a good boy I'm starting by trying to write some tests. So the first order of business is to find a module or tool that can list the name ids that are present in a font represented by a buffer.

Looks like fontkit doesn't expose an easy way to get the parsed name table.

There's good old ttx:

$ ttx -t name -o - testdata/OpenSans.ttf 
Dumping "testdata/OpenSans.ttf" to "-"...
Dumping 'name' table...
<?xml version="1.0" encoding="UTF-8"?>
<ttFont sfntVersion="\x00\x01\x00\x00" ttLibVersion="4.24">
  <name>
    <namerecord nameID="0" platformID="3" platEncID="1" langID="0x409">
      Digitized data copyright © 2010-2011, Google Corporation.
    </namerecord>
    <namerecord nameID="1" platformID="3" platEncID="1" langID="0x409">
      Open Sans
    </namerecord>
    <namerecord nameID="2" platformID="3" platEncID="1" langID="0x409">
      Regular
    </namerecord>
    <namerecord nameID="3" platformID="3" platEncID="1" langID="0x409">
      1.10;1ASC;OpenSans-Regular
    </namerecord>
    <namerecord nameID="4" platformID="3" platEncID="1" langID="0x409">
      Open Sans Regular
    </namerecord>
    <namerecord nameID="5" platformID="3" platEncID="1" langID="0x409">
      Version 1.10
    </namerecord>
    <namerecord nameID="6" platformID="3" platEncID="1" langID="0x409">
      OpenSans-Regular
    </namerecord>
    <namerecord nameID="14" platformID="3" platEncID="1" langID="0x409">
      http://www.apache.org/licenses/LICENSE-2.0
    </namerecord>
  </name>
</ttFont>

... but it'd be kinda annoying to have a dev dependency on Python and fonttools just to gain access to that tool in the test suite 😕

I guess we can't use harfbuzz itself for this?
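
For an uncompressed TrueType/OpenType buffer, the name IDs can also be read straight out of the sfnt table directory without any extra dependency. The following is only a minimal sketch under that assumption: `listNameIds` is a hypothetical helper (not part of subset-font, fontkit, or harfbuzzjs), it does no bounds checking, and it does not handle WOFF or WOFF2 input, which would have to be decompressed first.

```javascript
// Hypothetical helper: list the distinct nameIDs found in the 'name' table
// of an uncompressed sfnt (TTF/OTF) buffer.
function listNameIds(buffer) {
  // Offset table: uint32 sfntVersion, uint16 numTables, then 3 more uint16s
  const numTables = buffer.readUInt16BE(4);
  for (let i = 0; i < numTables; i += 1) {
    // Table records are 16 bytes each: tag, checksum, offset, length
    const recordOffset = 12 + i * 16;
    const tag = buffer.toString('latin1', recordOffset, recordOffset + 4);
    if (tag !== 'name') continue;

    const tableOffset = buffer.readUInt32BE(recordOffset + 8);
    // 'name' header: uint16 version, uint16 count, uint16 storage offset
    const count = buffer.readUInt16BE(tableOffset + 2);
    const ids = new Set();
    for (let j = 0; j < count; j += 1) {
      // Name records are 12 bytes each; nameID is the 4th uint16 field
      ids.add(buffer.readUInt16BE(tableOffset + 6 + j * 12 + 6));
    }
    return [...ids].sort((a, b) => a - b);
  }
  return []; // no 'name' table found
}
```

Calling `listNameIds(require('fs').readFileSync('testdata/OpenSans.ttf'))` on the font dumped above should list the same nameID values that ttx reports, assuming the file is a plain .ttf.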

papandreou commented 3 years ago

It does seem to work fine, though: https://github.com/papandreou/subset-font/compare/feature/preserveNameId

@ebraminio, all good, hb_subset_input_nameid_set is present in the latest released harfbuzzjs 😌

bjrn commented 3 years ago

Wow, nice work! I agree about having to add a dev dependency on Python, so I did a little digging: wakamaifondue uses LibFont to get metadata out of the font (see its nameTable parsing and the variables that map name table IDs to readable names).

It also appears as if FontKit has support for more names than it is exposing: code for name table extraction

I did a quick check to see if it was possible to get the meta data from FontKit somehow, and it seems like this could work:

const fontkit = require('fontkit');
// `file` is the path to the font on disk
const font = fontkit.openSync(file);
console.log(font.name.records);

FontKit exposes some default properties like font.fullName, but it appears as if font.name.records exposes the full set (although each record is an object keyed by language).

Running it on a .ttf version of Noto Sans Bold yields the following:

{
  copyright: { en: 'Copyright 2012 Google Inc. All Rights Reserved.' },
  fontFamily: { en: 'Noto Sans' },
  fontSubfamily: { en: 'Bold' },
  uniqueSubfamily: { en: 'Monotype Imaging - Noto Sans Bold' },
  fullName: { en: 'Noto Sans Bold' },
  version: { en: 'Version 1.04' },
  postscriptName: { en: 'NotoSans-Bold' },
  trademark: {
    en: 'Noto is a trademark of Google Inc. and may be registered in certain jurisdictions.'
  },
  manufacturer: { en: 'Monotype Imaging Inc.' },
  designer: { en: 'Monotype Design team' },
  description: { en: 'Designed by Monotype design team' },
  vendorURL: { en: 'http://code.google.com/p/noto/' },
  designerURL: {
    en: 'http://www.monotypeimaging.com/ProductsServices/TypeDesignerShowcase'
  },
  license: { en: 'Licensed under the Apache License, Version 2.0' },
  licenseURL: { en: 'http://www.apache.org/licenses/LICENSE-2.0' }
}

I also verified that FontKit font.name.records works with a .woff2 version that I subsetted with pyftsubset, including the --name-IDs=13,14 flag.

So I guess this would mean that FontKit could be used in the tests instead of ttx. I won't have time during the day, but if you want I can take a stab at testing it on your branch tonight?
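
A test built on this could check which of the expected records are missing from a subsetted font. This is a rough sketch only: `missingNameRecords` is a hypothetical helper, and it assumes the records object has the shape shown in the output above (record name mapped to an object keyed by language).

```javascript
// Hypothetical helper: given an object shaped like fontkit's
// `font.name.records`, return the requested record names that are
// absent or have no language entries.
function missingNameRecords(records, expectedNames) {
  return expectedNames.filter((name) => {
    const entry = records[name];
    return !entry || Object.keys(entry).length === 0;
  });
}

// In a real test, `records` would come from the parsed subset, e.g.
// fontkit.create(subsetBuffer).name.records (assumed usage).
const records = {
  fontFamily: { en: 'Noto Sans' },
  license: { en: 'Licensed under the Apache License, Version 2.0' },
};

// Only licenseURL is reported, since license is present
console.log(missingNameRecords(records, ['license', 'licenseURL']));
```

An empty result would mean that every expected record survived subsetting.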

papandreou commented 3 years ago

Ah, nice, I fiddled around with fontkit, but didn't realize that it could be used like that. Updated the branch now.

papandreou commented 3 years ago

Released the support for preserveNameIds in 1.3.0 just now. We can add more detailed controls later if need be :)
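
For the original use case, a call might look roughly like the sketch below. It assumes that `preserveNameIds` takes an array of numeric name IDs and is passed in the options object next to `targetFormat`. `subsetKeepingLicense` is a hypothetical wrapper, and the `subsetFont` function is passed in as a parameter so the sketch has no hard dependency; in real use it would be `require('subset-font')`.

```javascript
// Name IDs 13 (License Description) and 14 (License Info URL),
// as quoted from the name table docs earlier in this thread.
const LICENSE_NAME_IDS = [13, 14];

// Hypothetical wrapper around subset-font. `subsetFont` is injected;
// the option names are assumed to match the 1.3.0 release.
async function subsetKeepingLicense(subsetFont, fontBuffer, text) {
  return subsetFont(fontBuffer, text, {
    targetFormat: 'woff2',
    preserveNameIds: LICENSE_NAME_IDS,
  });
}
```

Usage would then be something like `await subsetKeepingLicense(require('subset-font'), fontBuffer, 'Hello, world!')`, with the result checked via font.name.records as described above.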