google / woff2

MIT License

Is there a way to specify the whole "more than one woff2 for each range"? #142

Closed Pomax closed 3 years ago

Pomax commented 3 years ago

A large part of the power of WOFF2 lies in being able to use multiple sources based on the same font for separate unicode ranges (essential for CJK webfonts). How would one specify that for this tool? And if that's something it can already do, can that information be added to the README.md?

rsheeter commented 3 years ago

I'm not quite sure I follow? - this tool just takes a font file and produces the woff2 version.

Pomax commented 3 years ago

Right, but when woff2 got introduced, the other thing that got introduced was the unicode-range CSS property for @font-face, so you can turn a single (huge) opentype font into several woff2 sources, each representing a subset of the font's unicode range that only gets loaded when needed and not before. That way you could load (for instance) a Chinese font piecemeal, instead of forcing people to download a single 12MB font before they can even see your content.

So for a Google-made tool, I would expect there to be a way to say how many/which unicode ranges you need the opentype font split up into, so that even though you start with one opentype font, the result is multiple woff2 files to be used in a @font-face ruleset.
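
To make the intended end result concrete, here is a minimal sketch that generates one @font-face rule per woff2 piece, each with its own unicode-range. The helper name `font_face_rules`, the family name, the file names, and the ranges are all hypothetical and chosen only for illustration; nothing here is produced by the woff2 tool itself.

```python
def font_face_rules(family, subsets):
    """Build one @font-face rule per (woff2_url, unicode_range) pair."""
    rules = []
    for url, unicode_range in subsets:
        rules.append(
            "@font-face {\n"
            f'  font-family: "{family}";\n'
            f'  src: url("{url}") format("woff2");\n'
            f"  unicode-range: {unicode_range};\n"
            "}"
        )
    return "\n\n".join(rules)


# Hypothetical file names and ranges, for illustration only.
css = font_face_rules(
    "Example CJK",
    [
        ("example-cjk.0.woff2", "U+4E00-4FFF"),
        ("example-cjk.1.woff2", "U+5000-51FF"),
    ],
)
print(css)
```

Because every rule shares the same font-family, the browser treats the pieces as one font and only fetches a piece when the page uses a character inside its unicode-range.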

jfkthame commented 3 years ago

The unicode-range descriptor considerably predates WOFF2, if my memory serves correctly. You can split a font into several pieces using unicode-range regardless of what format you're using to deliver those sub-fonts.

rsheeter commented 3 years ago

unicode-range was specified but not widely implemented long before woff2. Typically you would cut up your font and then woff2 the pieces. In theory https://rsheeter.github.io/font101/ provides all the information necessary to do this.

I think that means nothing to do for this repo so closing.

Pomax commented 3 years ago

Except there would be high value for the official google woff2 tool to offer splitting by specifying unicode ranges on the command line. Have you tried looking for other tools that let you achieve this? Because there flat out are none =)

@davelab6 any chance to align this with the webfont effort?

vlevantovsky commented 3 years ago

In order to accomplish what you desire, one needs to combine a font subsetter and a woff2 encoder into a single tool ... which is exactly what PFE is doing. The only difference is that in the PFE scenario the character set / codepoints included in a woff2 subset are specified by a client, and the tool is capable of producing incremental subsets. However, the same tool could be used to create multiple woff2 subsets where the character sets are defined by unicode ranges.

rsheeter commented 3 years ago

The tool proposed is perfectly reasonable but to put such a tool in this repo would be to change the scope we aim to provide here. The intent is that the tools here are useful to either a user agent or to someone building scripts or tools to produce compressed fonts.

I believe the only thing missing for your use case is something that takes input in unicode-range form and pushes it into your subsetter of choice. IMO leading choices are:

  1. FontTools. If you install the brotli package, it knows how to both subset fonts and write woff2 files.
  2. hb-subset to cut up the font, the cli encoder in this repo to make the pieces into woff2 files.

Pomax commented 3 years ago

hb-subset would technically work, but would require additional scripting to first generate the text files with unicode code points before you can run it, which is... less than ideal. That said, maybe hb-subset can be amended to allow for a --unicode-range argument instead of this tool (https://github.com/harfbuzz/harfbuzz/issues/3038); then you're right, the combination of the two tools would be only marginally inconvenient, and still a good solution that can be taught to others when they ask about using woff2 for CJK languages.
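
For reference, the "additional scripting" step can be sketched roughly as follows: parse a CSS-style unicode-range value into code point ranges, then expand those into the text a subsetter could be fed. The function names `parse_unicode_range` and `ranges_to_text` are hypothetical, and the sketch deliberately stops short of invoking hb-subset, since the exact way to pass the text depends on the subsetter and version in use.

```python
def parse_unicode_range(value):
    """Parse a CSS unicode-range value into (start, end) code point pairs."""
    ranges = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        if not token.upper().startswith("U+"):
            raise ValueError(f"not a unicode-range token: {token!r}")
        body = token[2:]
        if "?" in body:
            # Wildcard form: U+4?? covers U+400 through U+4FF.
            start = int(body.replace("?", "0"), 16)
            end = int(body.replace("?", "F"), 16)
        elif "-" in body:
            low, high = body.split("-", 1)
            start, end = int(low, 16), int(high, 16)
        else:
            start = end = int(body, 16)
        if start > end or end > 0x10FFFF:
            raise ValueError(f"invalid range: {token!r}")
        ranges.append((start, end))
    return ranges


def ranges_to_text(ranges):
    """Expand ranges into a string, skipping surrogates (not encodable as UTF-8)."""
    chars = []
    for start, end in ranges:
        for cp in range(start, end + 1):
            if 0xD800 <= cp <= 0xDFFF:
                continue
            chars.append(chr(cp))
    return "".join(chars)


ranges = parse_unicode_range("U+26, U+4E00-4E02, U+4??")
text = ranges_to_text(ranges)
```

The resulting `text` could then be written to a UTF-8 file, one file per range, before running the subsetter on each.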

Given how much more consideration CJK needs, it might also be worth updating the README.md with a short "For CJK fonts, you will want to ..." so that those folks don't just come away from the woff2 experience thinking the only option for CJK webfonts is single, impossibly huge, font files.

(And with that use case in mind, I'd argue that adding a range runtime flag wouldn't really be out of scope; it'd simply be part of the conversion process for non-latin users whose languages span huge code blocks.)

anthrotype commented 3 years ago

the fonttools subset script has a --unicodes option that takes unicode ranges as well as comma separated unicode values (as hex digits): https://fonttools.readthedocs.io/en/latest/subset/

maybe hb-subset could/should have the same, if not already
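
A minimal sketch of the FontTools route, assuming fontTools and the brotli package are installed: build the argument list for the subset script using its `--unicodes`, `--flavor`, and `--output-file` options, then hand it to `fontTools.subset.main`. The helper names `pyftsubset_args` and `write_woff2_subset` and all file names are hypothetical, and this is one possible way to wire it up rather than a recommended workflow from the maintainers.

```python
def pyftsubset_args(font_path, unicode_ranges, out_path):
    """Build the argument list for fontTools' subset script."""
    # --unicodes takes comma-separated hex values and START-END ranges.
    unicodes = ",".join(
        f"{start:04X}" if start == end else f"{start:04X}-{end:04X}"
        for start, end in unicode_ranges
    )
    return [
        font_path,
        f"--unicodes={unicodes}",
        "--flavor=woff2",  # writing woff2 requires the brotli package
        f"--output-file={out_path}",
    ]


def write_woff2_subset(font_path, unicode_ranges, out_path):
    """Subset one font to the given ranges and save it as woff2."""
    # Imported lazily so the argument builder works without fontTools installed.
    from fontTools import subset

    subset.main(pyftsubset_args(font_path, unicode_ranges, out_path))


args = pyftsubset_args("font.otf", [(0x4E00, 0x4FFF), (0x26, 0x26)], "out.woff2")
print(args)
```

Calling `write_woff2_subset` once per range, with a different output file each time, would yield the set of woff2 pieces discussed above.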