googlefonts / ufo2ft

A bridge from UFOs to FontTools objects (and therefore, OTFs and TTFs).
MIT License

Auto-generated kern feature could have wrong languagesystems #112

Open brawer opened 7 years ago

brawer commented 7 years ago

Does anyone understand why the current ufo2ft code runs a regexp over features.fea to find the language systems for the auto-generated features? The current logic seems a little surprising, and it might not work reliably.

Instead of the current logic, why doesn’t the code consider the Unicode codepoints being kerned? It would be tempting to just re-write this code, but I’d like to understand the rationale behind the current logic. Maybe there’s a reason for it?

khaledhosny commented 7 years ago

The language systems in the GSUB or GPOS features need to match those declared for the font; otherwise the feature will not be applied for the missing ones. E.g. if the font has:

languagesystem arab dflt;
languagesystem arab ARA;
languagesystem arab URD;

But the kern feature has only:

script arab;
language dflt;

Then the kern feature will not be applied to either ARA or URD languages.
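
To make the mismatch concrete, here is a minimal Python sketch that compares the font's declared language systems with the ones a feature is registered under. The function name and the tuple representation are assumptions made for illustration; they are not part of ufo2ft.

```python
def missing_language_systems(font_langsys, feature_langsys):
    """Return the (script, language) pairs declared for the font
    that the feature is not registered under, in declaration order."""
    registered = set(feature_langsys)
    return [pair for pair in font_langsys if pair not in registered]


# The example above: three declared language systems, kern only under arab/dflt.
font_langsys = [("arab", "dflt"), ("arab", "ARA"), ("arab", "URD")]
kern_langsys = [("arab", "dflt")]

print(missing_language_systems(font_langsys, kern_langsys))
# → [('arab', 'ARA'), ('arab', 'URD')]
```

An empty result would mean the feature covers every declared language system.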

twardoch commented 7 years ago

MakeOTF uses the "languagesystem" statements at the beginning of a FEA file as a list of all script and languagesystem tags into which each feature will be registered if it does not explicitly use "script" and "language" statements.
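
For reference, collecting those statements from FEA text could look roughly like the sketch below. It is not the actual ufo2ft code; the regular expression and the function name are assumptions, and it ignores "include" statements and "#" characters inside strings, which is part of why a regexp approach can be fragile.

```python
import re

# Assumed pattern: "languagesystem <script> <language>;" with tags of 1-4 characters.
LANGSYS_RE = re.compile(
    r"\blanguagesystem\s+([A-Za-z0-9]{1,4})\s+([A-Za-z0-9]{1,4})\s*;"
)


def parse_language_systems(fea_text):
    """Return (script, language) pairs in the order they appear in fea_text."""
    # Drop "#" comments first so commented-out statements are not picked up.
    without_comments = re.sub(r"#.*", "", fea_text)
    return LANGSYS_RE.findall(without_comments)


fea = """\
languagesystem latn dflt; languagesystem latn TRK;
# languagesystem cyrl dflt;
feature locl { } locl;
"""
print(parse_language_systems(fea))
# → [('latn', 'dflt'), ('latn', 'TRK')]
```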

In pre-2.0 AFDKO, it was possible to supply "script" and "language" statements only, for every feature in FEA, and omit the initial "languagesystem" list entirely, but today MakeOTF will complain (with a warning or error, don't remember).

The common practice is that the initial "languagesystem" list lists all scripts and langsyses that are used in the GSUB or GPOS tables of the font.

With MakeOTF, I don't think it's possible to provide only a few scripts or langsyses there, and then "unexpectedly" introduce other scripts or langsyses later. So the assumption is that the "languagesystem" list is complete.

If there is no FEA at all, only kerning.plist, then the kern feature generator is expected to generate a list of languagesystem statements that will include OT script tags for ALL Unicode scripts to which the font's kerned glyphs belong.
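
A rough Python sketch of that step is shown below. It assumes the kerned glyphs have already been mapped to Unicode codepoints, and it guesses the script from the character name using a deliberately tiny, hypothetical prefix table; a real implementation would use the Unicode Scripts property data instead.

```python
import unicodedata

# Hypothetical and incomplete: character-name prefix -> OpenType script tag.
NAME_PREFIX_TO_OT_SCRIPT = {
    "LATIN": "latn",
    "CYRILLIC": "cyrl",
    "GREEK": "grek",
    "ARABIC": "arab",
}


def script_tags_for_codepoints(codepoints):
    """Return the sorted OpenType script tags covered by the given codepoints."""
    tags = set()
    for cp in codepoints:
        name = unicodedata.name(chr(cp), "")
        prefix = name.split(" ", 1)[0]
        tag = NAME_PREFIX_TO_OT_SCRIPT.get(prefix)
        if tag is not None:  # punctuation, digits, unknown scripts are skipped
            tags.add(tag)
    return sorted(tags)


# A, V, Cyrillic A, Cyrillic U, full stop
kerned = [0x0041, 0x0056, 0x0410, 0x0423, 0x002E]
print(script_tags_for_codepoints(kerned))
# → ['cyrl', 'latn']
```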

twardoch commented 7 years ago

A good implementation of a kern feature writer would recreate the list of languagesystems by creating a union of:

  1. Script tags (with "dflt" langsys tag) that result from the font's Unicode codepoints coverage for the kerned glyphs
  2. Script and languagesystem tags that are already specified in features.fea if present, as they may include non-dflt langsys tags that are used e.g. in "locl".
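
The union of the two sources above can be sketched in Python as follows. The function name and the (script, language) tuple representation are assumptions for illustration; the existing declarations keep their order, and each kerned script is added with the "dflt" language tag only if it is not already present.

```python
def merge_language_systems(existing, kerned_scripts):
    """Union of the language systems already in features.fea and a
    (script, "dflt") pair for every script that has kerned glyphs."""
    merged = list(dict.fromkeys(existing))  # de-duplicate, keep original order
    seen = set(merged)
    for script in kerned_scripts:
        pair = (script, "dflt")
        if pair not in seen:
            merged.append(pair)
            seen.add(pair)
    return merged


existing = [("latn", "dflt"), ("latn", "TRK")]
print(merge_language_systems(existing, ["latn", "cyrl"]))
# → [('latn', 'dflt'), ('latn', 'TRK'), ('cyrl', 'dflt')]
```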

So if I have code in features.fea that does Turkish ligature pre-handling in "locl" and have in my features.fea:

languagesystem latn dflt;
languagesystem latn TRK;

and the computed Unicode coverage of kerning.plist shows that the font has kerned glyphs from the "latn" and "cyrl" scripts, then after adding the "kern" feature, the initial list should be:

languagesystem latn dflt;
languagesystem latn TRK;
languagesystem cyrl dflt;

possibly with

languagesystem DFLT dflt;

also prepended. The "kern" feature must be registered in all those langsyses explicitly, including the "latn TRK" one, or else kerning won't work there.
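
Writing out the final list, with "DFLT dflt" placed first when it is wanted, might look like this minimal Python sketch. The function and parameter names are hypothetical; the input is assumed to be a list of (script, language) tuples.

```python
DEFAULT_LANGSYS = ("DFLT", "dflt")


def render_language_systems(pairs, prepend_default=True):
    """Render (script, language) pairs as FEA languagesystem statements,
    placing "DFLT dflt" first when requested or already present."""
    rest = [pair for pair in pairs if pair != DEFAULT_LANGSYS]
    has_default = prepend_default or DEFAULT_LANGSYS in pairs
    ordered = ([DEFAULT_LANGSYS] if has_default else []) + rest
    return "\n".join(
        f"languagesystem {script} {language};" for script, language in ordered
    )


pairs = [("latn", "dflt"), ("latn", "TRK"), ("cyrl", "dflt")]
print(render_language_systems(pairs))
# → languagesystem DFLT dflt;
#   languagesystem latn dflt;
#   languagesystem latn TRK;
#   languagesystem cyrl dflt;
```

The same list of pairs is what the generated "kern" feature would then need to be registered under.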