google / fonts

Font files available from Google Fonts, and a public issue tracker for all things Google Fonts
https://fonts.google.com

Many instances of Noto Sans have misaligned combining marks for `Aa with Caron` #6920

Open eliheuer opened 1 year ago

eliheuer commented 1 year ago

See this search query and the screenshot below: https://fonts.google.com/?preview.text=%C7%8D%C7%8E&query=Noto+sans

Screenshot 2023-10-26 at 9 24 07 PM

This issue came up in this PR (https://github.com/google/fonts/pull/6766), but looking into it further I noticed many fonts with this problem and decided to make a general issue.

simoncozens commented 1 year ago

Ack, you're right. The problem comes from the fact that acaron is not in GF Latin Core, but a is (of course) and so is caroncomb, so the glyph can be formed by Unicode composition. a and caroncomb do have anchors, but something's obviously going wrong in the mark attachment.
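To illustrate the "Unicode composition" point, here is a minimal sketch using Python's standard `unicodedata` module. It shows that the two characters in the preview text (U+01CD and U+01CE) canonically decompose into a base letter plus U+030C COMBINING CARON, which is why a font without a precomposed acaron glyph relies on mark attachment. The helper name `decompose` is only for illustration.

```python
import unicodedata


def decompose(ch: str) -> list[str]:
    """Return the NFD (canonical) decomposition of ch as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


# The two characters from the preview text in the search query above.
for ch in "\u01CD\u01CE":
    print(f"U+{ord(ch):04X}", "->", decompose(ch))
```

When the font has no glyph for the precomposed codepoint, the shaper can fall back to the decomposed sequence, and the caron is then positioned by the mark feature rather than by a hand-drawn composite.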

simoncozens commented 1 year ago

The situation seems to be due to the fact that UFO provides two ways to define glyph categories: you can use public.openTypeCategories in lib.plist, or you can write a table GDEF block in the features.fea file.

  1. There is a Noto font with a Designspace source that defines its own GDEF table explicitly in the features.fea file.
  2. We merge in a new UFO for the Latin subset, but this defines its GDEF categories in the lib.plist, not in the features.fea file. (If we did have two explicit GDEF tables in features.fea, this would cause ufomerger a different headache.)
  3. The explicit GDEF wins, so the new Latin marks are not considered marks unless they happen to exist in the original font.
  4. The mark feature writer looks at the explicit GDEF, decides that "caroncomb" is not a mark, so doesn't emit mark attachment rules for it.

I think the fix is (a) watch out for this situation in ufomerger, and emit a loud warning, (b) retool Designspace-based Noto sources to use public.openTypeCategories.
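A rough sketch of what the check in (a) might look like, in plain Python with no font libraries. This is not ufomerger's actual code or API: the function names are hypothetical, the features.fea contents are passed in as a string, and the lib.plist data as an already-parsed dict. It assumes the UFO lib key is `public.openTypeCategories` with values such as "base" and "mark", and it only detects that an explicit GDEF block exists; it does not check whether that block already lists the incoming marks.

```python
import re
import warnings

# Assumed UFO lib key holding the glyph name -> category mapping.
CATEGORIES_KEY = "public.openTypeCategories"

# Matches an explicit "table GDEF { ... } GDEF;" block in feature code.
GDEF_BLOCK = re.compile(r"table\s+GDEF\s*\{.*?\}\s*GDEF\s*;", re.DOTALL)


def has_explicit_gdef(fea_text: str) -> bool:
    """True if the feature code contains an explicit GDEF table block."""
    without_comments = re.sub(r"#.*", "", fea_text)  # drop FEA comments
    return GDEF_BLOCK.search(without_comments) is not None


def marks_at_risk(target_fea: str, incoming_lib: dict) -> list[str]:
    """Incoming glyphs categorised as marks only via lib data, while the
    target font declares its categories in an explicit GDEF block."""
    if not has_explicit_gdef(target_fea):
        return []
    categories = incoming_lib.get(CATEGORIES_KEY, {})
    return sorted(name for name, cat in categories.items() if cat == "mark")


def check_merge(target_fea: str, incoming_lib: dict) -> list[str]:
    """Warn loudly when the explicit GDEF may shadow incoming marks."""
    at_risk = marks_at_risk(target_fea, incoming_lib)
    if at_risk:
        warnings.warn(
            "Target has an explicit GDEF table; these incoming marks may "
            "not get mark attachment rules: " + ", ".join(at_risk)
        )
    return at_risk
```

For example, calling `check_merge` with a target whose features.fea has a GDEF block and an incoming lib that marks "caroncomb" as a mark would return `["caroncomb"]` and emit the warning.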

RosaWagner commented 1 year ago

I guess there are 3 subjects here to take into consideration when talking about mark attachment:

  1. Noto fonts whose primary script is not LCG are not expected to have proper mark attachment. They must cover Kernel as a minimum. It is nice that they support Latin Core, and even better if mark attachment works, but it is really not an issue that mark attachment is broken (even more so for codepoints they don't cover). It would be great if the ufomerger issue described by Simon gets fixed, but it is something we can definitely ignore for the non-LCG Noto fonts (until the next wave of upgrades).

  2. For Noto Sans, Noto Serif and Noto Sans Mono, we do expect proper mark attachment and complete Latin support. We noticed yesterday, thanks to the new fontbakery check, that it was not handled consistently within the file, but not to the point of breaking mark attachment. The codepoints are there, the mark attachment is working, anchors are everywhere. The only issue is that legacy accents are sometimes used in composite glyphs and combining marks are sometimes components of legacy accents (so they get replaced by the legacy accent during export when being un-nested). None of that is very bad, it is just a bit inconsistent; we can ignore it until it gets cleaned up and fixed (for Noto only), because it is going to take time to fix (it's a very big and complicated file).

  3. The subsetter issue described in multiple issues (e.g. https://github.com/google/fonts/issues/6542), which doesn't seem to be at play here though.

cc @emmamarichal @m4rc1e @vv-monsalve @chrissimpkins so everyone gets the info.