fonttools / fontbakery

🧁 A font quality assurance tool for everyone
https://fontbakery.readthedocs.io
Apache License 2.0

Checks for dottedCircle (U+25CC)? #3600

Closed madig closed 8 months ago

madig commented 2 years ago

Random thoughts

A secret project recently made me think of the dottedCircle (U+25CC) glyph. A quick search of the OpenType specification turns up some potentially clarifying results, saying that Windows' USE uses it for displaying defective clusters: https://docs.microsoft.com/en-us/typography/script-development/use#defective-clusters. I was wondering if anyone has thought of a checklist to make sure this glyph works correctly for all the scripts that seem to make use of it, like Lao and Khmer? Would this be something one can automate in fontbakery? One particular issue is that in a UFO/fontmake-based workflow, the anchors in your dottedCircle have to match what you need for all the scripts you support, i.e. if you're missing a bottomright anchor, the circle will work only for marks that attach to the anchors it does have. Not sure if this needs a per-codepoint list of anchors and where they should attach?

Also, anyone know of any other uses?

felipesanches commented 2 years ago

I currently lack any knowledge on this topic. I would appreciate it if others (perhaps @simoncozens, @davelab6, @tphinney, or anyone else who may know about this stuff) could give some feedback here on @madig's question.

tphinney commented 2 years ago

Checking that it is “correct” is a big ask, but we could at least get close!

simoncozens commented 2 years ago

I was going to say something similar to Thomas. For point two, the "(list of glyphs)" might just be all mark glyphs; that's the simplest. And "(list of writing systems)" is all Indic scripts, all USE scripts, Khmer, Myanmar, and Hangul (because tone marks).

I wouldn't bother with point three. If you have all the anchors, you're probably going to put them in the right places, and if the end-user is seeing a dotted circle everything's gone wrong anyway.

felipesanches commented 2 years ago

"all Indic scripts, all USE scripts, Khmer, Myanmar, and Hangul (because tone marks)."

We'll need a fontbakery @condition that detects these.

felipesanches commented 2 years ago

if the font supports one or more of (list of writing systems), check for anchors present in (list of glyphs). All those anchors should ALSO be present in dotted circle, FAIL if not

I can start prototyping a check for this that will run only on fonts that have dottedCircle.

Later we can improve it to check for the other things.

I'll give it the straightforward check-id com.google.fonts/check/dotted_circle.
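The quoted rule ("all those anchors should ALSO be present in dotted circle") can be sketched at the UFO level as below. This is only an illustration, not the fontbakery implementation: the function name, the glyph names, and the plain-dict input are hypothetical, and it assumes the common source convention that a mark's attaching anchor is named with a leading underscore (`_top` on the mark pairs with `top` on the base).

```python
def missing_dotted_circle_anchors(anchors_by_glyph, mark_glyphs, dotted_circle="uni25CC"):
    """Return base anchor names that marks need but the dotted circle lacks.

    anchors_by_glyph: hypothetical mapping of glyph name -> set of anchor names.
    mark_glyphs: glyph names to treat as combining marks.
    """
    needed = set()
    for glyph in mark_glyphs:
        for name in anchors_by_glyph.get(glyph, ()):
            # Assumed convention: "_top" on a mark attaches to "top" on a base.
            if name.startswith("_"):
                needed.add(name[1:])
    # Only non-underscore anchors on the dotted circle count as base anchors.
    available = {
        name
        for name in anchors_by_glyph.get(dotted_circle, ())
        if not name.startswith("_")
    }
    return sorted(needed - available)


# Illustrative data only: a dotted circle that lacks a bottomright anchor.
anchors = {
    "uni25CC": {"top", "bottom"},
    "gravecomb": {"_top"},
    "uni0EB8": {"_bottomright"},
}
missing = missing_dotted_circle_anchors(anchors, ["gravecomb", "uni0EB8"])
```

With this data, `missing` contains the single anchor the dotted circle would need to gain; a real check would work from the compiled font's mark attachment data rather than from source anchors.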

felipesanches commented 2 years ago

And I'll place it on the universal profile.

felipesanches commented 2 years ago

do we have sample font files for this?

simoncozens commented 2 years ago

This will be very helpful; many Noto issues are about missing U+25CC. FWIW, ufo2ft has lists of INDIC_SCRIPTS and USE_SCRIPTS.

bobh0303 commented 2 years ago

And "(list of writing systems)" is all Indic scripts, all USE scripts, Khmer, Myanmar, and Hangul (because tone marks).

Since U+25CC is often used with marks for descriptive/pedagogical purposes (e.g. in The Unicode Standard itself), shouldn't "(list of writing systems)" be all scripts that encode combining marks?

simoncozens commented 2 years ago

And another one: https://github.com/googlefonts/noto-fonts/issues/2248

bobh0303 commented 2 years ago

Since U+25CC is often used with marks for descriptive/pedagogical purposes (e.g. in The Unicode Standard itself), shouldn't "(list of writing systems)" be all scripts that encode combining marks?

Or, perhaps more practical: skip the whole notion of testing for specific scripts / writing systems and, instead, require 25CC to be present and have appropriate anchors for any font that includes combining marks.

So, for example, a pure Latin-1 font wouldn't require anchors on 25CC, but a Latin font that includes marks (e.g., things from 0300 block) would.
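A minimal sketch of that script-agnostic trigger, assuming that "includes combining marks" means "encodes at least one codepoint whose Unicode general category starts with M". The function names are hypothetical, and the input is just a set of codepoints such as one taken from a font's cmap.

```python
import unicodedata


def combining_marks(codepoints):
    """Return the encoded codepoints whose general category is Mn, Mc or Me."""
    return sorted(
        cp for cp in codepoints if unicodedata.category(chr(cp)).startswith("M")
    )


def needs_dotted_circle(codepoints):
    """Assumed rule: any encoded combining mark makes U+25CC a requirement."""
    return bool(combining_marks(codepoints))


# A pure Latin-1 repertoire versus the same repertoire plus two marks
# from the Combining Diacritical Marks block (U+0300, U+0301).
latin1 = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))
latin_with_marks = latin1 | {0x0300, 0x0301}

latin1_needs_circle = needs_dotted_circle(latin1)
marks_need_circle = needs_dotted_circle(latin_with_marks)
```

The spacing accents in Latin-1 (such as U+00A8 and U+00B4) are modifier symbols rather than combining marks, so the pure Latin-1 set does not trigger the requirement here.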

simoncozens commented 2 years ago

There's some justification for this. And it makes things easier. But I'm not convinced that the 25CC glyph itself should be required for all scripts. Could we agree on:

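  • 25CC must be present for scripts where the shaper is going to try inserting one.
  • If it is present, all marks should be able to attach to it.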

bobh0303 commented 2 years ago

But I'm not convinced that the 25CC glyph itself should be required for all scripts. Could we agree on:

  • 25CC must be present for scripts where the shaper is going to try inserting one.
  • If it is present, all marks should be able to attach to it.

Question: Are there scripts where shapers might insert 25CC in contexts unrelated to combining marks?

At any rate, we (SIL) currently require 25CC in any of our fonts that include combining marks but at this point we don't have a way to test whether appropriate anchors are present, so your test requirements as stated will be helpful to us.

simoncozens commented 2 years ago

Question: Are there scripts where shapers might insert 25CC in contexts unrelated to combining marks?

Nope. I went through the HarfBuzz source when compiling the list above.

so your test requirements as stated will be helpful to us.

Excellent. And I think it would work for Noto too.

simoncozens commented 2 years ago

@felipesanches Do you want to / are you working on this? I'm happy to implement it.

khaledhosny commented 2 years ago

HarfBuzz will insert U+25CC for any combining mark at the beginning of text (provided that HB_BUFFER_FLAG_BOT is set). So I’d be inclined to simply make it required everywhere. It also provides a nice way to show standalone combining marks, which only works if the marks and the circle come from the same font.

felipesanches commented 2 years ago

@felipesanches Do you want to / are you working on this? I'm happy to implement it.

feel free to do it ;-)

felipesanches commented 2 years ago

The basic initial check was implemented by @simoncozens and has now been reviewed and merged by me.

Please open a followup issue for any of the additional behavior that may be proposed for this check.

tphinney commented 2 years ago

Had an interesting discussion about this today that included @RosaWagner @vv-monsalve @felipesanches @m4rc1e ... I believe the agreement was as follows:

1) For fonts that meet the criteria of needing the dotted circle because they do one or more of the following:

Then Fontbakery should FAIL if EITHER (a) the dotted circle is not present, OR (b) it is present but the needed combining diacritics for those particular scripts do not attach to it.

2) If a font does not require dotted circle because of (1) requirements, but has one anyway, then Fontbakery should check that all combining diacritics attach to the dotted circle. (This will catch situations like IPA fonts that have a dotted circle.) If they do not, then WARN

3) Perhaps the IPA character set should include a requirement for dotted circle; but this question is independent of the above.

4) Perhaps Fontmake could have an option to automatically add a dotted circle for fonts that meet (1) above, but do not have one.

simoncozens commented 2 years ago

OK. For reference, the current check does:

If there is no dotted circle, FAIL if the font is a complex shaper font else WARN. If there is a dotted circle, FAIL if there are unattached marks.

You want:

If the font is a complex shaper font, FAIL if there is no dotted circle or if there are unattached marks. If not, WARN if there are unattached marks.

So all the code is there and it is just a matter of shuffling the if conditions around. :-)
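The two orderings of the conditions can be put side by side as below. This is a sketch of the logic as described in this comment, not the fontbakery source: the function names and boolean parameters are hypothetical, and `has_unattached` is only meaningful when a dotted circle exists. Returning PASS for a non-complex font with no dotted circle is an assumption based on the agreement summarised earlier in the thread.

```python
def current_status(is_complex, has_circle, has_unattached):
    """Behaviour described as 'the current check'."""
    if not has_circle:
        return "FAIL" if is_complex else "WARN"
    return "FAIL" if has_unattached else "PASS"


def proposed_status(is_complex, has_circle, has_unattached):
    """Behaviour described under 'You want'."""
    if is_complex:
        if not has_circle or has_unattached:
            return "FAIL"
        return "PASS"
    # Non-complex font: a dotted circle is optional, but if one is present
    # every mark should attach to it. (PASS without a circle is an assumption.)
    if has_circle and has_unattached:
        return "WARN"
    return "PASS"


# The two places where the results differ for a non-complex font:
no_circle = (current_status(False, False, False), proposed_status(False, False, False))
unattached = (current_status(False, True, True), proposed_status(False, True, True))
```

For complex shaper fonts both versions give the same result; the differences are confined to fonts that do not use a complex shaper.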

Perhaps Fontmake could have an option to automatically add a dotted circle for fonts that meet (1) above, but do not have one.

Yeah done. :-) https://github.com/googlefonts/ufo2ft/pull/593

aaronbell commented 1 year ago

I'm a bit confused why Hangeul was included with this check as a FAIL as modern Hangeul use does not include any diacritics nor complex shaping.

simoncozens commented 1 year ago

Hangul technically does use complex shaping, ~but you're right that it doesn't need a dotted circle~. And the Hangul complex shaper does insert dotted circles.

aaronbell commented 1 year ago

Old Hangeul uses complex shaping and has diacritic marks that require dotted circles. Contemporary Hangeul does not.

In the complex shaper file you sent, it specifically mentions ljmo, vjmo, and tjmo, which are not included in modern Hangeul fonts.

I think the check would be more precise if it made sure that it is looking at an Old Hangeul font versus a modern one.
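One possible heuristic for that distinction, assuming that the presence of any of the `ljmo`, `vjmo`, or `tjmo` OpenType features is a reasonable signal of Old Hangeul support. The function name is hypothetical and nothing here reflects what fontbakery actually does; the input is simply a collection of feature tags collected from the font's layout tables.

```python
# Feature tags for leading, vowel and trailing jamo forms, as mentioned above.
OLD_HANGUL_FEATURES = {"ljmo", "vjmo", "tjmo"}


def looks_like_old_hangul(feature_tags):
    """Heuristic: True if the font declares any conjoining-jamo feature."""
    return bool(OLD_HANGUL_FEATURES & set(feature_tags))


modern = looks_like_old_hangul(["kern", "liga", "vert"])
old = looks_like_old_hangul(["ccmp", "ljmo", "vjmo", "tjmo"])
```

A check built on this could require the dotted circle for Hangeul only when the heuristic is true, and skip modern Hangeul fonts.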