fonttools / fontbakery

🧁 A font quality assurance tool for everyone
https://fontbakery.readthedocs.io
Apache License 2.0

New check: Check that glyph for U+0675 ARABIC LETTER HIGH HAMZA is not a mark #4290

Open khaledhosny opened 1 year ago

khaledhosny commented 1 year ago

What needs to be checked?

U+0675 ARABIC LETTER HIGH HAMZA should be a base glyph, not a mark, and should be the same size as U+0621 ARABIC LETTER HAMZA but sit slightly higher above the baseline.

Detailed description of the problem

Many fonts incorrectly treat U+0675 ARABIC LETTER HIGH HAMZA as a variant of U+0654 ARABIC HAMZA ABOVE and make it a combining mark of the same size. But U+0675 is a base letter and should be a variant of U+0621 ARABIC LETTER HAMZA, raised slightly above the baseline.

Resources and steps needed to reproduce the problem

The current version of Noto Sans Arabic has this issue, as do many other fonts.

Suggested profile

Suggested result

Which log result level should the check have:

Severity assessment

4 as this effectively makes it useless for Jawi and possibly Kazakh as well.

simoncozens commented 6 months ago

All this is true, but it should apply to U+0674 (HIGH HAMZA), not U+0675 (HIGH HAMZA ALEF).
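For illustration only, the base-versus-mark part of the requested check could be sketched as below. This is not fontbakery's implementation: the function names `is_mark` and `tables_from_font` are hypothetical, and the sketch assumes that "is a mark" means "has glyph class 3 in the GDEF table". Only `tables_from_font` touches fontTools, so the core logic can be tried with plain dictionaries.

```python
MARK_CLASS = 3  # GDEF GlyphClassDef value for mark glyphs (1 = base, 2 = ligature)


def is_mark(codepoint, cmap, glyph_classes):
    """Return True/False for an encoded codepoint, or None if it is not in cmap.

    cmap maps codepoints to glyph names; glyph_classes maps glyph names
    to GDEF glyph class values. Both are plain dicts.
    """
    glyph_name = cmap.get(codepoint)
    if glyph_name is None:
        return None  # the font does not encode this character at all
    return glyph_classes.get(glyph_name) == MARK_CLASS


def tables_from_font(path):
    """Hypothetical helper: pull the two dicts out of a font file with fontTools."""
    from fontTools.ttLib import TTFont  # imported lazily; only needed here

    font = TTFont(path)
    cmap = font.getBestCmap() or {}
    glyph_classes = {}
    if "GDEF" in font and font["GDEF"].table.GlyphClassDef is not None:
        glyph_classes = font["GDEF"].table.GlyphClassDef.classDefs
    return cmap, glyph_classes


# Example with made-up glyph names: a font that wrongly classes U+0674 as a mark.
example_cmap = {0x0621: "hamza-ar", 0x0674: "highhamza-ar"}
example_classes = {"hamza-ar": 1, "highhamza-ar": 3}
high_hamza_is_mark = is_mark(0x0674, example_cmap, example_classes)
```

A real check would probably also want to look at the advance width, since a glyph can behave like a mark without being classed as one in GDEF; that is left out of this sketch.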

bobh0303 commented 4 months ago

[U+0674] ARABIC LETTER HIGH HAMZA should be ... the same size as U+0621 ARABIC LETTER HAMZA but slightly higher above baseline.

Where does this information come from? Unicode's current code charts certainly do not suggest this.

khaledhosny commented 4 months ago

[U+0674] ARABIC LETTER HIGH HAMZA should be ... the same size as U+0621 ARABIC LETTER HAMZA but slightly higher above baseline.

Where does this information come from? Unicode's current code charts certainly do not suggest this.

This is based on its use as a three-quarter hamza in Jawi and on native readers' preferences. See https://www.unicode.org/L2/L2022/22051-jawi-hamza.pdf and the Unicode response in https://www.unicode.org/L2/L2022/22068-script-adhoc-rept.pdf. Though the recommendation there is to have a Jawi-specific variant glyph and rely on language tagging, my understanding is that this extra complication is unnecessary, as a full-size hamza seems to be acceptable for Kazakh; see, for example, many of the samples shown in https://www.unicode.org/L2/L2020/20289-kazakh-kyrgyz-uyghur-annot.pdf.

bobh0303 commented 3 months ago

Thanks. So the result of Script Ad Hoc two years ago says:

However, the general consensus amongst the group was to unify THREE QUARTER HIGH HAMZA with U+0674 HIGH HAMZA, and to highly encourage font designers to support the glyph with the expected shape and positioning for Jawi, employing a language tag or creating a Jawi-specific font. Encoding a new character would take several years, and text representation would be inconsistent, causing other problems for users.

This suggests to me that the portion of this fontbakery test that confirms the glyph size is similar to U+0621 should apply only to fonts that have such a language tag or are known to be designed specifically for Jawi.

In fact one could argue that in all other cases (i.e., no Jawi language tag and font not known to be specific to Jawi) the size check should be against U+0654.

bobh0303 commented 3 months ago

should apply only to fonts that have such language tag

Actually, it is slightly more nuanced than that.

If the font has a Jawi language tag, then fontbakery should test the 0674 glyph size with and without that language applied, and the glyph size should be similar to 0621 in the first case and 0654 in the second.
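As a rough sketch of the rule described in the comments above (not an existing fontbakery check), the choice of reference glyph and the size comparison might look like this. The names `reference_codepoint`, `similar_size` and `BBox`, and the 15% tolerance, are assumptions made for the example; how to detect a Jawi language system in a font, and how to obtain the glyph actually rendered under it, is left open here.

```python
from typing import NamedTuple

HAMZA = 0x0621        # ARABIC LETTER HAMZA
HAMZA_ABOVE = 0x0654  # ARABIC HAMZA ABOVE


class BBox(NamedTuple):
    """Glyph bounding box in font units."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def reference_codepoint(jawi_shaping_active, jawi_specific_font=False):
    """Pick the character whose glyph U+0674 should be compared against.

    With Jawi shaping applied (or in a font known to be Jawi-specific) the
    reference is U+0621; in every other case it is U+0654.
    """
    if jawi_shaping_active or jawi_specific_font:
        return HAMZA
    return HAMZA_ABOVE


def similar_size(a, b, tolerance=0.15):
    """True if width and height each differ by at most `tolerance` (assumed value)."""
    width_a, height_a = a.x_max - a.x_min, a.y_max - a.y_min
    width_b, height_b = b.x_max - b.x_min, b.y_max - b.y_min
    return (
        abs(width_a - width_b) <= tolerance * max(width_a, width_b)
        and abs(height_a - height_b) <= tolerance * max(height_a, height_b)
    )
```

For a font with a Jawi language system, the comparison would be run twice: once on the glyph produced with that language applied (against `reference_codepoint(True)`) and once on the default glyph (against `reference_codepoint(False)`).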

bobh0303 commented 3 months ago

The soon-to-be-released Unicode 16 adds information about Jawi usage to the discussion of high hamza. The Review Draft of Ch 9 now reflects the Script Ad Hoc recommendation, specifically saying:

Malay Jawi uses U+0674 ٴ ARABIC LETTER HIGH HAMZA. In Jawi, the letter is the same size as U+0621 ء ARABIC LETTER HAMZA; however, unlike U+0621, it is positioned above the baseline at three-quarters height of the U+0627 ا ARABIC LETTER ALEF. Font designers can use language tagging in order to support the preferred shapes for both Kazakh and Jawi in multilingual fonts

With that in view, it seems this test really needs to evaluate the size of high hamza based on langtag-specific rendering within the font. Whether this is even possible with facilities within fontbakery, I do not know. But it seems clear that the following logic would be needed: