notofonts / arabic

Noto Arabic
SIL Open Font License 1.1
Alef height #243

Closed MAZ06 closed 1 month ago

MAZ06 commented 1 month ago

Font

NotoNaskhArabic[wght].ttf NotoSansArabic[wdth,wght].ttf NotoKufiArabic[wght].ttf

Where the font came from, and when

https://github.com/notofonts/arabic/releases

Font Version

Noto Naskh Arabic 2.019 Noto Sans Arabic 2.0.12 Noto Kufi Arabic 2.109

OS name and version

Windows 11

Application name and version

Everything

Issue

Typing آ (U+0622) results in the alef becoming shorter, but typing ا (U+0627) + ۤ (U+06E4) keeps the alef the same height.

This happens in both the Sans and Kufi style, but not in the Naskh style. It also does not affect the final form ﺂ (U+FE82).

But it does affect ﺁ (U+FE81).

EDIT: After looking at fonts.google.com, it seems to be a mixed bag as to how different Arabic fonts handle this issue. Many fonts don't even render alef+madda at all! I have no idea what the correct way is supposed to be.

Character data

آ (U+0622) ا (U+0627) ۤ (U+06E4) ﺂ (U+FE82) ﺂ (U+FE82) ﺁ (U+FE81)

Screenshot

image

MAZ06 commented 1 month ago

The alef also becomes smaller for إ (U+0625) and ا (U+0627) + ٕ (U+0655), except in Naskh and Kufi, where both إ (U+0625) and ا (U+0627) + ٕ (U+0655) stay tall.

image

The alef becomes smaller for أ (U+0623) and ا (U+0627) + ٔ (U+0654) in both Sans and Kufi. Naskh remains the same height again.

image

khaledhosny commented 1 month ago

Typing آ (U+0622) results in the alef becoming shorter, but typing ا (U+0627) + ۤ (U+06E4) keeps the alef the same height.

These are not canonically equivalent sequences and are not expected to render the same. U+0622 decomposes to U+0627 + U+0653, not U+06E4.

The other issue is a bug and should be fixed with #246.
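
The canonical-equivalence point above can be checked locally. This is a minimal sketch using Python's standard `unicodedata` module; it is not part of the original thread, and the helper names `canonical_decomposition` and `canonically_equivalent` are illustrative only.

```python
import unicodedata


def canonical_decomposition(ch: str) -> list[str]:
    """Return the NFD form of `ch` as a list of 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# ALEF WITH MADDA ABOVE decomposes to ALEF + MADDAH ABOVE.
print(canonical_decomposition("\u0622"))

# Equivalent to U+0627 + U+0653, but not to U+0627 + U+06E4.
print(canonically_equivalent("\u0622", "\u0627\u0653"))
print(canonically_equivalent("\u0622", "\u0627\u06E4"))
```

Because U+0627 + U+06E4 is a different sequence as far as normalization is concerned, a font is free to shape it differently from U+0622.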

MAZ06 commented 1 month ago

You are correct about U+0622! But why do 0653 and 06E4 look exactly the same? Shouldn't they be different sizes?

khaledhosny commented 1 month ago

I don’t believe so. U+06E4 is badly named in Unicode. It does not have to be smaller. It is used in Quranic text to indicate vowel prolongation in recitation, while U+0622 does not get used (and by extension neither does U+0653, because of the canonical decomposition). The exact shape and size play no role in its function, and they indeed usually look exactly the same.
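
For reference, the Unicode character names and decomposition data discussed here can be printed with a short sketch, again using only Python's standard `unicodedata` module. The `describe` helper is a hypothetical name chosen for this example.

```python
import unicodedata


def describe(cp: int) -> dict:
    """Collect a few Unicode Character Database properties for a code point."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        # Empty string means the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
    }


for cp in (0x0622, 0x0653, 0x06E4):
    print(describe(cp))
```

The names show where the "small" in the question comes from: U+06E4 is called ARABIC SMALL HIGH MADDA, while U+0653 is ARABIC MADDAH ABOVE. The name alone says nothing about how a given font draws the mark.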

MAZ06 commented 1 month ago

What's the point of having a small madda if it looks exactly the same as a normal madda anyway?

khaledhosny commented 1 month ago

It does not compose with U+0627.

MAZ06 commented 1 month ago

You mean the alef doesn't get smaller? Well, at least that looks different.

khaledhosny commented 1 month ago

When you type U+0627 + U+06E4, they remain two code points, which allows the position of U+06E4 to be handled in a special way (in classical fonts that handle the subtleties of Quranic text). U+0627 + U+0653, by contrast, can get canonically composed into U+0622, which prevents such handling. The semantics are different, too (U+0653 is a form of hamza, while U+06E4 is a vowel mark).

Anyway, that is why I (and others) use U+06E4 for the Quranic vowel prolongation mark. As for why Unicode chose to encode both and not unify them, I have no idea; you will have to ask them. I have never seen a small madda, so I can only guess that they named it incorrectly.
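
The composition behaviour described above can be illustrated with NFC normalization. This is a sketch under the assumption that the text passes through a normalizing step somewhere (an editor, a database, a web form); `nfc_code_points` is an illustrative helper, not something from the thread.

```python
import unicodedata


def nfc_code_points(text: str) -> list[str]:
    """Return the NFC form of `text` as a list of 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)]


# ALEF + MADDAH ABOVE is composed into the single code point U+0622.
print(nfc_code_points("\u0627\u0653"))

# ALEF + SMALL HIGH MADDA has no composed form and stays two code points.
print(nfc_code_points("\u0627\u06E4"))

# The hamza sequences mentioned earlier in the thread compose the same way.
print(nfc_code_points("\u0627\u0654"))  # ALEF + HAMZA ABOVE
print(nfc_code_points("\u0627\u0655"))  # ALEF + HAMZA BELOW
```

Since the U+06E4 sequence survives normalization as two code points, a font can position the mark independently of the alef.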

khaledhosny commented 1 month ago

I think there is nothing actionable left here and this issue should be closed.