w3c / alreq

Documenting gaps and requirements for support of Arabic and Persian on the Web and in eBooks.

Hijri calendar abbreviation for Urdu #229

Open r12a opened 4 years ago

r12a commented 4 years ago

The Unicode Standard says:

These connecting forms commonly occur in some abbreviations such as the marker for hijri dates, which consists of an initial form of heh: 

The glyph at the end isn't based on a Unicode code point, but looks like an initial form of ه [U+0647 ARABIC LETTER HEH] (with a ZWJ or tatweel to apply the initial form), which makes sense in Arabic, since it represents the word هجری.

However, ARABIC LETTER HEH isn't used in Urdu, and what examples i've seen of hijri dates appear to use ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE] instead. See some examples.
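For reference, the candidate encodings mentioned so far can be listed code point by code point. This is only a sketch using Python's standard `unicodedata` module; the dictionary labels and the `describe` helper are names made up for illustration, and nothing here settles which sequence is right for Urdu.

```python
import unicodedata

# Candidate encodings for the hijri marker discussed above.
# The labels are illustrative only, not an established naming.
CANDIDATES = {
    "HEH + ZWJ": "\u0647\u200D",
    "HEH + TATWEEL": "\u0647\u0640",
    "HEH DOACHASHMEE": "\u06BE",
}


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, sequence in CANDIDATES.items():
    print(label, "->", describe(sequence))
```

Whether a given sequence actually renders as an initial form still depends on the font and the shaping engine, which this listing cannot show.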

So i have 2 questions:

  1. Is the abbreviation for hijri dates indeed HEH DOACHASHMEE in Urdu?
  2. If so, is a ZWJ necessary, since the glyph for the isolated form looks pretty much the same as the initial form for that letter?

khaledhosny commented 4 years ago

HEH DOACHASHMEE often takes a different form than Arabic initial Heh, so I think it is not the right Unicode character. The closest Unicode character, I think, would be U+1EE24 ARABIC MATHEMATICAL INITIAL HEH (𞸤), as the same glyph in metal type days would be used wherever that shape of isolated heh is needed, e.g. in Abjad numerals, in abbreviations like ا.𞸤. (short for انتهى) and so on. The same goes for other letters like Alef, Jeem, and Dal: the form used in Abjad numbers is the same as the form used in math.

Unfortunately, Arabic math symbols are a relatively new addition to Unicode and most fonts lack them (and, being specialized symbols, they are unlikely to become widely supported). What users usually type is Heh followed by tatweel, which is the closest approximation readily available on Arabic keyboard layouts.
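As a rough illustration of how the math symbol relates to plain Heh in the character database, the sketch below looks up U+1EE24 with Python's `unicodedata` module and checks whether compatibility normalization (NFKC) folds it back to U+0647. The constant names and the `folds_to_heh` helper are hypothetical, and the result reflects whatever Unicode data ships with the Python build in use.

```python
import unicodedata

MATH_INITIAL_HEH = "\U0001EE24"   # ARABIC MATHEMATICAL INITIAL HEH
TYPED_FALLBACK = "\u0647\u0640"   # Heh followed by tatweel, as commonly typed


def folds_to_heh(ch):
    """True if NFKC normalization maps ch to U+0647 ARABIC LETTER HEH."""
    return unicodedata.normalize("NFKC", ch) == "\u0647"


print(unicodedata.name(MATH_INITIAL_HEH))
print(unicodedata.decomposition(MATH_INITIAL_HEH))
print(folds_to_heh(MATH_INITIAL_HEH))
```

One practical consequence is that text using the math symbol may lose it in any pipeline that applies NFKC, while the Heh plus tatweel fallback passes through unchanged.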

But checking a few old books, I see an initial Heh without the tail also being used (i.e. the initial form), which can be achieved with ZWJ as well.

jfkthame commented 4 years ago

> HEH DOACHASHMEE often takes a different form than Arabic initial Heh

I'm not sure what "different forms" you're referring to here? The examples I've seen of an abbreviation for Hijri on Urdu calendars look very much like typical HEH DOACHASHMEE to me, and I would have assumed they're simply U+06BE. Or possibly <U+06BE, U+200D> if we think it should be an initial form, though it can be hard to tell in hand-written calligraphy whether that was intended.

r12a commented 4 years ago

+1 to @jfkthame's comment. And btw the spelling of the word ہجری in Urdu is different from the Arabic wherever i've seen it, since it uses ہ U+06C1 ARABIC LETTER HEH GOAL at the start, rather than ه U+0647 ARABIC LETTER HEH.

It was actually because ہ U+06C1 ARABIC LETTER HEH GOAL is different from ھ U+06BE ARABIC LETTER HEH DOACHASHMEE that i wanted to double check my understanding. In the past those two characters were used interchangeably, whereas now the latter is generally reserved for aspirated plosives. But, like @jfkthame, everywhere i have seen the hijri date abbreviation so far, including in Wikipedia text, it actually looks like doachashmee and in electronic data uses that code point.
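To make the distinction between the three heh letters concrete, here is a small sketch that prints their names and compares the first letter of the two spellings. The code point sequences for the Urdu and Arabic words are assumptions written out by hand for illustration (only the first letter matters here), and `first_letter_name` is a hypothetical helper.

```python
import unicodedata

# The three heh letters that come up in this thread.
HEH_VARIANTS = ["\u0647", "\u06BE", "\u06C1"]

# Assumed spellings, given as explicit code points for illustration.
URDU_HIJRI = "\u06C1\u062C\u0631\u06CC"    # starts with HEH GOAL
ARABIC_HIJRI = "\u0647\u062C\u0631\u064A"  # starts with HEH


def first_letter_name(word):
    """Return the Unicode character name of the first code point in word."""
    return unicodedata.name(word[0])


for ch in HEH_VARIANTS:
    # None of these letters is changed by NFKC, so plain string
    # comparison and search treat them as three distinct characters.
    unchanged = unicodedata.normalize("NFKC", ch) == ch
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unchanged)

print(first_letter_name(URDU_HIJRI))
print(first_letter_name(ARABIC_HIJRI))
```

Since normalization does not unify these letters, the choice of code point for the abbreviation affects searching and matching in electronic data, not just appearance.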

r12a commented 4 years ago

I pointed to this issue from the Unicore list, and received emails from 4 people saying that ھ U+06BE ARABIC LETTER HEH DOACHASHMEE seems to be the appropriate letter.

khaledhosny commented 4 years ago

I was talking specifically about Arabic, not Urdu, so my comment does not really apply to the issue being asked. But here is a comparison between the 3 characters in Amiri and Noto Nastaliq Urdu; the form of HEH DOACHASHMEE is clearly different from HEH (other fonts might use different forms and might not make such a distinction). As expected, Noto Nastaliq Urdu does not have the MATHEMATICAL INITIAL HEH, but that is the form I’d find more appropriate for Hijri in Arabic (judging by hand-written calligraphy):

[image: comparison of the 3 characters in Amiri and Noto Nastaliq Urdu]