r12a / scripts

Various pages and tools for working with non-Latin scripts
http://r12a.github.io/doclist

Ye with hamzah above recommendation in Urdu notes is incorrect #117

Closed bgo-eiu closed 1 year ago

bgo-eiu commented 1 year ago

The page recommends using a separate, attaching hamzah plus ye rather than the dedicated character ئ. (One part instead says ye + shadda, یّ, but I assume this was an unintentional mistake; shadda would not go there.)

The reasoning given is flawed. It is true that Urdu uses ی, but the reason it does so is to keep the word-final form of the letter from having two dots beneath. Hamza is used in combination with vowels in Urdu, but not in the same way when it occurs in Sanskrit-derived Urdu/Hindustani words, which are native to the subcontinent. In nearly all situations where ye and hamzah occur together, it is before another letter. The only time you will see ئ at the end of a word is when that word was loaned directly from Arabic, in which case it would be correct to say that it is meant to be ي + hamzah anyway.

The biggest reason why this recommendation is a problem is that it results in incorrect rendering of this combined letter. Urdu fonts, keyboard layouts, etc. all use ئ, and typically render the hamzah above it differently than it would be rendered for Arabic and differently from how the attaching character would be rendered.

For example, at this URL: https://r12a.github.io/scripts/arabic/block#char0626

image The letter is not legible here.

The site at http://udb.gov.pk/ hosts the Urdu Dictionary Board's Urdu Lughat, which is maintained by the government of Pakistan as a sort of standard digital lexicon for Urdu. They do not use the separate attaching hamzah in their entries. If you type both the separated character combination and ئ in the search field, you can see that even when a font is able to render both, they do not get the same treatment.

image

The left is the correct way to type the common letter combination ئے, with the U+0626 character. The right is the incorrect way, with separate hamzah and farsi ye, as یٔے. Notice that the font actually adds back the dots at the end when you use farsi ye in this way. Despite what the character decomposition might suggest, ئ is the only character that may be used to correctly render this Urdu letter in all positions and letter combinations. When ئ connects with ے in Nastaliq writing of Urdu, the hamzah is actually supposed to change position and go over the bari ye. We do not get this result if we treat the hamzah as separate. We could maybe say that Urdu has an independent farsi ye and uses Arabic ye with hamza, but this doesn't really matter. It makes more sense just to think of ئ as a different letter.
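As a rough illustration of the two spellings compared above, here is a minimal Python sketch that lists their code points. It assumes the "incorrect" spelling is U+06CC (farsi ye) followed by U+0654 (combining hamza above); the constant and function names are made up for this example.

```python
import unicodedata

# The two spellings compared above, each followed by bari ye (U+06D2).
# Assumption: the "separate" spelling is farsi ye + combining hamza above.
PRECOMPOSED = "\u0626\u06D2"       # ئے
SEPARATE = "\u06CC\u0654\u06D2"    # یٔے

def code_points(text):
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for label, text in (("precomposed", PRECOMPOSED), ("separate", SEPARATE)):
    print(label, code_points(text))

# U+0626 canonically decomposes to Arabic yeh (U+064A) + hamza above,
# not to farsi ye, so normalization never turns one spelling into the other.
print(unicodedata.normalize("NFC", SEPARATE) == PRECOMPOSED)  # False
```

Because the two sequences are not canonically equivalent, software has no reason to treat them as the same letter, which is consistent with the different rendering shown in the screenshots.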

bgo-eiu commented 1 year ago

Here is another comparison, before and after changing to U+0626 in the browser console:

Before: image

After: image

When the correct character is used, you can see the actual position differences which are supposed to occur for it. ئ + ی at the end of a word should also have no dots underneath, but ئ + ی followed by a consonant should have dots underneath only the ی.
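The kind of substitution described in this comment could be sketched as follows. This is not the change that was actually made in the browser console (which the thread does not show); `to_precomposed` is a hypothetical helper, and it assumes the text to fix uses U+06CC followed by U+0654.

```python
import re

# Hypothetical helper: rewrite farsi ye (U+06CC) + combining hamza above
# (U+0654) as the single character U+0626.
_SEPARATE_YEH_HAMZA = re.compile("\u06CC\u0654")

def to_precomposed(text):
    """Replace every U+06CC U+0654 pair in text with U+0626."""
    return _SEPARATE_YEH_HAMZA.sub("\u0626", text)

# A plain farsi ye with no hamza is left untouched.
print(to_precomposed("\u06CC\u0654\u06D2") == "\u0626\u06D2")  # True
```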

r12a commented 1 year ago

Thanks for your detailed notes. I'm currently using up all my spare time with another project, so i wanted you to know that i've seen this and will get back to it as soon as possible, though that may be at least a couple of weeks still. Btw, how should i refer to you in the acknowledgements?

bgo-eiu commented 1 year ago

That's alright, I have been quite busy as well. I ended up looking into precomposed characters recently while dealing with problems that came up in Punjabi regular expressions and URLs (in both scripts, Gurmukhi and Shahmukhi, the latter of which is essentially a manifestation of Urdu orthography).
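The comment does not say what those problems were. One plausible illustration, under the same assumption that the alternative spelling is U+06CC + U+0654, is that a pattern or URL written with one spelling will not match text written with the other:

```python
import re
from urllib.parse import quote

PRECOMPOSED = "\u0626"         # ئ
SEPARATE = "\u06CC\u0654"      # ی + combining hamza above

# A pattern written with the precomposed letter does not match the
# separate spelling, even though the two may look similar on screen.
pattern = re.compile(PRECOMPOSED)
print(pattern.search(SEPARATE))   # None

# The percent-encoded forms used in URLs differ as well.
print(quote(PRECOMPOSED))   # %D8%A6
print(quote(SEPARATE))      # %DB%8C%D9%94
```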

You can refer to my username on here, or if you prefer to use a name, Usmaan (‏عثمان‏ ‬/‬ ‎ਉਸਮਾਨ‎).

r12a commented 1 year ago

Many thanks for your comments! In fact, i had been meaning to research that specific topic for a while, and your comments were very helpful, not least in reminding me to look into it. The page should now show a bunch of changes. The situation seems much clearer now.

I'll close this. Feel free to reopen it if you spot any errors wrt YEH+HAMZA. (Open a separate issue for any other issues.)

Thanks.