Closed munzirtaha closed 1 year ago
Indeed; this is how most text-based mushafs chose to write it though. I wrote it the latter way (see https://github.com/quranacademy/quran-text).
Thanks for the quick response.
I checked khaledhosny/quran-data and it's correct there. I also checked the tanzil.net mushaf and it's correct. The qurancomplex mushaf has many encoding issues and doesn't conform to the Unicode standard, so don't use it as a reference.
So, there are two issues here: the issue with the text, and the issue with the font, which currently cannot render ٱلۡـَٔانَ correctly.
This is the same as tanzil: ٱلْـَٰٔنَ. See https://nuqayah.com/digitalkhatt.html
Edit: I see you're referring to src/qurantext/quran.cpp. Never mind then. Not sure why that's different. In any case, I use tanzil's text with this font.
The font renders the word ٱلْأٓنَ (Surah Al-Jinn, Ayah 9) as follows
and the word ٱلۡـَٔانَ as follows
If I understand the problem correctly, the issue is that ٱلْأٓنَ must be rendered like the second image above and therefore the first word should be represented by الْآنَ and not ٱلْأٓنَ.
However, I checked with tanzil.net and it doesn't encode the two words as you suggested (you have to choose Uthmani text).
In both cases it uses TATWEEL with HAMZA ABOVE instead of MADDAH ABOVE. If I have to follow tanzil.net, the second form can't be rendered anymore with the current encoding ٱلۡـَٔانَ and will be rendered like the first image (That's probably how it should be).
https://nuqayah.com/digitalkhatt.html still has the same issue and it's not like tanzil. Try to copy the word from that page and apply Courier New font e.g. and see the difference.
Just to make sure we are testing the same font: I am currently using https://digitalkhatt.org/assets/fonts/digitalkhatt.otf, which I couldn't find on GitHub.
https://github.com/DigitalKhatt/madinafont/blob/main/digitalkhatt.otf is not working in LibreOffice and cannot be opened with fontforge. I expected this to be the OpenType version since the variable version is at visualmetafont/blob/master/files/digitalkhatt-cff2.otf
Edit: LibreOffice on Linux now supports CFF2, so I replaced the unmaintained version.
https://digitalkhatt.org/assets/fonts/digitalkhatt.otf is the old OpenType CFF1 version and unfortunately it is no longer maintained. The latest version is digitalkhatt.otf from the current repository in OpenType CFF2 format.
It seems that this format is not supported by LibreOffice or Windows systems, but there are probably tools that can convert a CFF2 format to supported formats.
If I understand the problem correctly, the issue is that ٱلْأٓنَ must be rendered like the second image above
No, ٱلْأٓنَ contains ALEF WITH HAMZA ABOVE + MADDAH ABOVE which is simply wrong and shouldn't be used in the Quran text. The font should render it as it is without hiding any of them.
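To see which code points a given spelling actually contains, the characters can be listed by name. This is only a minimal sketch: the escaped sequence below is my assumption of how the spelling under discussion is stored, and `describe` is a hypothetical helper, not something from the repository.

```python
import unicodedata

def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

# Assumed code point sequence for the spelling that combines
# ALEF WITH HAMZA ABOVE (U+0623) with MADDAH ABOVE (U+0653).
word = "\u0671\u0644\u0652\u0623\u0653\u0646\u064E"

for entry in describe(word):
    print(entry)
```

Pasting a word copied from any of the texts mentioned here into `describe` shows whether it uses the ALEF WITH HAMZA ABOVE + MADDAH ABOVE pair or the TATWEEL + HAMZA ABOVE convention.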
and therefore the first word should be represented by الْآنَ and not ٱلْأٓنَ.
You can choose to represent it this way which is semantically correct or you can choose to follow tanzil encoding
However, I checked with tanzil.net and it doesn't encode the two words as you suggested (you have to choose Uthmani text).
In both cases it uses TATWEEL with HAMZA ABOVE instead of MADDAH ABOVE.
Correct, but the letters after the HAMZA make the difference.
If I have to follow tanzil.net, the second form can't be rendered anymore with the current encoding ٱلۡـَٔانَ and will be rendered like the first image (That's probably how it should be).
Why? You can follow the tanzil convention and differentiate them by the ALEF or SUPERSCRIPT ALEF after them. The first form is: TATWEEL + HAMZA ABOVE + ALEF. The second form is: TATWEEL + HAMZA ABOVE + TATWEEL + SUPERSCRIPT ALEF.
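The distinction described above can be sketched as a small classifier. This is a rough illustration, not the font's or tanzil's logic: `classify` and `marks_after` are hypothetical names, and because normalization can reorder combining marks, the sketch inspects the whole run of marks after the TATWEEL rather than assuming a fixed order.

```python
import unicodedata

TATWEEL, HAMZA_ABOVE = "\u0640", "\u0654"
ALEF, SUPERSCRIPT_ALEF = "\u0627", "\u0670"

def marks_after(text: str, i: int) -> str:
    """Return the run of combining marks that follows index i."""
    j = i + 1
    while j < len(text) and unicodedata.combining(text[j]) != 0:
        j += 1
    return text[i + 1:j]

def classify(word: str) -> str:
    """Classify a word by what follows its TATWEEL + HAMZA ABOVE carrier."""
    for i, ch in enumerate(word):
        if ch != TATWEEL:
            continue
        marks = marks_after(word, i)
        if HAMZA_ABOVE not in marks:
            continue
        # SUPERSCRIPT ALEF may sit in the same mark run as the hamza.
        if SUPERSCRIPT_ALEF in marks:
            return "superscript alef"
        nxt = i + 1 + len(marks)
        if nxt < len(word):
            if word[nxt] == ALEF:
                return "full alef"
            # Or it may be carried by a second TATWEEL.
            if word[nxt] == TATWEEL and SUPERSCRIPT_ALEF in marks_after(word, nxt):
                return "superscript alef"
        return "other"
    return "no TATWEEL + HAMZA ABOVE"

print(classify("\u0640\u0654\u0627"))        # TATWEEL + HAMZA ABOVE + ALEF
print(classify("\u0640\u0654\u0640\u0670"))  # ... + TATWEEL + SUPERSCRIPT ALEF
```

A FATHA between the hamza and the ALEF is treated as part of the mark run, so a sequence such as TATWEEL + HAMZA ABOVE + FATHA + ALEF is still classified as the full-alef form.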
https://nuqayah.com/digitalkhatt.html still has the same issue and it's not like tanzil. Try to copy the word from that page and apply Courier New font e.g. and see the difference.
Looks the same to me. I copied the ayaat from tanzil, so it must be the same. I have Uthmani selected in tanzil.
@mustafa0x The word is written in two shapes depending on the Surah, e.g. Al-Jinn:9 and Al-Baqara:71. They look different in Mushaf Al-Madina, so they are encoded in two different ways so as not to look the same. Copy the two ayaat and try to see the difference.
The second form is: TATWEEL + HAMZA ABOVE + TATWEEL + SUPERSCRIPT ALEF
Then I have to use TATWEEL + HAMZA ABOVE + TATWEEL + SUPERSCRIPT ALEF to render the second form below. Now it looks like the problem has gone from mis-encoding the first form above to mis-encoding the second form.
Actually, you should use TATWEEL + HAMZA ABOVE + FATHA + ALEF to render this form
I looked a bit into why we can't encode this form with HAMZA (U+0621) + FATHA + ALEF and use TATWEEL + HAMZA ABOVE + FATHA + ALEF for the other form below (even though this is not the correct form).
The reason is that HAMZA is non-joining, so it breaks the connection between two joining letters (i.e. الءان).
There is an old issue proposing to change the property of HAMZA to be chairless (inline, amphibious) or to add a new chairless HAMZA character, but nothing has been done so far. So the convention seems to be to use TATWEEL + HAMZA ABOVE to encode a chairless HAMZA.
However, I still wonder why, in the case of the Arabic language, HAMZA was not represented as chairless (since that is always the case for the Arabic language), so that we would not have to use the combination TATWEEL + HAMZA ABOVE.
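The joining behaviour behind this convention can be illustrated with a toy model. This is a simplified sketch, not a shaping engine: the joining types are hand-copied for a few characters as I understand them from Unicode's ArabicShaping.txt, every combining mark is assumed to be transparent, and `joins_to_next` is a hypothetical helper.

```python
import unicodedata

# Joining types for the few characters used here (hand-copied, incomplete).
JOINING_TYPE = {
    "\u0621": "U",  # HAMZA: non-joining
    "\u0627": "R",  # ALEF: joins to the preceding letter only
    "\u0644": "D",  # LAM: dual-joining
    "\u0646": "D",  # NOON: dual-joining
    "\u0640": "C",  # TATWEEL: join-causing
}

def joins_to_next(text: str, i: int) -> bool:
    """True if the character at index i connects to the next non-mark character."""
    if JOINING_TYPE.get(text[i]) not in ("D", "C"):
        return False
    for ch in text[i + 1:]:
        if unicodedata.combining(ch) != 0:
            continue  # assumption: combining marks are transparent to joining
        return JOINING_TYPE.get(ch) in ("D", "R", "C")
    return False

with_hamza = "\u0644\u0621\u064E\u0627\u0646"          # LAM + HAMZA + FATHA + ALEF + NOON
with_tatweel = "\u0644\u0640\u0654\u064E\u0627\u0646"  # LAM + TATWEEL + HAMZA ABOVE + FATHA + ALEF + NOON

print(joins_to_next(with_hamza, 0))    # LAM does not connect across HAMZA
print(joins_to_next(with_tatweel, 0))  # LAM connects to the TATWEEL carrier
print(joins_to_next(with_tatweel, 1))  # the carrier connects on to ALEF
```

In this model the LAM stays connected only when the hamza is carried by a TATWEEL, which matches the reason given above for preferring TATWEEL + HAMZA ABOVE over U+0621.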
Anyway, for now I'll change the encoding as requested (and sacrifice the second form) using the TATWEEL + HAMZA ABOVE convention like other Quran texts.
I wonder, why not just encode it as الآن? It's the normal way of writing it, it's not used elsewhere in the Mushaf, and any other font will fall back to a correct form. You can even add a style set later to switch between the two forms if someone wants to use your font for non-Quranic text.
Thanks so much for the fix, but you missed fixing the text in src/qurantext/quran.cpp.
I had not pushed the changes. Done now.
Great project indeed! Thanks.
The text in src/qurantext/quran.cpp has ٱلْأٓنَ, which renders properly, but the encoding is wrong: it contains both a MADDAH ABOVE and an ALEF WITH HAMZA ABOVE together. It could be written as ٱلْـَٔانَ.
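A text file can be checked for this pair with a short scan. This is a minimal sketch under the assumption that the problematic spelling is stored as U+0623 immediately followed by U+0653; `find_bad_pairs` is a hypothetical helper, and the sample string stands in for real file contents.

```python
import re

# ALEF WITH HAMZA ABOVE (U+0623) immediately followed by MADDAH ABOVE (U+0653).
BAD_PAIR = re.compile("\u0623\u0653")

def find_bad_pairs(text: str) -> list[tuple[int, int]]:
    """Return (line_number, column) for each occurrence, both 1-based."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in BAD_PAIR.finditer(line):
            hits.append((lineno, match.start() + 1))
    return hits

# Line 1 uses the problematic pair; line 2 uses TATWEEL + HAMZA ABOVE instead.
sample = (
    "\u0671\u0644\u0652\u0623\u0653\u0646\u064E\n"
    "\u0671\u0644\u0652\u0640\u0654\u064E\u0627\u0646\u064E"
)
print(find_bad_pairs(sample))
```

Columns count code points, not rendered glyphs, so the reported position points at the ALEF WITH HAMZA ABOVE itself.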