interscript / maps

Script conversion maps for Interscript

alalc-ara-Arab-1997 issues #104

Open AhMohsen46 opened 3 years ago

AhMohsen46 commented 3 years ago

In Rule 4, أُولَائِكَ is transliterated as ulā’ika. Shouldn't it be ūlā’ika, based on the "damma" before the "waw"?

image

Also, in the screenshot below from the doc, كُتُب إقتَنَتهَا is transliterated as kutub iqtanatʹhā, and similarly مَعرِفَة مَا يَجِبُ لَهُم as ma‘rifat mā yajibu la-hum.

Is this based on Rule 21? If yes, is there a set of letter combinations that should be separated with a prime (ʹ)?

image
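To make the question concrete, here is a minimal sketch of one possible reading: the prime separates two distinct consonants whose romanizations would otherwise look like a digraph (for example t + h read as "th"). The digraph set and the helper name `join_with_prime` are assumptions for illustration; they are not taken from the map or from the Interscript code.

```python
# Sketch only: DIGRAPHS is an assumed set, and join_with_prime is a
# hypothetical helper, not part of Interscript.
PRIME = "\u02b9"  # MODIFIER LETTER PRIME
DIGRAPHS = {"th", "kh", "dh", "sh", "gh"}


def join_with_prime(units):
    """Join per-letter romanizations, inserting a prime between two
    separate units that together would spell a digraph."""
    out = []
    for unit in units:
        # "t" followed by "h" would read as "th", so separate them.
        if out and out[-1] + unit in DIGRAPHS:
            out.append(PRIME)
        out.append(unit)
    return "".join(out)


# Each list item is the romanization of one Arabic letter or vowel mark.
print(join_with_prime(["i", "q", "t", "a", "n", "a", "t", "h", "ā"]))
```

Under this reading, a real digraph letter such as ث arrives as the single unit "th" and is never split; only two adjacent single-letter units trigger the prime.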

Another issue: إلَى يَومِنَا هَذَا is transliterated as ilá yawminā hādhā.

Is it a rule that هَ is transliterated as hā instead of ha? Or is it following Rule 19? If yes, is there a set of words that follow it, or a general rule for identifying words written defectively?

AhMohsen46 commented 3 years ago

Rule 15 is more about grammar (pronouns, pronominal suffixes, and demonstratives), so it might need some ML. However, I can transliterate waw al-atf (similar to "and" in English), so it will work for examples like this

but it may collide with some examples like this
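One way to reduce such collisions, sketched below, is to treat a leading waw as the conjunction only when the rest of the word is a known word and the whole word is not. This assumes unvocalized input and a word list (`lexicon`) that the system would have to supply; the function name and the sample words are hypothetical.

```python
# Sketch only: split_conjunction and the sample lexicon are hypothetical.
WAW = "\u0648"  # ARABIC LETTER WAW


def split_conjunction(word, lexicon):
    """Return (prefix, stem). The prefix is "wa-" when the leading waw
    looks like the conjunction, otherwise it is empty."""
    if word.startswith(WAW) and word not in lexicon and word[1:] in lexicon:
        return "wa-", word[1:]
    return "", word


lexicon = {
    "\u0648\u0644\u062f",  # ولد (walad): the waw is part of the word
    "\u0643\u062a\u0628",  # كتب
}

# وكتب is not in the lexicon but كتب is, so the waw is split off.
print(split_conjunction("\u0648\u0643\u062a\u0628", lexicon))
# ولد is itself in the lexicon, so it is left untouched.
print(split_conjunction("\u0648\u0644\u062f", lexicon))
```

A lookup like this cannot resolve words that are valid both with and without the waw; those cases would still need context or ML.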

For Rule 18 (capitalization), our system is able to capitalize names and definite articles based on this rule, but not other types of words, which I assume may need ML processing. If this sounds okay, I may apply it as if it is dealing with names; for geolocations, I can apply it as in the other systems.

Rule 19 definitely needs ML; it is very hard to transliterate.

Rule 21 might also need ML.

Rule 23 through the end are titled "Examples of Irregular Arabic Orthography".

So one option is to make a dictionary for them, at least for words like Allāh, since it is quite common in street names and in composite names like Abdullah and similar names.
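A rough sketch of the dictionary idea: look the word up in an exception table first, and fall back to the rule-based map otherwise. The table contents, the function names, and the `rule_based` callback are placeholders for illustration, not the actual Interscript implementation. Diacritics are stripped before lookup so that vocalized and unvocalized spellings hit the same entry.

```python
import unicodedata

# Sketch only: IRREGULAR and transliterate are hypothetical names.
IRREGULAR = {
    "\u0627\u0644\u0644\u0647": "Allāh",  # الله
}


def strip_marks(word):
    """Drop combining marks (harakat, shadda) so lookups ignore vocalization."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))


def transliterate(word, rule_based):
    """Use the exception table first, then the supplied rule-based mapper."""
    key = strip_marks(word)
    if key in IRREGULAR:
        return IRREGULAR[key]
    return rule_based(word)


# The lambda stands in for the real rule-based map.
print(transliterate("\u0627\u0644\u0644\u0647", lambda w: "<rules>"))
```

Composite names would need extra handling on top of this, for example matching the irregular word as the final component of a longer token.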

AhMohsen46 commented 3 years ago

https://github.com/interscript/interscript/pull/402/files