Open Jargonautika opened 2 years ago
After using this more, I found that the same is actually true of a number of different pairs:
@Jargonautika
If this dictionary were to be used in its reverse form
We use this dictionary to predict the pronunciation of a given word. If I am not mistaken, you are looking for a reverse function. Persian has many homophones: words that have different meanings but the same pronunciation. For example:
In this repository we focused on pronunciation. However, there is another project that tries to latinize Persian: Alefbaye 2om. You may also want to check it out.
Wishing you happiness and good health.
Obvious care has been taken to distinguish the two pronunciations of ه (hā-ye do-češm): /h/ (word-initially and -medially) and /e/ (word-finally), as in:
Examples like the one above are useful for determining when the grapheme should be pronounced as [h] or [e]. However, there does not seem to be a distinction anywhere in the dictionary between the [h] pronunciation of ه (hā-ye do-češm) and the [h] pronunciation of ح (ḥâ-ye ḥotti / ḥâ-ye jimi). Consider the following examples:
If this dictionary were to be used in its reverse form, [1] could be reconstituted from "h A d e s e" as either "هادِثه" or "حادِثه". This is certainly a niche issue, but I am trying to diacritize non-diacritized text. To reconstitute my original text with vowels included under your dictionary's scheme, I need to know which Farsi character to convert back to in the end. There are no instances in either dictionary where ḥâ-ye jimi and hā-ye do-češm (pronounced as [h]) appear in the same word, so a simple string replace should suffice. I suggest replacing [h] with [H] for ḥâ-ye jimi to make this dictionary reversible.
It may well be that there are no words in Farsi which contain both ح and ه, but this would solve the edge use-case I describe here.
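To make the suggestion concrete, here is a rough sketch of the string replace I have in mind. It is not code from this repository: the function name `mark_he_jimi` is hypothetical, and I am assuming each entry pairs a Persian word with a space-separated phoneme string like "h A d e s e". It relies on the observation above that no dictionary entry contains both ḥâ-ye jimi and an [h]-pronounced hā-ye do-češm.

```python
# Hypothetical helper, not part of the repository.
HE_JIMI = "\u062d"  # ح (ḥâ-ye jimi)


def mark_he_jimi(word: str, phonemes: str) -> str:
    """Rewrite 'h' as 'H' in the phoneme string when the word is spelled with ح.

    Assumes space-separated phoneme symbols, and that a word containing ح
    has no ه pronounced as [h] (true of the entries I checked).
    """
    if HE_JIMI not in word:
        # Any [h] here comes from ه, so the phonemes are left untouched.
        return phonemes
    # Replace only whole 'h' symbols, never substrings of other symbols.
    return " ".join("H" if p == "h" else p for p in phonemes.split())
```

With this, the ح spelling of [1] would map to "H A d e s e" while the ه spelling would stay "h A d e s e", so the two could be told apart when converting back.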
Thanks for your work!