skurzinz closed 3 years ago
I was just looking at the regular Turkish entry the other day; it appears to have a couple of issues, notably some historical glyphs listed that aren't part of the current orthography.
I'm not much of an expert on Ottoman, but the gist of this looks right.
I guess it may be an option to just import/inherit the standard Arabic (arb) instead of literally including the glyphs. I did not find an example of this in the database file with multiple orthographies.
NB: historically the Armenian alphabet was also sometimes used to write ota, but as I am completely lost on this I won't even try to include it.
There are other transcription alphabets for ota available as well, but the glyph coverage should not differ much. I just went with IJMES as I am using it for an edition project through https://github.com/QHOD/ota-keyboard.
Thanks for the contribution, it is added for the next release. I've tweaked the data so the Arabic is inherited from arb for cleaner representation. Feel free to submit an addition to CONTRIBUTORS.txt if you want to get listed (persons, not organizations) 👍
@skurzinz Sorry for backtracking after the merge; we're just reviewing marks in orthographies and this one popped up. I see you've listed combining macron below (U+0331) and combining minus sign below (U+0320), but do not use them in any of the characters of base.
I see that transliteration on, e.g., Wikipedia has s with macron below.
Would it make sense to add those macron below characters? These being dropped might also have been a result of applying hyperglot-save and "pruning" the data (which we are changing). And is the inclusion of U+0320 a mistake or what is it used for?
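For reference, the kind of check described above (marks that are listed but never used by any base character) can be sketched in a few lines of Python. This is only an illustration: `unused_marks` is a hypothetical helper, and the `base` and `marks` lists are made-up sample data, not the actual ota entry or hyperglot's own validation code.

```python
import unicodedata


def unused_marks(base_chars, marks):
    """Return the marks that no base character contains after NFD decomposition."""
    used = set()
    for ch in base_chars:
        # NFD splits precomposed letters into base letter + combining marks.
        for cp in unicodedata.normalize("NFD", ch):
            if unicodedata.combining(cp):  # non-zero class means combining mark
                used.add(cp)
    return [m for m in marks if m not in used]


# Hypothetical sample data, not the real orthography entry.
base = ["a", "s", "z", "\u1e25"]        # U+1E25 decomposes to h + U+0323
marks = ["\u0323", "\u0331", "\u0320"]  # dot below, macron below, minus sign below

for m in unused_marks(base, marks):
    print(f"U+{ord(m):04X} {unicodedata.name(m)}")
```

With this sample data, U+0331 and U+0320 are reported as unused, which mirrors the situation in the comment above.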
@kontur thanks for notifying me of this error. U+0320 is a mistake on my side. Likely I copy-pasted it from somewhere without noticing, writing directly in the GitHub editor and not in some hex-aware environment :) Another error is not including S/s and Z/z with macron below in the base character list. Both errors were already present in my original PR.
The official transliteration table of IJMES is available as a PDF only. It would also be applicable to Arabic, (Modern) Turkish (if written in arb) and Persian/Farsi transcription.
S/s with macron below is not available as a precomposed character in Unicode. I'll see if I can fix my errors and submit a new PR.
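A quick way to double-check the claim about precomposed forms is to see whether NFC normalization folds a letter plus U+0331 into a single code point. The sketch below uses only Python's standard `unicodedata` module; `has_precomposed` is a hypothetical helper name, and the result depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW


def has_precomposed(base_letter, mark=MACRON_BELOW):
    """True if NFC folds base_letter + mark into a single code point."""
    composed = unicodedata.normalize("NFC", base_letter + mark)
    return len(composed) == 1


# Check the four letters discussed in this thread.
results = {letter: has_precomposed(letter) for letter in "SsZz"}
for letter, found in results.items():
    print(letter, "precomposed" if found else "needs base letter + U+0331")
```

This should report no precomposed form for S/s, while Z/z compose to U+1E94/U+1E95 (Z with line below), so S/s need the combining mark either way.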
For the record: For finding the culprit I was successful in using VSCode and the https://github.com/medo64/code-point/ extension. Atom and https://atom.io/packages/character-table did not work for the purpose.
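If no editor extension is at hand, a small script can do a similar job of exposing stray code points. This is a minimal sketch, not what the extension does internally: `describe_non_ascii` is a hypothetical helper and the sample string is invented to reproduce the kind of mix-up discussed here.

```python
import unicodedata


def describe_non_ascii(text):
    """List (index, code point, Unicode name) for every non-ASCII character."""
    rows = []
    for index, ch in enumerate(text):
        if ord(ch) > 0x7F:
            name = unicodedata.name(ch, "<unnamed>")
            rows.append((index, f"U+{ord(ch):04X}", name))
    return rows


# Invented sample: one intended mark (U+0331) and one accidental one (U+0320).
sample = "s\u0331 z\u0320"
for index, code, name in describe_non_ascii(sample):
    print(index, code, name)
```

Two marks that look nearly identical in most fonts show up here with different code points and names, which makes a copy-paste slip easy to spot.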
I did not run any tests as I don't currently have a working Python environment at hand. Please check before merging.