TraCES-Lexicon / lexicon

Geez lexicon of the TraCES project

input and output will take alternatively fidel or transcription #5

Closed PietroLiuzzo closed 5 years ago

PietroLiuzzo commented 7 years ago

lexicon should also provide this as a standalone service, providing transcription of a given string in both directions

PietroLiuzzo commented 7 years ago

support only transcription as in BM

PietroLiuzzo commented 7 years ago

Goal

provide correct transcription alternatives

Main actor(s)

all users human and applications

Short description

the user inputs a string; the lexicon identifies whether it is in fidel (Unicode Ethiopic chart) or Latin script. if it is fidel, it will return the transcription; if it is Latin, it will assume it is a correct transcription and offer the best possible fidel
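A minimal sketch of the script detection step, assuming that "fidel" means any character in the Unicode Ethiopic block (U+1200 to U+137F) and that everything else is treated as Latin transcription. The function name `detect_script` is hypothetical, not part of the lexicon.

```python
ETHIOPIC_START = 0x1200  # first code point of the Unicode Ethiopic block
ETHIOPIC_END = 0x137F    # last code point of the Unicode Ethiopic block


def detect_script(text: str) -> str:
    """Return 'fidel' if the string contains any Ethiopic character, else 'latin'.

    Assumption: a request is never mixed-script, so the first Ethiopic
    character found is enough to decide.
    """
    for ch in text:
        if ETHIOPIC_START <= ord(ch) <= ETHIOPIC_END:
            return "fidel"
    return "latin"
```

A real service would probably also need to reject empty or mixed-script input; that is left out here.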

Preconditions

rules for transcription are available in the transcription.html BM tool. they reflect current BM usage but are implemented with a very rudimentary JavaScript technique. betamasaheft.aai.uni-hamburg.de/transcription.html

Example basic flow

A possible workflow is the one implemented in /transcription.html, where the fidel signs are stored as a series of objects and so are the vowels. the fundamental difference from that string parsing tool will be that the lexicon might check the patterns and schemas to introduce schwa in the transcription where expected and to reduplicate consonants where needed, thus achieving a correct transcription. the response will return the request in the detected script and in the other format, so that it will always have this shape:

{
fidel : 'ረዓድ',
transcription : 'raʿād'
}
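A rough sketch of the fidel to transcription direction and of the response shape, under stated assumptions. It relies on the Unicode layout, where each base sign starts a row of eight code points and the seven orders follow in sequence, and it uses the first-phase vowel values. The names `naive_transcription` and `respond` are hypothetical, and this is not the routine used in transcription.html.

```python
# The 26 base signs (first order) and their consonant values, in matching order.
BASE_SIGNS = "ሀለሐመሠረሰቀበተኀነአከወዐዘየደገጠጰጸፀፈፐ"
CONSONANTS = ["h", "l", "ḥ", "m", "ś", "r", "s", "q", "b", "t", "ḫ", "n", "ʾ",
              "k", "w", "ʿ", "z", "y", "d", "g", "ṭ", "ṗ", "ṣ", "ḍ", "f", "p"]
VOWELS = ["a", "u", "i", "ā", "e", "ǝ", "o"]  # orders 1 to 7
SIGN_TO_CONSONANT = dict(zip(BASE_SIGNS, CONSONANTS))


def naive_transcription(fidel: str) -> str:
    """Transcribe sign by sign, always writing the vowel of the order."""
    out = []
    for ch in fidel:
        order = ord(ch) % 8           # position inside the row of eight
        base = chr(ord(ch) - order)   # first-order sign of the same row
        if base not in SIGN_TO_CONSONANT or order > 6:
            raise ValueError(f"unsupported sign: {ch!r}")
        out.append(SIGN_TO_CONSONANT[base] + VOWELS[order])
    return "".join(out)


def respond(fidel: str) -> dict:
    """Build a response that always carries both formats."""
    return {"fidel": fidel, "transcription": naive_transcription(fidel)}
```

For ረዓድ this sign-by-sign reading ends in a sixth-order vowel (raʿādǝ) instead of the expected raʿād, which is exactly the gap the pattern check described above is meant to close.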

Alternative flow

the same function is used for #2, so that it is irrelevant whether the request comes in transcription or fidel; the morphological analysis will be performed in any case

PietroLiuzzo commented 7 years ago

Also @cvertan has a routine to do this

PietroLiuzzo commented 7 years ago

CV= CV or CCV or C

A consonant-vowel sign can be read as any of the above; see Bausi's summer school presentation

PietroLiuzzo commented 7 years ago

@abausi I had noted this down on the first day of the summer school, but in Agora I cannot find the slide. Perhaps you could paste here the little schema you had on one of your slides about the consonant-vowel possible combinations?

abausi commented 7 years ago
The Ethiopian script (fidäl)

The Ethiopic script is a syllabic script:

it is the only syllabic alphabet ever used by Semites;

it is classified among the syllabaries particularly widespread in India and Southern Asia;

each syllabic sign indicates sequences of

consonant+vowel or consonant+Ø;

it marks neither gemination nor Ø vowel, so a sign can stand for:

[CV], [CCV], [C], [CC]

26 signs x 7 vowel orders + 4 labiovelars x 5 orders + numerals
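To make the [CV], [CCV], [C], [CC] ambiguity concrete, here is a small sketch that enumerates the possible readings of a word given as (consonant, vowel) pairs. It assumes that the vowelless readings [C] and [CC] are only available for sixth-order signs (vowel ǝ). The function names are hypothetical.

```python
from itertools import product

SCHWA = "ǝ"  # vowel of the sixth order, which may also stand for no vowel


def sign_readings(consonant: str, vowel: str) -> list:
    """List the readings of one sign: CV and CCV, plus C and CC for sixth order."""
    readings = [consonant + vowel, consonant * 2 + vowel]
    if vowel == SCHWA:
        readings += [consonant, consonant * 2]
    return readings


def word_readings(signs: list) -> list:
    """Combine the readings of every sign into all candidate transcriptions."""
    per_sign = [sign_readings(c, v) for c, v in signs]
    return ["".join(parts) for parts in product(*per_sign)]
```

Called with `[("r", "a"), ("ʿ", "ā"), ("d", "ǝ")]` this yields 2 x 2 x 4 = 16 candidates, one of which is raʿād; choosing among them is what requires lexical patterns.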

Consonants:

/h/, /l/, /ḥ/, /m/, /ś/, /r/, /s/, /q/, /b/, /t/, /ḫ/, /n/, /ʾ/, /k/, /w/, /ʿ/, /z/, /y/, /d/, /g/, /ṭ/, /ṗ/, /ṣ/, /ḍ/ṣ́/, /f/, /p/

ሀለሐመሠረሰቀበተኀነአከወዐዘየደገጠጰጸፀፈፐ

Laryng(e)als (L): /h/, /ḥ/, /ḫ/; /ʾ/, /ʿ/
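The sign count given earlier (26 signs x 7 orders + 4 labiovelars x 5 orders) can be checked against Unicode with a short sketch. It assumes the standard layout of the Ethiopic block: the seven orders of a sign are consecutive code points, and the five labiovelar forms sit at offsets 0, 2, 3, 4 and 5 from their base. Numerals are left out.

```python
import unicodedata

BASE_SIGNS = "ሀለሐመሠረሰቀበተኀነአከወዐዘየደገጠጰጸፀፈፐ"
# Labiovelar bases: qʷ (U+1248), ḫʷ (U+1288), kʷ (U+12B0), gʷ (U+1310).
LABIOVELAR_BASES = "\u1248\u1288\u12b0\u1310"
LABIOVELAR_OFFSETS = (0, 2, 3, 4, 5)  # offset 1 is unassigned in each row


def syllabary() -> list:
    """Return every syllabic sign: 26 x 7 plain signs, then 4 x 5 labiovelars."""
    plain = [chr(ord(b) + i) for b in BASE_SIGNS for i in range(7)]
    labio = [chr(ord(b) + o) for b in LABIOVELAR_BASES for o in LABIOVELAR_OFFSETS]
    return plain + labio


signs = syllabary()
print(len(signs))                   # 26 * 7 + 4 * 5
print(unicodedata.name(signs[0]))   # Unicode name of the first sign
```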

Vowels:

I phase (length opposition): /Ca Cu Ci Cā Ce Cǝ Co/

II phase (qualitative opposition): /Cä Cu Ci Ca Ce Cǝ Co/

(with laryng(e)als) /La Lu Li La Le Lǝ Lo/
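The three vowel rows can be written as lookup tables. The sketch below assumes that the laryngeal row is a sub-case of the second phase, which is how the slide seems to arrange it; the function name `syllable` and its parameters are hypothetical.

```python
LARYNGEALS = {"h", "ḥ", "ḫ", "ʾ", "ʿ"}

PHASE_I = ("a", "u", "i", "ā", "e", "ǝ", "o")             # length opposition
PHASE_II = ("ä", "u", "i", "a", "e", "ǝ", "o")            # qualitative opposition
PHASE_II_LARYNGEAL = ("a", "u", "i", "a", "e", "ǝ", "o")  # after a laryngeal


def syllable(consonant: str, order: int, phase: int = 2) -> str:
    """Return consonant + vowel for an order (1 to 7) in the given phase."""
    if not 1 <= order <= 7:
        raise ValueError("order must be between 1 and 7")
    if phase == 1:
        table = PHASE_I
    elif consonant in LARYNGEALS:
        table = PHASE_II_LARYNGEAL
    else:
        table = PHASE_II
    return consonant + table[order - 1]
```

With these tables a first-order and a fourth-order laryngeal sign come out identical in the second phase, for example `syllable("h", 1) == syllable("h", 4)`.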

Mergings:

/h/, /ḥ/, /ḫ/ > [h] = <h, ḥ, ḫ>; /ʾ/, /ʿ/ > [ʾ] = <ʾ, ʿ>: La, Lā > [La] = <La, Lā>

/s/, /ś/ > [s] = <s, ś>; /ṣ/, /ḍ/ṣ́/ > [ṣ] = <ṣ, ḍ/ṣ́>
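One possible use of these mergings, offered only as an assumption, is to fold a transcription into a pronunciation-based key so that spellings differing only in merged sounds compare equal. The sketch assumes the half rings are U+02BE (ʾ) and U+02BF (ʿ) and that ṣ́ is encoded as ṣ followed by a combining acute; `fold_mergings` is a hypothetical name.

```python
import re
import unicodedata

ALEF = "\u02be"  # ʾ
AYIN = "\u02bf"  # ʿ

# Graphemes whose sounds merged, mapped to the surviving pronunciation.
FOLD = str.maketrans({"ḥ": "h", "ḫ": "h", AYIN: ALEF, "ś": "s", "ḍ": "ṣ"})

# A laryngeal followed by long ā: La and Lā are pronounced alike.
LARYNGEAL_LONG_A = re.compile("([hḥḫ" + ALEF + AYIN + "])\u0101")


def fold_mergings(transcription: str) -> str:
    """Collapse merged laryngeals, sibilants and La/Lā into one spelling."""
    text = unicodedata.normalize("NFC", transcription)
    text = text.replace("\u1e63\u0301", "\u1e63")  # ṣ́ -> ṣ
    text = LARYNGEAL_LONG_A.sub(r"\1a", text)      # Lā -> La
    return text.translate(FOLD)
```

Under these rules raʿād folds to raʾad. The fold is lossy, so it would suit matching rather than output.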

(Sent by email in reply to Pietro Liuzzo's comment of 01.11.2017, 21:09, quoted above.)