Closed PietroLiuzzo closed 5 years ago
@vitagrazia81 has lists of possible word constructions for nouns and verbs, which might need to be developed further.
Make the distinction for suffix pronouns clear at the analysis level.
Yes, it needs further development and completion. At the moment the list does not cover all the nominal and verbal forms. For the nouns, there are only the forms with three "normal" radicals; for the verbs, the forms of verbs II and III W and of verbs with four or more radicals are still missing.
No, sorry: for the nouns we also have the forms with I, II, III laryngeal.
Use case: the user sends a request with a string query parameter. The response offers a list of possible morphological matches of the pattern, organized by root, as in the Greek Word Study Tool http://www.perseus.tufts.edu/hopper/morph?l=gignomai&la=greek. The string can be given either in fidel or in transcription (#5). It is then analyzed by the lexicon, which returns its root and the associated pattern. The pattern is matched against the tables provided to find the possible morphological definitions, and the root components are matched against the possible roots relevant for the matched morphological patterns to provide the results.
Retrieve morphological information on single tokens for a string request.
All users, both humans and applications.
The user sends a request with a string query parameter in fidel or transcription (#5). The response offers a list of possible morphological matches of the pattern, organized by root. One main user will be the Dillmann app (#3).
The Greek Word Study Tool: http://www.perseus.tufts.edu/hopper/morph?l=gignomai&la=greek
There is already the root tool by @cvertan, which can be used as a starting point for this. There are also forms already compiled with this root tool, which could be used for the intelligence in the lexicon app. The lexicon needs to know schemas and patterns; tables for these will be provided by @vitagrazia81. The Dillmann dictionary API should respond to a request for a lemma with the id of the lemma, as is already possible with a query like http://betamasaheft.aai.uni-hamburg.de/api/Dillmann/search/form?q=ቄጵርስስ which returns the id of the lemma.
The dictionary app will use the lexicon to provide the morphological matches for any searched string as well: it first sends the queried string to the lexicon and then offers to search the dictionary for one or another of the results (#7).
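A fidel query string has to be percent-encoded before it goes into the URL. The following is a minimal sketch of building the query quoted above; only the endpoint and the `q` parameter come from this thread, while the function name is hypothetical and the response format is not assumed here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as quoted in this issue.
DILLMANN_FORM_SEARCH = (
    "http://betamasaheft.aai.uni-hamburg.de/api/Dillmann/search/form"
)


def form_search_url(form: str) -> str:
    """Build the search URL for a form, percent-encoding it as UTF-8.

    Hypothetical helper: it only builds the URL and sends no request.
    """
    return f"{DILLMANN_FORM_SEARCH}?{urlencode({'q': form})}"


url = form_search_url("ቄጵርስስ")
# The encoded URL is plain ASCII and decodes back to the original form.
decoded = parse_qs(urlsplit(url).query)["q"][0]
```

Building the URL with `urlencode` keeps the encoding step in one place, so callers can pass either fidel or transcription unchanged.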
http://elvira.lllf.uam.es/jabalin/analizarForma.php
No match is found for the entered string. The lexicon returns an error stating the problem encountered (e.g. cannot parse word, schema not found, pattern not found).
A list of matching roots, schemas and patterns is returned.
Where does the tokenization happen? What are the rules for the tokenization?
ነገርክናኒ / nagarkǝnāni
Automatic transliteration. It should figure out that this is a Perfective second person plural feminine with an object suffix pronoun in the first person singular. It will look in the table of Perfective with Object Suffixes and find out that ni can be an object suffix. Warnings: the table provided contains the possibilities, not the patterns, which the engine will have to elaborate into a table. This is a further requirement. A table with
nagarkǝnā - ni will have to be completed with a list of patterns like
1a2a304ǝ5ā6i, where each pair represents a syllable: the first element of a pair is the consecutive numbering (starting with 1) and the second is the vowel in the schema/pattern (0 marking a consonant without a vowel, as in 30).
The input will be processed to this pattern as well and matched one to one. In most cases you will be lucky enough to have the pattern possibility already expressed. A list of unique patterns should probably also be compiled for quick lookup.
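The numbering described here can be sketched as follows. This is only an illustration under stated assumptions: the input is already transliterated, every consonant is a single character, the vowel inventory is `a ā e ǝ i o u`, and a consonant not followed by a vowel gets `0`. The function names are hypothetical and not part of the existing root tool.

```python
import unicodedata

# Assumed vowel inventory of the transcription.
VOWELS = frozenset(unicodedata.normalize("NFC", "aāeǝiou"))


def vowel_pattern(translit: str) -> str:
    """Turn a transliterated form into a pattern such as 1a2a304ǝ5ā6i."""
    text = unicodedata.normalize("NFC", translit)
    # Ignore segmentation marks such as hyphens and spaces.
    text = "".join(ch for ch in text if ch.isalpha())
    parts = []
    position = 0
    i = 0
    while i < len(text):
        if text[i] in VOWELS:
            raise ValueError(f"vowel without a preceding consonant in {translit!r}")
        position += 1
        vowel = "0"  # consonant without a vowel
        if i + 1 < len(text) and text[i + 1] in VOWELS:
            vowel = text[i + 1]
            i += 1
        parts.append(f"{position}{vowel}")
        i += 1
    return "".join(parts)


def index_by_pattern(forms):
    """Group forms by pattern so a pattern can be looked up quickly."""
    index = {}
    for form in forms:
        index.setdefault(vowel_pattern(form), []).append(form)
    return index
```

Under these assumptions `vowel_pattern("nagarkǝnāni")` gives `1a2a304ǝ5ā6i`, and `vowel_pattern("tǝnaggǝrāni")` gives `1ǝ2a304ǝ5ā6i`, the pattern quoted for the second user story. Positions above 9 would make the string ambiguous to read back, so a real implementation might prefer a list of pairs.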
Perfective second person plural feminine with an object suffix pronoun in the first person singular: "You (F, plural) told me"
root | stem |
---|---|
nagara | ነገር |
a. nagarkǝnā - ni (note the bold)
and one with a view splitting all the elements, so also the subject pronoun
b. nagar - kǝnā - ni
where for each part there is a link to the information related to that pattern: nagara (perfect) = table of the verbs; kǝnā (subject pronoun) = table of subject pronouns (conjugation of the perfect); ni (object pronoun) = table of the object pronouns
ትነግራኒ/ tǝnaggǝrāni 1ǝ2a304ǝ5ā6i
Automatic transliteration. It should figure out that this is the Imperfective second person plural feminine with an object suffix in the first person singular.
Imperfective second person plural feminine with an object suffix in the first person singular: "You (F, plural) will tell me" or "You (F, plural) tell me"
root | stem |
---|---|
nagara | ነግር |
Then it will provide a split view of the term in transliteration, as in the Imperfective with Object Suffixes table, thus:
a. tǝnaggǝrā - ni
and one with a view splitting all the elements, so also the subject pronoun
b. tǝ- naggǝr- ā - ni
where for each part there is a link to the information related to that pattern: naggǝr (imperfective) = table of the verbs; tǝ- + -ā (subject pronoun) = table of subject pronouns (Imperfective in the Indicative and Jussive Moods); ni (object pronoun) = table of the object pronouns
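The split view for this user story could be produced by table lookup, roughly as sketched here. The two tables are toy stand-ins holding only the affixes named in this user story; the real tables are the ones to be provided, and all names in the code are hypothetical.

```python
# Toy tables: only the entries mentioned in this user story.
OBJECT_SUFFIXES = {"ni": "object pronoun, 1st person singular"}
IMPERFECTIVE_SUBJECT_AFFIXES = {
    ("tǝ", "ā"): "subject pronoun, 2nd person plural feminine",
}


def split_imperfective(form: str):
    """Return every prefix/stem/subject/object split the toy tables allow."""
    results = []
    for obj, obj_label in OBJECT_SUFFIXES.items():
        if not form.endswith(obj):
            continue
        body = form[: -len(obj)]
        for (prefix, suffix), subj_label in IMPERFECTIVE_SUBJECT_AFFIXES.items():
            # The stem must be non-empty once both affixes are removed.
            if len(body) <= len(prefix) + len(suffix):
                continue
            if body.startswith(prefix) and body.endswith(suffix):
                results.append(
                    {
                        "prefix": prefix,
                        "stem": body[len(prefix) : len(body) - len(suffix)],
                        "subject_suffix": suffix,
                        "object_suffix": obj,
                        "subject": subj_label,
                        "object": obj_label,
                    }
                )
    return results


splits = split_imperfective("tǝnaggǝrāni")
```

Returning a list rather than a single split leaves room for forms that match more than one table entry, which the response is expected to list as alternatives.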
nagarkǝnā - ni: please note that the correct version is nagarkǝn-āni
Corrections made to User Story 1: "it will look in the table of Perfective with Object Suffixes and find out that -āni can be an object suffix"; nagara (perfect) = table of the verbs; kǝn (subject pronoun) = table of subject pronouns (conjugation of the perfect); -āni (object pronoun) = table of the object pronouns
a. nagarkǝn-āni (note the bold)
b. nagar - kǝn-āni
A further tool to look at: https://ethiopic-tool.firebaseapp.com/ by Garry Jost. A very interesting email exchange is available with instructions if needed the text once loaded
@MagdaKrzyz I'm having some trouble reproducing correct transliterations for User story 2 (ትነግራኒ/ tǝnaggǝrāni) in the testing phase with respect to ግ = ggǝ, which isn't easily computable from the Fidäl alone. The candidates I get for ትነግራኒ are [tnagrāni, tǝnagrāni, tnagǝrāni, tǝnagǝrāni], of which the last one seems the best candidate.
Do you have a systematic overview of transliteration rules somewhere?
As replied in the email, it is not possible to produce the transliteration without already knowing the pattern and having separated the affixes.
For a string input, analyze it and return possible morphological matches and lemmas.
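The four candidates quoted above follow from the script itself: a sixth-order sign can be read with ǝ or with no vowel, and gemination is not written, so ggǝ can never come out of the fidel alone. A minimal sketch that reproduces the candidate list, assuming a hand-written sign table covering only the five signs of ትነግራኒ (the table and function names are hypothetical):

```python
from itertools import product

# (consonant, vowel) per sign; None marks a sixth-order sign, whose
# reading is either consonant + ǝ or the bare consonant.
SIGNS = {
    "ት": ("t", None),
    "ነ": ("n", "a"),
    "ግ": ("g", None),
    "ራ": ("r", "ā"),
    "ኒ": ("n", "i"),
}


def candidate_transliterations(word: str):
    """List every reading of a fidel string allowed by the toy sign table."""
    options = []
    for sign in word:
        consonant, vowel = SIGNS[sign]  # KeyError for signs outside the table
        if vowel is None:
            options.append((consonant, consonant + "ǝ"))
        else:
            options.append((consonant + vowel,))
    return ["".join(parts) for parts in product(*options)]


candidates = candidate_transliterations("ትነግራኒ")
```

The number of candidates doubles with every sixth-order sign, and none of them contains a doubled consonant, which is why the pattern and the affix split have to be known first.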