biblissima / collatinus

Sources of Collatinus software - Latin lemmatizer, morphological analyzer and scansion
http://outils.biblissima.fr/en/collatinus
GNU General Public License v3.0

[feature request] spelling permutations #54

Closed nikita-moor closed 5 years ago

nikita-moor commented 5 years ago

Collatinus is quite smart at handling variant Latin spellings, but it still fails the "uva" test:

PhVerkerk commented 5 years ago

vva is an absurd form (except in upper case, which is not handled anyway): either you distinguish u and v (i.e. you write the vowel u as u and the semi-consonant u as v), or both collapse into u. The actual mistake is that vua is recognized.

Philippe.
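The "collapse into u" option described above can be sketched as a lookup key. This is only an illustration, not Collatinus's actual code: the function name `collapse_uv` is hypothetical, and folding j into i alongside v into u is an added assumption.

```python
def collapse_uv(word: str) -> str:
    """Fold case, then map v -> u and j -> i so spelling variants share one key.

    Hypothetical helper for illustration; not part of Collatinus.
    """
    return word.lower().translate(str.maketrans("vj", "ui"))


# All four spellings from the "uva" test collapse to the same key.
keys = {w: collapse_uv(w) for w in ["uva", "vua", "uua", "vva"]}
```

A key like this cannot tell "vua" from "uva", so a lemmatizer that wants to reject "vua" while accepting "uva" would need a stricter check on top of the collapsed lookup.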

From: "chopinesque" notifications@github.com
To: "biblissima/collatinus" collatinus@noreply.github.com
Cc: "Subscribed" subscribed@noreply.github.com
Sent: Sunday, 8 September 2019 10:49:54
Subject: [biblissima/collatinus] [feature request] spelling permutations (#54)

Collatinus is quite smart at handling variant Latin spellings, but it still fails the "uva" test:

* uva 🗸
* vua 🗸
* uua 🗸
* vva (form not recognized)


nikita-moor commented 5 years ago

I assumed that Latin does not distinguish i/j or v/u, and I expected a lemmatizer to understand original inscriptions (iv-spelling): [image]

vva is an absurd form

However, it can sometimes be found, especially in headings:

Anyway, it was only a suggestion, and you have every right to reject it.
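For reference, the four spellings in the "uva" test are exactly the u/v permutations of a single word. A minimal sketch of enumerating them follows; the function name `uv_variants` is hypothetical and nothing here reflects how Collatinus itself handles variants.

```python
from itertools import product


def uv_variants(word: str) -> list[str]:
    """Return every spelling obtained by writing each u/v as either u or v.

    Hypothetical helper for illustration; not part of Collatinus.
    """
    # Each u or v becomes a two-way choice; every other letter is fixed.
    slots = [("u", "v") if ch in "uv" else (ch,) for ch in word.lower()]
    return ["".join(combo) for combo in product(*slots)]


variants = uv_variants("uva")
```

The number of variants doubles with every u or v in the word, so normalizing the input once is usually cheaper than generating all permutations for the lookup.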

PhVerkerk commented 5 years ago

Yes, at the beginning of a sentence "Vvae duracinae" should be understood. I have to check why it is not.

Ph.

From: "chopinesque" notifications@github.com
To: "biblissima/collatinus" collatinus@noreply.github.com
Cc: "Philippe Verkerk" Philippe.Verkerk@univ-lille1.fr, "Comment" comment@noreply.github.com
Sent: Sunday, 8 September 2019 16:51:12
Subject: Re: [biblissima/collatinus] [feature request] spelling permutations (#54)

I assumed that Latin does not distinguish i/j or v/u, and I expected a lemmatizer to understand original inscriptions (iv-spelling): [ https://camo.githubusercontent.com/4364d83a2f97bc69fea710d7b98661f0d497befa/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f342f34362f4170706975736361656375737374656c6530312e6a70672f34353070782d4170706975736361656375737374656c6530312e6a7067 ]

vva is an absurd form

However, it can sometimes be found, especially in headings:

* "Vvae duracinae." ( [ https://www.thelatinlibrary.com/martial/mart13.shtml | Mart. XIII ] ). 
* [ https://www2.uni-mannheim.de/mateo/camenaref/gesner/gesner1/v4/Estienne-Gesner_thesaurus_4-v-z.html | Gesner Thesaurus ] : [ https://user-images.githubusercontent.com/13879891/64489933-03dd3480-d261-11e9-892c-98c52d06fe7c.jpg ] 

Anyway, it was only a suggestion, and you have every right to reject it.
