UniversalDependencies / UD_French-GSD


Verb "cuire" #9

Closed gloignon closed 5 years ago

gloignon commented 5 years ago

With UD 2.3, some conjugated forms of the verb "cuire" are lemmatized as "cuiser" rather than "cuire". For example:

Output:

# sent_id = 1
# text = Nous cuisons depuis bien trop longtemps déjà.
1   Nous    nous    PRON    _   Number=Plur|Person=1|PronType=Prs   _   _   _   _
2   cuisons cuire   VERB    _   Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin   _   _   _   _
3   depuis  depuis  ADP _   _   _   _   _   _
4   bien    bien    ADV _   _   _   _   _   _
5   trop    trop    ADV _   _   _   _   _   _
6   longtemps   longtemps   ADV _   _   _   _   _   _
7   déjà    déjà    ADV _   _   _   _   _   SpaceAfter=No
8   .   .   PUNCT   _   _   _   _   _   SpacesAfter=\n

# sent_id = 2
# text = Les nouilles cuisaient au jus de canne.
1   Les le  DET _   Definite=Def|Gender=Fem|Number=Plur|PronType=Art    _   _   _   _
2   nouilles    nouille NOUN    _   Gender=Fem|Number=Plur  _   _   _   _
3   cuisaient   cuiser  VERB    _   Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin    _   _   _   _
4-5 au  _   _   _   _   _   _   _   _
4   à   à   ADP _   _   _   _   _   _
5   le  le  DET _   Definite=Def|Gender=Masc|Number=Sing|PronType=Art   _   _   _   _
6   jus jus NOUN    _   Gender=Masc|Number=Sing _   _   _   _
7   de  de  ADP _   _   _   _   _   _
8   canne   canne   NOUN    _   Gender=Fem|Number=Sing  _   _   _   SpaceAfter=No
9   .   .   PUNCT   _   _   _   _   _   SpaceAfter=No

dseddah commented 5 years ago

interesting :) I wonder what lemmatizer they used.

bguil commented 5 years ago

The examples you give do not come from the corpus:

I guess that your examples are outputs of the UDPipe model trained on UD 2.3.
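
One way to check whether a given form or lemma really occurs in the treebank is to scan its CoNLL-U files directly. The following is only a minimal sketch, assuming the standard ten-column CoNLL-U layout; the function name `forms_for_lemma` and the small `sample` string (rebuilt from the output quoted above) are illustrative and not part of any UD tooling.

```python
from collections import Counter


def forms_for_lemma(conllu_text, lemmas):
    """Count (FORM, LEMMA) pairs whose LEMMA column is in `lemmas`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        line = line.strip()
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 3:
            # Fallback for copies where tabs were turned into spaces.
            cols = line.split()
        if len(cols) < 3:
            continue
        token_id, form, lemma = cols[0], cols[1], cols[2]
        # Skip multiword-token ranges (e.g. "4-5") and empty nodes (e.g. "1.1").
        if "-" in token_id or "." in token_id:
            continue
        if lemma in lemmas:
            counts[(form, lemma)] += 1
    return counts


# Hypothetical sample rebuilt from the tagger output quoted in this thread.
sample = "\n".join([
    "# text = Les nouilles cuisaient au jus de canne.",
    "\t".join(["3", "cuisaient", "cuiser", "VERB", "_",
               "Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
               "_", "_", "_", "_"]),
    "\t".join(["4-5", "au", "_", "_", "_", "_", "_", "_", "_", "_"]),
])

counts = forms_for_lemma(sample, {"cuire", "cuiser"})
print(counts)
```

Running the same function over the treebank's `.conllu` files (read as UTF-8 text) would show how many training tokens, if any, carry the lemma cuire.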

There is not enough data in the UD corpus to "learn" the lemma cuire correctly, so I guess that the parser falls back on the most productive verb class in French, the "1er groupe" (verbs in -er), and predicts the lemma cuiser by a kind of analogy.
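
To make the analogy idea concrete, here is a toy suffix-rule lemmatizer. It is only a sketch of the guess above and is not how UDPipe works internally: the functions `learn_suffix_rules` and `lemmatize` and the training pairs are hypothetical, chosen by hand for illustration rather than taken from the treebank.

```python
from collections import Counter, defaultdict


def learn_suffix_rules(pairs, max_suffix=6):
    """Learn (form suffix -> lemma suffix) rewrite rules from (form, lemma) pairs."""
    rules = defaultdict(Counter)
    for form, lemma in pairs:
        # Treat the longest common prefix of form and lemma as the stem.
        stem_len = 0
        while (stem_len < min(len(form), len(lemma))
               and form[stem_len] == lemma[stem_len]):
            stem_len += 1
        form_suffix, lemma_suffix = form[stem_len:], lemma[stem_len:]
        if 0 < len(form_suffix) <= max_suffix:
            rules[form_suffix][lemma_suffix] += 1
    return rules


def lemmatize(form, rules):
    """Apply the most frequent rule for the longest matching suffix."""
    for start in range(len(form)):  # start=0 is the longest suffix
        suffix = form[start:]
        if suffix in rules:
            lemma_suffix, _ = rules[suffix].most_common(1)[0]
            return form[:start] + lemma_suffix
    return form  # no rule applies: return the form unchanged


# Toy training data containing only "1er groupe" verbs (illustrative).
er_only = [
    ("parlaient", "parler"),
    ("chantaient", "chanter"),
    ("aimaient", "aimer"),
    ("donnaient", "donner"),
]
rules = learn_suffix_rules(er_only)
print(lemmatize("cuisaient", rules))  # cuiser

# Adding a single -uire verb gives the model a longer, more specific rule.
rules_with_uire = learn_suffix_rules(er_only + [("conduisaient", "conduire")])
print(lemmatize("cuisaient", rules_with_uire))  # cuire
```

With only -er verbs in the training pairs, the sole rule available for the ending -aient rewrites it to -er, which yields cuiser. Once a verb such as conduire is seen, the longer suffix -saient maps to -re and takes precedence, which is the sense in which more data would let the lemma be learned.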