bguil / UD-French-discussion

Discussions on harmonizing the French treebanks in the UD format

Add information about the POS of the whole MWE #11

Open bguil opened 6 years ago

bguil commented 6 years ago

When tokens form a multiword expression (MWE), the POS of the whole expression is not given.

For instance, in the example below, en is ADP and particulier is ADJ, but en particulier as a whole is ADV.

12  et  et  CCONJ   _   _   17  cc  _   _
13  en  en  ADP _   _   12  advmod  _   _
14  particulier particulier ADJ _   Gender=Masc|Number=Sing 13  fixed   _   _
15  à   à   ADP _   _   17  case    _   _
16  l'  le  DET _   Definite=Def|Number=Sing|PronType=Art   17  det _   SpaceAfter=No
17  inspecteur  inspecteur  NOUN    _   Gender=Masc|Number=Sing 9   conj    _   _

There is no standard in UD for this.

This problem is mentioned here. I propose following the UD_Portuguese treebank and using the MISC column for this. For instance:

7   ,   ,   PUNCT   PU|@PU  _   6   punct   _   _
8   por por ADP PRP|@<ADVL  _   6   cc  _   MWE=por_exemplo|MWEPOS=CCONJ
9   exemplo exemplo NOUN    N|M|S|@P<   Gender=Masc|Number=Sing 8   fixed   _   SpaceAfter=No
10  ,   ,   PUNCT   PU|@PU  _   15  punct   _   _
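
To illustrate how a tool could read this annotation back, here is a minimal sketch in Python. It assumes tab-separated CoNLL-U lines with ten columns; the helper names `parse_misc` and `mwe_pos` are hypothetical and not part of any UD tool.

```python
def parse_misc(misc):
    """Split a CoNLL-U MISC field such as 'MWE=por_exemplo|MWEPOS=CCONJ'
    into a dict. An empty field ('_') gives an empty dict."""
    if not misc or misc == "_":
        return {}
    result = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        result[key] = value if sep else ""
    return result


def mwe_pos(token_line):
    """Return (MWE, MWEPOS) for a token line carrying the proposed keys,
    or None if the token has no MWEPOS. Hypothetical helper."""
    cols = token_line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError("expected 10 tab-separated CoNLL-U columns")
    misc = parse_misc(cols[9])  # MISC is the tenth column
    if "MWEPOS" not in misc:
        return None
    return misc.get("MWE"), misc["MWEPOS"]


# Token 8 of the UD_Portuguese example, with tabs restored.
line = "8\tpor\tpor\tADP\tPRP|@<ADVL\t_\t6\tcc\t_\tMWE=por_exemplo|MWEPOS=CCONJ"
result = mwe_pos(line)
print(result)
```

Only the first token of the expression carries the keys, so tokens attached with `fixed` return `None` here.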

And then, for en particulier, we would have:

12  et  et  CCONJ   _   _   17  cc  _   _
13  en  en  ADP _   _   12  advmod  _   MWE=en_particulier|MWEPOS=ADV
14  particulier particulier ADJ _   Gender=Masc|Number=Sing 13  fixed   _   _
15  à   à   ADP _   _   17  case    _   _
16  l'  le  DET _   Definite=Def|Number=Sing|PronType=Art   17  det _   SpaceAfter=No
17  inspecteur  inspecteur  NOUN    _   Gender=Masc|Number=Sing 9   conj    _   _
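
The annotation above could be added semi-automatically from the existing `fixed` relations. The sketch below is only an illustration under stated assumptions: tokens are given as lists of the ten CoNLL-U columns, the MWE value is built by joining the forms with `_` as in the example, and the POS of the whole expression comes from a hand-made lexicon (`mwe_pos_lexicon`, hypothetical), since it cannot be derived from the parts. The function name `annotate_fixed_mwes` is also hypothetical.

```python
def annotate_fixed_mwes(rows, mwe_pos_lexicon):
    """Add MWE=...|MWEPOS=... to the MISC column of the head of each
    group of tokens linked by 'fixed'. Rows are modified in place."""
    by_id = {row[0]: row for row in rows}
    dependents = {}
    for row in rows:
        if row[7] == "fixed":  # DEPREL is the eighth column
            dependents.setdefault(row[6], []).append(row)  # HEAD is the seventh

    for head_id, deps in dependents.items():
        head = by_id.get(head_id)
        if head is None:  # head lies outside the given fragment
            continue
        members = sorted([head] + deps, key=lambda r: int(r[0]))
        mwe = "_".join(r[1] for r in members)  # join the FORM column
        pos = mwe_pos_lexicon.get(mwe.lower())
        if pos is None:  # expression not in the lexicon: leave untouched
            continue
        extra = f"MWE={mwe}|MWEPOS={pos}"
        head[9] = extra if head[9] == "_" else head[9] + "|" + extra
    return rows


# Tokens 13 and 14 of the French example, before annotation.
rows = [
    ["13", "en", "en", "ADP", "_", "_", "12", "advmod", "_", "_"],
    ["14", "particulier", "particulier", "ADJ", "_",
     "Gender=Masc|Number=Sing", "13", "fixed", "_", "_"],
]
annotated = annotate_fixed_mwes(rows, {"en_particulier": "ADV"})
for row in annotated:
    print("\t".join(row))
```

Expressions missing from the lexicon are skipped rather than guessed, so they can be reviewed by hand.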