UniversalDependencies / UD_Faroese-OFT

Constructions with comparative and comparand #5

ftyers closed 6 years ago

ftyers commented 6 years ago

What should we do with constructions like Adj-Comp enn Noun:

# text = Streymgágga er smidligari enn agngággan.
# text[eng] = Red whelk is nicer than normal whelk.
# labels = to_check
"<Streymgágga>"
        "streymgágga" N Fem Sg Nom Indef @nsubj #1->3
"<er>"
        "vera" V Ind Prs Sg3 @cop #2->3
"<smidligari>"
        "smidligur" A Cmp Fem Sg Nom Indef @root #3->0
"<enn>"
        "enn" Adv @dep #4->5
"<agngággan>"
        "agngágga" N Fem Sg Nom Def @dep #5->3
"<.>"
        "." CLB @punct #6->3

Bokmål

Bokmål has the comparand as obl to the adjective.

# sent_id =  002340
# text = Noen skatteflyktninger er likere enn andre
1       Noen    noen    DET     _       Number=Plur|PronType=Ind        2       det     _       _
2       skatteflyktninger       skatteflyktning NOUN    _       Definite=Ind|Gender=Masc|Number=Plur    4       nsubj   _       _
3       er      være    AUX     _       Mood=Ind|Tense=Pres|VerbForm=Fin        4       cop     _       _
4       likere  lik     ADJ     _       Degree=Cmp      0       root    _       _
5       enn     enn     ADP     _       _       6       case    _       _
6       andre   annen   DET     _       Number=Plur|PronType=Dem        4       obl     _       _

Nynorsk

Nynorsk is as Bokmål.

# sent_id =  004020
# text = Ulike forvaltingsideologiar eller kunnskapsregime har difor vore viktigare enn røystesetelen.
1       Ulike   ulike   ADJ     _       Degree=Pos|Number=Plur  2       amod    _       _
2       forvaltingsideologiar   forvaltingsideologi     NOUN    _       Definite=Ind|Gender=Masc|Number=Plur    8       nsubj   _       _
3       eller   eller   CCONJ   _       _       4       cc      _       _
4       kunnskapsregime kunnskapsregime NOUN    _       Definite=Ind|Gender=Neut|Number=Plur    2       conj    _       _
5       har     ha      AUX     _       Mood=Ind|Tense=Pres|VerbForm=Fin        8       aux     _       _
6       difor   difor   ADV     _       _       8       advmod  _       _
7       vore    vere    AUX     _       VerbForm=Part   8       cop     _       _
8       viktigare       viktig  ADJ     _       Degree=Cmp      0       root    _       _
9       enn     enn     ADP     _       _       10      case    _       _
10      røystesetelen   røystesetel     NOUN    _       Definite=Def|Gender=Masc|Number=Sing    8       obl     _       SpaceAfter=No
11      .       $.      PUNCT   _       _       8       punct   _       _

Swedish

The Swedish example is slightly different, as the comparative is an adverb rather than an adjective, but I guess the principle here is ellipsis, e.g. Folkmängden ökar fortare än läskunnigheten [ökar].

# sent_id = sv-ud-test-479
# text = Folkmängden ökar fortare än läskunnigheten.
1       Folkmängden     folkmängd       NOUN    NN|UTR|SIN|DEF|NOM      Case=Nom|Definite=Def|Gender=Com|Number=Sing    2       nsubj   _       _
2       ökar    öka     VERB    VB|PRS|AKT      Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act      0       root    _       _
3       fortare fort    ADV     AB|KOM  Degree=Cmp      2       advmod  _       _
4       än      än      CCONJ   KN      _       5       mark    _       _
5       läskunnigheten  läskunnighet    NOUN    NN|UTR|SIN|DEF|NOM      Case=Nom|Definite=Def|Gender=Com|Number=Sing    2       advcl   _       SpaceAfter=No
6       .       .       PUNCT   MAD     _       2       punct   _       _

Danish

Again, slightly different, but Danish appears to combine mark (on "end") with obl (on the comparand), which looks like it might be a mistake.

# sent_id = dev-143
# text = Jeg siger til ham , at min længsel efter ham er stærkere end smerten .
1       Jeg     jeg     PRON    _       Case=Nom|Gender=Com|Number=Sing|Person=1|PronType=Prs   2       nsubj   _       _
2       siger   sige    VERB    _       Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act      0       root    _       _
3       til     til     ADP     _       AdpType=Prep    4       case    _       _
4       ham     han     PRON    _       Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs   2       obl     _       _
5       ,       ,       PUNCT   _       _       2       punct   _       _
6       at      at      SCONJ   _       _       12      mark    _       _
7       min     min     DET     _       Gender=Com|Number=Sing|Number[psor]=Sing|Person=1|Poss=Yes|PronType=Prs 8       det     _       _
8       længsel længsel NOUN    _       Definite=Ind|Gender=Com|Number=Sing     12      nsubj   _       _
9       efter   efter   ADP     _       AdpType=Prep    10      case    _       _
10      ham     han     PRON    _       Case=Acc|Gender=Com|Number=Sing|Person=3|PronType=Prs   8       nmod    _       _
11      er      være    AUX     _       Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act      12      cop     _       _
12      stærkere        stærk   ADJ     _       Degree=Cmp      2       amod    _       _
13      end     end     ADP     _       _       14      mark    _       _
14      smerten smerte  NOUN    _       Definite=Def|Gender=Com|Number=Sing     12      obl     _       _
15      .       .       PUNCT   _       _       2       punct   _       _

jnivre commented 6 years ago

Guidelines say "case + obl", not "mark + advcl", when the comparative clause is reduced to a single noun phrase. Swedish has been fixed in v2.2. :)
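A rough way to look for other comparands that deviate from "case + obl" is sketched below. It is only a heuristic, not an official UD validation rule: the function names are hypothetical, it assumes plain 10-column CoNLL-U token lines, it does not check that the comparand really is a single noun phrase, and when the comparative is an adverb it also looks at dependents of the verb the adverb modifies, so every hit still needs manual review.

```python
def parse_conllu(block):
    """Return a list of token dicts from one CoNLL-U sentence (hypothetical helper)."""
    tokens = []
    for line in block.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        # CoNLL-U is tab-separated; fall back to whitespace for pasted excerpts.
        cols = line.split("\t") if "\t" in line else line.split()
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges, empty nodes, malformed lines
        tokens.append({
            "id": int(cols[0]), "form": cols[1], "upos": cols[3],
            "feats": cols[5], "head": int(cols[6]), "deprel": cols[7],
        })
    return tokens


def check_comparands(tokens):
    """List (marker, comparand, (marker_deprel, comparand_deprel)) not equal to case + obl."""
    problems = []
    for cmp_tok in tokens:
        if "Degree=Cmp" not in cmp_tok["feats"].split("|"):
            continue
        # The comparand may attach to the comparative itself or, for a
        # comparative adverb, to the predicate that the adverb modifies.
        hosts = {cmp_tok["id"]}
        if cmp_tok["deprel"] == "advmod":
            hosts.add(cmp_tok["head"])
        for dep in tokens:
            if dep["head"] not in hosts or dep["id"] <= cmp_tok["id"]:
                continue
            if dep["deprel"] not in ("obl", "advcl"):
                continue
            # First case/mark dependent that precedes the candidate comparand.
            markers = [t for t in tokens
                       if t["head"] == dep["id"]
                       and t["deprel"] in ("case", "mark")
                       and t["id"] < dep["id"]]
            if not markers:
                continue
            pair = (markers[0]["deprel"], dep["deprel"])
            if pair != ("case", "obl"):
                problems.append((markers[0]["form"], dep["form"], pair))
    return problems


# Three tokens from the Danish sentence dev-143 quoted earlier in this thread.
danish = """\
12 stærkere stærk ADJ _ Degree=Cmp 2 amod _ _
13 end end ADP _ _ 14 mark _ _
14 smerten smerte NOUN _ Definite=Def|Gender=Com|Number=Sing 12 obl _ _
"""
print(check_comparands(parse_conllu(danish)))
# [('end', 'smerten', ('mark', 'obl'))]
```

Run over the Bokmål and Nynorsk sentences above, the same check returns an empty list, since both already use case on "enn" and obl on the comparand.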

dan-zeman commented 6 years ago

See here http://universaldependencies.org/workgroups/comparatives.html

jnivre commented 6 years ago

Wow! Dan, if we just appoint you (and only you) to all the working groups, all our problems will be solved. :)

ftyers commented 6 years ago

Thanks @jnivre and @dan-zeman ! :) The comparative documentation looks great! I've added an issue for the Danish one on their issues page.