UniversalDependencies / UD_English-GUM

Question about "be" #35

Closed by muchang 2 years ago

muchang commented 3 years ago

Hi, I have a question about "be". In GUM, VBZ "be" is sometimes tagged AUX and sometimes tagged VERB. I cannot find a clear rule for deciding whether a given "be" should be AUX or VERB. Do you have any ideas?

For example, the "be"s in the following two sentences look very similar but have different labels.

# sent_id = GUM_interview_herrick-60
# s_type = decl
# speaker = JackHarris
# text = My hope is that over time, internet users will demand this of any site where they invest their time in a way that creates value for others.
1   My  my  PRON    PRP$    Number=Sing|Person=1|Poss=Yes|PronType=Prs  2   nmod:poss   2:nmod:poss Discourse=joint:153->150|Entity=(abstract-128(person-2-Jack_Herrick)
2   hope    hope    NOUN    NN  Number=Sing 3   nsubj   3:nsubj _
3   is  be  VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   0   root    0:root  _
4   that    that    SCONJ   IN  _   11  mark    11:mark _
5   over    over    ADP IN  _   6   case    6:case  _
6   time    time    NOUN    NN  Number=Sing 11  obl 11:obl:over SpaceAfter=No
7   ,   ,   PUNCT   ,   _   6   punct   6:punct _
8   internet    Internet    NOUN    NN  Number=Sing 9   compound    9:compound  Entity=(person-129(abstract-130)
9   users   user    NOUN    NNS Number=Plur 11  nsubj   11:nsubj    Entity=person-129)
10  will    will    AUX MD  VerbForm=Fin    11  aux 11:aux  _
11  demand  demand  VERB    VB  VerbForm=Inf    3   ccomp   3:ccomp _
12  this    this    PRON    DT  Number=Sing|PronType=Dem    11  obj 11:obj  Entity=(abstract-114)
13  of  of  ADP IN  _   15  case    15:case _
14  any any DET DT  _   15  det 15:det  Entity=(abstract-131
15  site    site    NOUN    NN  Number=Sing 12  nmod    12:nmod:of  _
16  where   where   SCONJ   WRB PronType=Int    18  mark    18:mark Discourse=elaboration:154->153
17  they    they    PRON    PRP Case=Nom|Number=Plur|Person=3|PronType=Prs  18  nsubj   18:nsubj    _
18  invest  invest  VERB    VBP Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin   15  acl 15:acl:where    _
19  their   their   PRON    PRP$    Number=Plur|Person=3|Poss=Yes|PronType=Prs  20  nmod:poss   20:nmod:poss    Entity=(time-132(person-129)
20  time    time    NOUN    NN  Number=Sing 18  obj 18:obj  Entity=abstract-131)time-132)
21  in  in  ADP IN  _   23  case    23:case _
22  a   a   DET DT  Definite=Ind|PronType=Art   23  det 23:det  Entity=(abstract-133
23  way way NOUN    NN  Number=Sing 18  obl 18:obl:in|25:nsubj  _
24  that    that    PRON    WDT PronType=Rel    25  nsubj   23:ref  Discourse=elaboration:155->154
25  creates create  VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   23  acl:relcl   23:acl:relcl    _
26  value   value   NOUN    NN  Number=Sing 25  obj 25:obj  Entity=(abstract-134
27  for for ADP IN  _   28  case    28:case _
28  others  other   NOUN    NNS Number=Plur 25  obl 25:obl:for  Entity=(person-135)abstract-133)abstract-134)|SpaceAfter=No
29  .   .   PUNCT   .   _   3   punct   3:punct Entity=abstract-128)
# sent_id = GUM_bio_nida-24
# s_type = decl
# text = His most notable contribution to translation theory is Dynamic Equivalence, also known as Functional Equivalence.
1   His his PRON    PRP$    Gender=Masc|Number=Sing|Person=3|Poss=Yes|PronType=Prs  4   nmod:poss   4:nmod:poss Discourse=joint:45->38|Entity=(abstract-6-Dynamic_and_formal_equivalence(person-1-Eugene_Nida)
2   most    most    ADV RBS Degree=Sup  3   advmod  3:advmod    _
3   notable notable ADJ JJ  Degree=Pos  4   amod    4:amod  _
4   contribution    contribution    NOUN    NN  Number=Sing 10  nsubj   10:nsubj    _
5   to  to  ADP IN  _   7   case    7:case  _
6   translation translation NOUN    NN  Number=Sing 7   compound    7:compound  Entity=(abstract-72(abstract-11-Translation)
7   theory  theory  NOUN    NN  Number=Sing 4   nmod    4:nmod:to   Entity=abstract-6-Dynamic_and_formal_equivalence)abstract-72)
8   is  be  AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   10  cop 10:cop  _
9   Dynamic Dynamic ADJ NNP Degree=Pos  10  amod    10:amod Entity=(abstract-6-Dynamic_and_formal_equivalence
10  Equivalence Equivalence PROPN   NNP Number=Sing 0   root    0:root  Entity=abstract-6-Dynamic_and_formal_equivalence)|SpaceAfter=No
11  ,   ,   PUNCT   ,   _   13  punct   13:punct    _
12  also    also    ADV RB  _   13  advmod  13:advmod   Discourse=elaboration:46->45
13  known   know    VERB    VBN Tense=Past|VerbForm=Part    10  acl 10:acl  _
14  as  as  ADP IN  _   16  case    16:case _
15  Functional  Functional  ADJ NNP Degree=Pos  16  amod    16:amod Entity=(abstract-6-Dynamic_and_formal_equivalence
16  Equivalence Equivalence PROPN   NNP Number=Sing 13  obl 13:obl:as   Entity=abstract-6-Dynamic_and_formal_equivalence)|SpaceAfter=No
17  .   .   PUNCT   .   _   10  punct   10:punct    _
amir-zeldes commented 3 years ago

Thanks for reporting. The upos for "be" (and for tokens in general) is determined by a combination of the xpos and rules based on the dependency tree. The first example is an error ("be" should be cop). I'll fix the tree, and the upos will then be corrected automatically in the next release.
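To make the idea concrete, here is a minimal sketch of how a upos for "be" could be derived from the xpos plus the dependency relation. This is not the actual GUM build code: the function name `upos_for_be` and the set `AUX_DEPRELS` are hypothetical, and the rule (copula and auxiliary relations give AUX, anything else gives VERB) is an assumption based only on the examples in this thread.

```python
# Hypothetical illustration only; the real GUM conversion rules are not shown in this thread.
# Assumption: "be" attached as a copula or auxiliary is AUX, otherwise VERB.
AUX_DEPRELS = {"cop", "aux", "aux:pass"}


def upos_for_be(xpos: str, deprel: str) -> str:
    """Return a upos for a token with lemma "be", given its xpos and deprel."""
    if not xpos.startswith("VB"):
        raise ValueError(f"unexpected xpos for 'be': {xpos}")
    return "AUX" if deprel in AUX_DEPRELS else "VERB"


# Token 3 of GUM_interview_herrick-60 is attached as root, so it comes out as VERB.
print(upos_for_be("VBZ", "root"))  # VERB
# Token 8 of GUM_bio_nida-24 is attached as cop, so it comes out as AUX.
print(upos_for_be("VBZ", "cop"))   # AUX
```

Under a rule like this, correcting the deprel in the tree (root to cop) is enough to change the derived upos, which matches the behaviour described above.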

muchang commented 3 years ago

Thanks, Amir. I notice that "be" tagged VERB VBZ and attached as root usually appears in "there is" constructions, for example:

Word: "is"

# sent_id = GUM_fiction_veronique-40
# s_type = decl
# speaker = Protagonist
# text = Then there is a big problem on Earth, and the people of Earth forget we are here.
1   Then    then    ADV RB  PronType=Dem    3   advmod  3:advmod    Discourse=cause:71->72
2   there   there   PRON    EX  _   3   expl    3:expl  _
3   is  be  VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   0   root    0:root  _
4   a   a   DET DT  Definite=Ind|PronType=Art   6   det 6:det   Entity=(place-39
5   big big ADJ JJ  Degree=Pos  6   amod    6:amod  _
6   problem problem NOUN    NN  Number=Sing 3   nsubj   3:nsubj _
7   on  on  ADP IN  _   8   case    8:case  _
8   Earth   Earth   PROPN   NNP Number=Sing 6   nmod    6:nmod:on   Entity=(place-12-Earth)place-39)|SpaceAfter=No
9   ,   ,   PUNCT   ,   _   15  punct   15:punct    _
10  and and CCONJ   CC  _   15  cc  15:cc   Discourse=attribution:72->73
11  the the DET DT  Definite=Def|PronType=Art   12  det 12:det  Entity=(person-38
12  people  person  NOUN    NNS Number=Plur 15  nsubj   15:nsubj    _
13  of  of  ADP IN  _   14  case    14:case _
14  Earth   Earth   PROPN   NNP Number=Sing 12  nmod    12:nmod:of  Entity=(place-12-Earth)person-38)
15  forget  forget  VERB    VBP Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   3   conj    3:conj:and  _
16  we  we  PRON    PRP Case=Nom|Number=Plur|Person=1|PronType=Prs  17  nsubj   17:nsubj    Discourse=sequence:73->69|Entity=(person-36)
17  are be  VERB    VBP Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin   15  ccomp   15:ccomp    _
18  here    here    ADV RB  PronType=Dem    17  advmod  17:advmod   Entity=(place-23)|SpaceAfter=No
19  .   .   PUNCT   .   _   3   punct   3:punct _

Is this intended?

There are also some suspicious cases below:

Word: "is"

# sent_id = GUM_academic_lighting-34
# s_type = decl
# text = In addition to the harmful effects of mercury is that it emits Ultraviolet (UV) Radiation.
1   In  in  ADP IN  _   2   case    2:case  Discourse=joint:78->74
2   addition    addition    NOUN    NN  Number=Sing 9   obl 9:obl:in    _
3   to  to  ADP IN  _   6   case    6:case  _
4   the the DET DT  Definite=Def|PronType=Art   5   det 5:det   Entity=(abstract-36
5   harmful harmful ADJ JJ  Degree=Pos  6   amod    6:amod  _
6   effects effect  NOUN    NNS Number=Plur 2   nmod    2:nmod:to   _
7   of  of  ADP IN  _   8   case    8:case  _
8   mercury mercury NOUN    NN  Number=Sing 6   nmod    6:nmod:of   Entity=(substance-116)abstract-36)
9   is  be  VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   0   root    0:root  _
10  that    that    SCONJ   IN  _   12  mark    12:mark _
11  it  it  PRON    PRP Case=Nom|Gender=Neut|Number=Sing|Person=3|PronType=Prs  12  nsubj   12:nsubj    Entity=(substance-116)
12  emits   emit    VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   9   csubj   9:csubj _
13  Ultraviolet ultraviolet ADJ JJ  Degree=Pos  17  amod    17:amod Entity=(abstract-38
14  (   (   PUNCT   -LRB-   _   15  punct   15:punct    Discourse=restatement:79->78|SpaceAfter=No
15  UV  UV  ADJ JJ  Degree=Pos  13  appos   13:appos    SpaceAfter=No
16  )   )   PUNCT   -RRB-   _   15  punct   15:punct    _
17  Radiation   radiation   NOUN    NN  Number=Sing 12  obj 12:obj  Discourse=same-unit:80->78|Entity=abstract-38)|SpaceAfter=No
18  .   .   PUNCT   .   _   9   punct   9:punct _
# sent_id = GUM_academic_lighting-20
# s_type = decl
# text = Now that the world is in the age where lighting seems to be a daily necessity, typical homes as shown in figure 1, consume nearly 27 percent of the energy used today: making lighting as the major source of electricity consumption.
1   Now now ADV RB  _   26  advmod  26:advmod   Discourse=circumstance:38->40
2   that    that    SCONJ   IN  _   5   mark    5:mark  _
3   the the DET DT  Definite=Def|PronType=Art   4   det 4:det   Entity=(place-74
4   world   world   NOUN    NN  Number=Sing 5   nsubj   5:nsubj Entity=place-74)
5   is  be  VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   1   acl 1:acl:that  _
6   in  in  ADP IN  _   8   case    8:case  _
7   the the DET DT  Definite=Def|PronType=Art   8   det 8:det   Entity=(time-75
8   age age NOUN    NN  Number=Sing 5   obl 5:obl:in    _
9   where   where   SCONJ   WRB PronType=Rel    11  mark    11:mark Discourse=elaboration:39->38
10  lighting    lighting    NOUN    NN  Number=Sing 11  nsubj   11:nsubj|16:nsubj:xsubj Entity=(abstract-20)
11  seems   seem    VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   8   acl:relcl   8:acl:relcl _
12  to  to  PART    TO  _   16  mark    16:mark _
13  be  be  AUX VB  VerbForm=Inf    16  cop 16:cop  _
14  a   a   DET DT  Definite=Ind|PronType=Art   16  det 16:det  Entity=(abstract-76
15  daily   daily   ADJ JJ  Degree=Pos  16  amod    16:amod _
16  necessity   necessity   NOUN    NN  Number=Sing 11  xcomp   11:xcomp    Entity=time-75)abstract-76)|SpaceAfter=No
17  ,   ,   PUNCT   ,   _   1   punct   1:punct _
18  typical typical ADJ JJ  Degree=Pos  19  amod    19:amod Discourse=evidence:40->44|Entity=(place-77
19  homes   home    NOUN    NNS Number=Plur 26  nsubj   26:nsubj    _
20  as  as  SCONJ   IN  _   21  mark    21:mark Discourse=evidence:41->40
21  shown   show    VERB    VBN Tense=Past|VerbForm=Part    19  acl 19:acl:as   _
22  in  in  ADP IN  _   23  case    23:case _
23  figure  Figure  PROPN   NNP Number=Sing 21  obl 21:obl:in   Entity=(abstract-70
24  1   1   NUM CD  NumForm=Digit|NumType=Card  23  dep 23:dep  Entity=place-77)abstract-70)|SpaceAfter=No
25  ,   ,   PUNCT   ,   _   19  punct   19:punct    _
26  consume consume VERB    VBP Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin   0   root    0:root  Discourse=same-unit:42->40
27  nearly  nearly  ADV RB  Degree=Pos  28  advmod  28:advmod   Entity=(abstract-78
28  27  27  NUM CD  NumForm=Digit|NumType=Card  29  nummod  29:nummod   _
29  percent percent NOUN    NN  Number=Sing 26  obj 26:obj  _
30  of  of  ADP IN  _   32  case    32:case _
31  the the DET DT  Definite=Def|PronType=Art   32  det 32:det  Entity=(abstract-29
32  energy  energy  NOUN    NN  Number=Sing 29  nmod    29:nmod:of  _
33  used    use VERB    VBN Tense=Past|VerbForm=Part    32  acl 32:acl  Discourse=elaboration:43->42
34  today   today   NOUN    NN  Number=Sing 33  obl:tmod    33:obl:tmod Entity=(time-17)abstract-78)abstract-29)|SpaceAfter=No
35  :   :   PUNCT   :   _   36  punct   36:punct    _
36  making  make    VERB    VBG VerbForm=Ger    26  advcl   26:advcl    Discourse=joint:44->28
37  lighting    lighting    NOUN    NN  Number=Sing 36  obj 36:obj  Entity=(abstract-20)
38  as  as  ADP IN  Typo=Yes    41  case    41:case _
39  the the DET DT  Definite=Def|PronType=Art   41  det 41:det  Entity=(abstract-20
40  major   major   ADJ JJ  Degree=Pos  41  amod    41:amod _
41  source  source  NOUN    NN  Number=Sing 36  obl 36:obl:as   _
42  of  of  ADP IN  _   44  case    44:case _
43  electricity electricity NOUN    NN  Number=Sing 44  compound    44:compound Entity=(event-79(abstract-56)
44  consumption consumption NOUN    NN  Number=Sing 41  nmod    41:nmod:of  Entity=abstract-20)event-79)|SpaceAfter=No
45  .   .   PUNCT   .   _   26  punct   26:punct    _
amir-zeldes commented 3 years ago

Existential "there is" is treated as a non-auxiliary main predicate, similar to the verb "exist", so the first example is not an error. The last example (GUM_academic_lighting-20) is definitely an error, which I'll fix. The middle one is odd, probably because the sentence itself contains an error. We can either treat "be" as the main predicate (as it's currently annotated) or claim that "in addition" is the predicate ("that x... is in addition"), but the latter doesn't look like what's intended. I think it's more like "in addition, it is the case that...", but then we are missing the arguments of "be"...
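For anyone who wants to look for similar cases, the sketch below flags tokens with lemma "be" and upos VERB that have no `expl` dependent, i.e. candidates that are not existential "there is". The heuristic and the helper names (`row`, `parse_conllu_sentence`, `suspicious_be`) are assumptions made for illustration, not part of any GUM tooling, and a flagged token is only a candidate for manual review, not a confirmed error. The sample rows are abbreviated fragments of two sentences quoted in this thread, with unused columns replaced by `_`.

```python
from collections import defaultdict


def row(tid, form, lemma, upos, head, deprel):
    """Build one tab-separated 10-column CoNLL-U line; unused columns are '_'."""
    return "\t".join([str(tid), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"])


def parse_conllu_sentence(block):
    """Parse one CoNLL-U sentence into token dicts, skipping comment lines."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges and empty nodes
            continue
        tokens.append({"id": int(cols[0]), "lemma": cols[2], "upos": cols[3],
                       "head": int(cols[6]), "deprel": cols[7]})
    return tokens


def suspicious_be(tokens):
    """Return ids of 'be' tokens tagged VERB that have no expl dependent."""
    child_rels = defaultdict(set)
    for t in tokens:
        child_rels[t["head"]].add(t["deprel"])
    return [t["id"] for t in tokens
            if t["lemma"] == "be" and t["upos"] == "VERB"
            and "expl" not in child_rels[t["id"]]]


# Fragment of GUM_fiction_veronique-40 (tokens 1-6): existential "there is".
existential = "\n".join([
    "# sent_id = GUM_fiction_veronique-40",
    row(1, "Then", "then", "ADV", 3, "advmod"),
    row(2, "there", "there", "PRON", 3, "expl"),
    row(3, "is", "be", "VERB", 0, "root"),
    row(4, "a", "a", "DET", 6, "det"),
    row(5, "big", "big", "ADJ", 6, "amod"),
    row(6, "problem", "problem", "NOUN", 3, "nsubj"),
])

# Fragment of GUM_academic_lighting-20 (tokens 1-8): "the world is in the age".
locative = "\n".join([
    "# sent_id = GUM_academic_lighting-20",
    row(1, "Now", "now", "ADV", 26, "advmod"),
    row(2, "that", "that", "SCONJ", 5, "mark"),
    row(3, "the", "the", "DET", 4, "det"),
    row(4, "world", "world", "NOUN", 5, "nsubj"),
    row(5, "is", "be", "VERB", 1, "acl"),
    row(6, "in", "in", "ADP", 8, "case"),
    row(7, "the", "the", "DET", 8, "det"),
    row(8, "age", "age", "NOUN", 5, "obl"),
])

print(suspicious_be(parse_conllu_sentence(existential)))  # []
print(suspicious_be(parse_conllu_sentence(locative)))     # [5]
```

The existential sentence passes because token 3 has an `expl` child, while token 5 of the second fragment is flagged. Other non-existential uses of "be" tagged VERB would also be flagged, so the output needs to be checked by hand.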