UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Possible misannotated coordination #500

Status: Closed (nikitakit closed 9 months ago)

nikitakit commented 9 months ago

I suspect there may be an annotation error in the following sentence. PUCT (= Public Utility Commission of Texas?) and ERCOT (= Electric Reliability Council of Texas?) are joined by "and", yet ERCOT is attached as a conj of "viewpoints" (token 11), and I'm confused about where that arc is coming from.

I'm not confident enough in my UD guideline knowledge to be sure of the correct fix, including the correct enhanced deps in the last field.

This came to my attention because it is the only sentence in EWT that fails to meet a certain definition of well-nestedness for dependency trees.

# sent_id = email-enronsent39_01-0078
# text = Enron needs to use this situation to quickly get our viewpoints up into the PUCT and ERCOT ISO on what is driving these problems and our proposed fixes.
1   Enron   Enron   PROPN   NNP Number=Sing 2   nsubj   2:nsubj|4:nsubj:xsubj   _
2   needs   need    VERB    VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   0   root    0:root  _
3   to  to  PART    TO  _   4   mark    4:mark  _
4   use use VERB    VB  VerbForm=Inf    2   xcomp   2:xcomp _
5   this    this    DET DT  Number=Sing|PronType=Dem    6   det 6:det   _
6   situation   situation   NOUN    NN  Number=Sing 4   obj 4:obj   _
7   to  to  PART    TO  _   9   mark    9:mark  _
8   quickly quickly ADV RB  _   9   advmod  9:advmod    _
9   get get VERB    VB  VerbForm=Inf    4   advcl   4:advcl:to  _
10  our our PRON    PRP$    Case=Gen|Number=Plur|Person=1|Poss=Yes|PronType=Prs 11  nmod:poss   11:nmod:poss    _
11  viewpoints  viewpoint   NOUN    NNS Number=Plur 9   obj 9:obj   _
12  up  up  ADV RB  _   9   advmod  9:advmod    _
13  into    into    ADP IN  _   18  case    18:case _
14  the the DET DT  Definite=Def|PronType=Art   18  det 18:det  _
15  PUCT    PUCT    PROPN   NNP Number=Sing 18  compound    18:compound _
16  and and CCONJ   CC  _   17  cc  17:cc   _
-17 ERCOT   ERCOT   PROPN   NNP Number=Sing 11  conj    9:obj|11:conj:and   _
+17 ERCOT   ERCOT   PROPN   NNP Number=Sing 15  conj    9:obj|11:conj:and   _
18  ISO iso NOUN    NN  Number=Sing 9   obl 9:obl:into  _
19  on  on  SCONJ   IN  _   22  mark    22:mark _
20  what    what    PRON    WP  PronType=Int    22  nsubj   22:nsubj    _
21  is  be  AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   22  aux 22:aux  _
22  driving drive   VERB    VBG Tense=Pres|VerbForm=Part    11  acl 11:acl:on   _
23  these   this    DET DT  Number=Plur|PronType=Dem    24  det 24:det  _
24  problems    problem NOUN    NNS Number=Plur 22  obj 22:obj  _
25  and and CCONJ   CC  _   28  cc  28:cc   _
26  our our PRON    PRP$    Case=Gen|Number=Plur|Person=1|Poss=Yes|PronType=Prs 28  nmod:poss   28:nmod:poss    _
27  proposed    propose VERB    VBN Tense=Past|VerbForm=Part|Voice=Pass 28  amod    28:amod _
28  fixes   fix NOUN    NNS Number=Plur 22  conj    11:obl:on|22:conj:and   SpaceAfter=No
29  .   .   PUNCT   .   _   2   punct   2:punct _
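
The thread does not say which definition of well-nestedness was used. As a rough sketch, assuming the common definition (a tree is well-nested if no two disjoint subtrees interleave), the check can be written as below. The function names are hypothetical, and the head list is copied from the HEAD column of the listing above, using the two values shown there for token 17 (11 and 15).

```python
from itertools import combinations

def subtree_nodes(heads):
    """Map each token id to the set of ids in its subtree (the token included)."""
    nodes = {i: {i} for i in heads}
    for i in heads:
        h = heads[i]
        while h != 0:          # walk up to the root, registering i on the way
            nodes[h].add(i)
            h = heads[h]
    return nodes

def interleaves(a, b):
    """True if some l1 < l2 < r1 < r2 exists with l1, r1 in a and l2, r2 in b."""
    return any(l1 < l2 < r1 < r2
               for l1 in a for l2 in b for r1 in a for r2 in b)

def is_well_nested(heads):
    """heads: dict token id -> head id (0 = root); assumed to form a tree."""
    sub = subtree_nodes(heads)
    for u, v in combinations(heads, 2):
        if u in sub[v] or v in sub[u]:
            continue           # one dominates the other, so not disjoint
        if interleaves(sub[u], sub[v]) or interleaves(sub[v], sub[u]):
            return False
    return True

# HEAD column of email-enronsent39_01-0078, tokens 1..29, with ERCOT (17) -> 11
head_column = [2, 0, 4, 2, 6, 4, 9, 9, 4, 11, 9, 9, 18, 18, 18, 17, 11, 9,
               22, 22, 22, 11, 24, 22, 28, 28, 28, 22, 2]
HEADS_BEFORE = dict(enumerate(head_column, start=1))
HEADS_AFTER = {**HEADS_BEFORE, 17: 15}   # the "+17" variant: ERCOT -> PUCT

print(is_well_nested(HEADS_BEFORE))  # False
print(is_well_nested(HEADS_AFTER))   # True
```

With ERCOT attached to "viewpoints", the subtree of token 11 contains tokens 11 and 17 while the subtree of token 18 contains tokens 15 and 18, so the two disjoint subtrees interleave (11 < 15 < 17 < 18). The brute-force `interleaves` is written for readability rather than speed, which is fine at sentence length.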
nschneid commented 9 months ago

Based on https://electricityplans.com/texas/puct-public-utility-commission-of-texas/, it appears that ERCOT is an ISO. I suspect this should be bracketed as [PUCT and [ERCOT ISO]], but right now it is a nonsensical tree. Will fix.

nschneid commented 9 months ago

Before: [image]

After: [image]

nikitakit commented 9 months ago

Looks like yours is the right way of parsing the relationship between these Texas utility acronyms. The issue is fixed. Thanks!