danielhers / tupa

Transition-based UCCA Parser
https://danielhers.github.io/tupa
GNU General Public License v3.0
72 stars, 24 forks

UCCA-XML to Oracle Sequence: Exception Instances #69

Closed by ghost 5 years ago

ghost commented 5 years ago

I used a modified version of tupa/test_oracle.py to convert the UCCA input of this competition into oracle action sequences. Most inputs converted successfully, but the conversion failed on the file <root>/dev-xml/UCCA_English-Wiki/705006.xml in their "public" dataset.

The output is:

Traceback (most recent call last):
  File "passage2oracles.py", line 86, in <module>
    produce_oracle(filename)
  File "passage2oracles.py", line 78, in produce_oracle
    for i, action in enumerate(gen_actions(passage)):
  File "passage2oracles.py", line 57, in gen_actions
    action = min(oracle.get_actions(state, actions).values(), key=str)
  File "/Users/yanyang/Spring19/DRL_UCCA/tupa/oracle.py", line 69, in get_actions
    assert actions, self.generate_log(invalid, state)
AssertionError: Oracle found no valid action
stack: [1.1 1.3 1.40        ]
buffer: ["," "would" "not" "play" "any" "of" "his" "older" "," "secular" "works" "," "and" "he" "delivered" "declarations" "of" "his" "faith" "from" "the" "stage" "," "such" "as" ":"]
nodes left: [1.16 1.18 1.56 1.4 1.6 1.13 1.26 1.28 1.14 1.27 1.57 1.25 1.33 1.31 1.20 1.36 1.29 1.32 1.21 1.7 1.15 1.55 1.30 1.58 1.54 1.35 1.19 1.22 1.23 1.5 1.24 1.17]
edges left: [1.16->1.19 1.6->1.13 1.15->1.13 1.23->0.28 1.6->1.14 1.6->1.15 1.16->1.17 1.24->0.29 1.33->0.21 1.13->0.24 1.6->1.16 1.18->0.31 1.21->1.24 1.16->1.18 1.7->0.34 1.15->1.21 1.22->0.27 1.21->1.23 1.7->0.35 1.20->0.26 1.19->0.32 1.32->1.36 1.17->0.30 1.36->0.20 1.21->1.22 1.32->1.55 1.55->0.19 1.29->0.15 1.28->1.32 1.28->1.33 1.15->1.20 1.1->1.58 1.4->1.25 1.26->0.13 1.54->0.11 1.28->1.30 1.1->1.57 1.1->1.5 1.4->1.28 1.1->1.56 1.57->0.33 1.27->0.14 1.56->0.22 1.25->0.12 1.28->1.31 1.14->0.25 1.1->1.4 1.35->0.18 1.1->1.6 1.4->1.26 1.32->1.35 1.58->0.36 1.3->1.40 1.1->1.54 1.30->0.16 1.1->1.7 1.5->0.23 1.28->1.29 1.31->0.17 1.4->1.27]
Actions returned by the oracle:
  RIGHT-REMOTE-A: 1.40 is already 1.3's child

ghost commented 5 years ago

The same failure occurs on <root>/dev-xml/UCCA_English-Wiki/705008.xml. These are the only two sentences in all of the UCCA English Wiki data that cannot be converted.

danielhers commented 5 years ago

We fixed these two issues after the shared task:
https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-Wiki/commit/ed13dee6b4a089dc5755da07f6085902f6c0b4ae#diff-05d5cce54c47b675f644021ad0adc51a
https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-Wiki/commit/61560134ab292e77a7eb337a1952f3e4cd16d8d2#diff-05d5cce54c47b675f644021ad0adc51a

To avoid failures for invalid input passages, use --no-validate-oracle: https://github.com/danielhers/tupa/blob/master/tupa/oracle.py#L63

Let me know if it works.

ghost commented 5 years ago

Deleting the 3 lines works. Thank you!
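
For anyone who prefers to leave tupa/oracle.py unmodified, another option is to catch the AssertionError in the calling script and skip the offending passages. The following is only a minimal sketch. It assumes that produce_oracle is the function from passage2oracles.py shown in the traceback above and that it lets the oracle's AssertionError propagate unchanged. The names convert_all and fake_produce_oracle are hypothetical and are not part of tupa.

```python
from typing import Callable, Iterable, List, Tuple


def convert_all(
    filenames: Iterable[str],
    produce_oracle: Callable[[str], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Run produce_oracle on each file, skipping files the oracle rejects.

    Returns (converted, failed), where failed holds (filename, reason) pairs.
    convert_all is a hypothetical helper, not a tupa function.
    """
    converted: List[str] = []
    failed: List[Tuple[str, str]] = []
    for filename in filenames:
        try:
            produce_oracle(filename)
        except AssertionError as error:
            # Keep only the first line of the message; the rest is the
            # long stack/buffer/edges dump produced by the oracle.
            lines = str(error).splitlines()
            reason = lines[0] if lines else "AssertionError"
            failed.append((filename, reason))
        else:
            converted.append(filename)
    return converted, failed


if __name__ == "__main__":
    # Stand-in for the real produce_oracle, used only to make the
    # sketch runnable without tupa installed (hypothetical).
    def fake_produce_oracle(filename: str) -> None:
        if filename.endswith("705006.xml"):
            raise AssertionError("Oracle found no valid action\nstack: [...]")

    done, skipped = convert_all(["705005.xml", "705006.xml"], fake_produce_oracle)
    print("converted:", done)
    print("skipped:", skipped)
```

Unlike disabling validation, this approach produces no oracle sequence at all for an invalid passage, and it records which files were skipped so they can be checked against the corrected annotations.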