delph-in / pydelphin

Python libraries for DELPH-IN
https://pydelphin.readthedocs.io/
MIT License

Invalid qeqs cause DMRS/EDS conversion to raise KeyError #303

Closed goodmami closed 4 years ago

goodmami commented 4 years ago

Currently a bug in MRS-to-DMRS (or EDS) conversion causes a KeyError when the lo handle of a qeq is not the label of any EP. This sometimes happens when the TOP handle h0 is qeq to h1 but h1 is not the label of anything. In the following example (item 1150 from the ERG's handp12 profile), the qeq from the scopal argument of neg (h5 qeq h24) has a lo handle, h24, that is not the LBL of any EP:

[ TOP: h0
  INDEX: e2 [ e SF: prop TENSE: past MOOD: indicative ]
  RELS: < [ neg<3:9> LBL: h1 ARG0: e4 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: h5 ]
          [ _for_p<10:13> LBL: h1 ARG0: e6 [ e SF: prop TENSE: untensed MOOD: indicative ] ARG1: u7 ARG2: x8 [ x PERS: 1 NUM: sg IND: + PT: std ] ]
          [ pron<14:16> LBL: h9 ARG0: x8 ]
          [ pronoun_q<14:16> LBL: h10 ARG0: x8 RSTR: h11 BODY: h12 ]
          [ loc_nonsp<17:21> LBL: h1 ARG0: i13 ARG1: e2 ARG2: e6 ]
          [ pron<22:24> LBL: h14 ARG0: x15 [ x PERS: 3 NUM: sg GEND: m IND: + PT: std ] ]
          [ pronoun_q<22:24> LBL: h16 ARG0: x15 RSTR: h17 BODY: h18 ]
          [ _make_v_1<25:29> LBL: h1 ARG0: e2 ARG1: x15 ARG2: x19 [ x PERS: 3 NUM: sg PT: pt ] ]
          [ _the_q<30:33> LBL: h20 ARG0: x19 RSTR: h21 BODY: h22 ]
          [ _sacrifice_n_1<34:43> LBL: h23 ARG0: x19 ] >
  HCONS: < h0 qeq h1 h5 qeq h24 h11 qeq h9 h17 qeq h14 h21 qeq h23 >
  ICONS: < e2 focus e6 > ]

These are ill-formed MRSs, and there might not be a reasonable way to recover from such errors, but the error message could be more useful to the user than a bare KeyError. Processes like delphin convert should also be able to detect the error and move on instead of stopping altogether.
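To make the failure condition concrete, here is a minimal sketch in plain Python of the check that would catch it before conversion. It does not use PyDelphin; the helper name `dangling_qeqs` and the tuple representation of HCONS are assumptions made for illustration, with the labels and constraints transcribed by hand from the MRS above.

```python
# Hypothetical helper for illustration; not part of PyDelphin's API.
def dangling_qeqs(labels, hcons):
    """Return the qeqs whose lo handle is not the label of any EP."""
    labels = set(labels)
    return [(hi, rel, lo) for hi, rel, lo in hcons
            if rel == "qeq" and lo not in labels]


# LBL values (one per EP, in order) and HCONS from the example MRS.
labels = ["h1", "h1", "h9", "h10", "h1", "h14", "h16", "h1", "h20", "h23"]
hcons = [("h0", "qeq", "h1"), ("h5", "qeq", "h24"), ("h11", "qeq", "h9"),
         ("h17", "qeq", "h14"), ("h21", "qeq", "h23")]

print(dangling_qeqs(labels, hcons))  # [('h5', 'qeq', 'h24')]
```

A batch process could run a check like this per item and skip or flag the offending ones rather than stopping at the first KeyError.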

goodmami commented 4 years ago

A reasonable solution may be to issue a warning and otherwise ignore the invalid qeq (that is, set DMRS.top to None if it is the TOP qeq, or drop the link if it comes from a scopal argument). This way one might still be able to get something useful out of the (possibly disconnected) DMRS, a conversion process wouldn't terminate midway through, and the user can still be aware of the problem from the warning.
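The warn-and-ignore policy could look roughly like the following sketch. It is plain Python rather than PyDelphin's actual conversion code; the function `resolve_scope`, its arguments, and the returned shape are all hypothetical and only illustrate the idea of warning, dropping the link, and leaving the top as None.

```python
import warnings


# Hypothetical function for illustration; not PyDelphin's implementation.
def resolve_scope(top, labels, hcons):
    """Map hi handles to lo labels, warning about and dropping invalid qeqs.

    Returns (top_label, links); top_label is None if TOP cannot be resolved.
    """
    labels = set(labels)
    links = {}
    for hi, rel, lo in hcons:
        if rel != "qeq":
            continue
        if lo not in labels:
            # Warn instead of raising so batch conversion can continue.
            warnings.warn(
                f"ignoring {hi} qeq {lo}: {lo} is not the label of any EP")
            continue
        links[hi] = lo
    # TOP is either qeq to a label, a label itself, or unresolved (None).
    top_label = links.pop(top, top if top in labels else None)
    return top_label, links


# The case from the report: h0 is qeq to h1, but h1 labels nothing.
top_label, links = resolve_scope("h0", ["h2"], [("h0", "qeq", "h1")])
print(top_label, links)  # None {}
```

Using the standard `warnings` module lets callers decide whether to display, log, or escalate the warning to an error.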