delph-in / erg

English Resource Grammar
MIT License

unknown predicate span over the whole sentence? #10

Closed arademaker closed 3 years ago

arademaker commented 5 years ago

Rocks included in the Mesaverde TPS crop out throughout much of the Uinta-Piceance Province (figure 6).

See the first predicate, unknown<0:103> LBL: h1 ARG0: e2 ARG: x4 [ x PERS: 3 NUM: pl IND: + ]. Is it a bug? The result came from ACE (ace-0.9.30, macOS) with the following command line:

$ ./ace -g erg-2018-osx-0.9.30.dat -n 1 -Tf

$ ./ace -g erg.dat -n 1 -Tf
[ LTOP: h0
INDEX: e2 [ e SF: prop ]
RELS: < [ unknown<0:103> LBL: h1 ARG0: e2 ARG: x4 [ x PERS: 3 NUM: pl IND: + ] ]
 [ udef_q<0:103> LBL: h5 ARG0: x4 RSTR: h6 BODY: h7 ]
 [ _rock_n_1<0:5> LBL: h8 ARG0: x4 ]
 [ _include_v_1<6:14> LBL: h8 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: i10 ARG2: x4 ]
 [ _in_p_state<15:17> LBL: h8 ARG0: e11 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: e9 ARG2: x12 [ x PERS: 3 NUM: sg IND: + ] ]
 [ _the_q<18:21> LBL: h13 ARG0: x12 RSTR: h14 BODY: h15 ]
 [ compound<22:40> LBL: h16 ARG0: e17 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x12 ARG2: x18 [ x PERS: 3 NUM: sg IND: + PT: notpro ] ]
 [ proper_q<22:35> LBL: h19 ARG0: x18 RSTR: h20 BODY: h21 ]
 [ compound<22:35> LBL: h22 ARG0: e23 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x18 ARG2: x24 [ x PERS: 3 NUM: sg IND: + PT: notpro ] ]
 [ proper_q<22:31> LBL: h25 ARG0: x24 RSTR: h26 BODY: h27 ]
 [ named<22:31> LBL: h28 CARG: "Mesaverde" ARG0: x24 ]
 [ named<32:35> LBL: h22 CARG: "TPS" ARG0: x18 ]
 [ _crop_n_1<36:40> LBL: h16 ARG0: x12 ]
 [ _out_p<41:44> LBL: h8 ARG0: e31 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x4 ]
 [ _throughout_p_state<45:55> LBL: h8 ARG0: e32 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: e31 ARG2: x33 [ x PERS: 3 NUM: sg ] ]
 [ part_of<56:60> LBL: h34 ARG0: x33 ARG1: x35 [ x PERS: 3 NUM: sg IND: + ] ]
 [ udef_q<56:60> LBL: h36 ARG0: x33 RSTR: h37 BODY: h38 ]
 [ much-many_a<56:60> LBL: h34 ARG0: e39 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x33 ]
 [ appos<64:103> LBL: h40 ARG0: e41 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x35 ARG2: x42 [ x PERS: 3 NUM: sg ] ]
 [ _the_q<64:67> LBL: h43 ARG0: x35 RSTR: h44 BODY: h45 ]
 [ compound<68:91> LBL: h46 ARG0: e47 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x35 ARG2: x48 [ x PERS: 3 NUM: sg IND: + PT: notpro ] ]
 [ proper_q<68:82> LBL: h49 ARG0: x48 RSTR: h50 BODY: h51 ]
 [ compound<68:82> LBL: h52 ARG0: e53 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x48 ARG2: x54 [ x PERS: 3 NUM: sg IND: + PT: notpro ] ]
 [ proper_q<68:82> LBL: h55 ARG0: x54 RSTR: h56 BODY: h57 ]
 [ named<68:82> LBL: h58 CARG: "Uinta-" ARG0: x54 ]
 [ named<68:82> LBL: h52 CARG: "Piceance" ARG0: x48 ]
 [ named<83:91> LBL: h46 CARG: "Province" ARG0: x35 ]
 [ number_q<92:103> LBL: h62 ARG0: x42 RSTR: h63 BODY: h64 ]
 [ compound<92:103> LBL: h65 ARG0: e66 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x42 ARG2: x67 [ x PERS: 3 NUM: sg PT: notpro ] ]
 [ udef_q<92:99> LBL: h68 ARG0: x67 RSTR: h69 BODY: h70 ]
 [ _figure_n_1<92:99> LBL: h71 ARG0: x67 ]
 [ card<100:103> LBL: h65 CARG: "6" ARG0: x42 ARG1: i73 ] >
HCONS: < h0 qeq h1 h6 qeq h8 h14 qeq h16 h20 qeq h22 h26 qeq h28 h37 qeq h34 h44 qeq h46 h50 qeq h52 h56 qeq h58 h63 qeq h65 h69 qeq h71 >
ICONS: < e9 topic x4 > ]

goodmami commented 5 years ago

I don't think it's a bug, but rather a missing lexical item or maybe just the wrong parse. In the sentence, "crop out" (or maybe just "crop" with "out" as a resultative or something) is the main verb, but the MRS uses _crop_n_1 instead of _crop_v_1 (a _crop_v_out predicate is not defined by the ERG). The unknown predicate therefore stands in for the main verb predicate. Because it is a null item, it has no position of its own and so takes the span of the entire input.

Here's a simpler example, for the noun fragment "The sky". Note that it also has an unknown predicate spanning the whole input.

[ LTOP: h0
  INDEX: e2 [ e SF: prop-or-ques ]
  RELS: <
    [ unknown<0:7> LBL: h1 ARG0: e2 ARG: x4 [ x PERS: 3 NUM: sg ] ]
    [ _the_q<0:3> LBL: h5 ARG0: x4 RSTR: h6 BODY: h7 ]
    [ _sky_n_1<4:7> LBL: h8 ARG0: x4 ] >
  HCONS: < h0 qeq h1 h6 qeq h8 > ICONS: < > ]
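
To check for this pattern mechanically, here is a minimal sketch that scans simple MRS text with a regular expression and tests whether an unknown predicate covers the full character range of the other predicates. It assumes the output looks like the dumps in this thread; the names EP_START, predicate_spans and unknown_covers_input are made up for illustration and are not part of ACE or the ERG. A proper MRS reader would be more robust than a regex.

```python
import re

# Start of each elementary predication (EP) in simple MRS text,
# e.g. "[ unknown<0:7> LBL: h1 ...": predicate name, then <cfrom:cto>.
# Assumption: every EP begins with "[ name<from:to> LBL:", as in this thread.
EP_START = re.compile(r"\[\s*(\S+?)<(\d+):(\d+)>\s+LBL:")


def predicate_spans(mrs_text):
    """Return (predicate, cfrom, cto) for every EP found in the text."""
    return [(m.group(1), int(m.group(2)), int(m.group(3)))
            for m in EP_START.finditer(mrs_text)]


def unknown_covers_input(mrs_text):
    """True if an `unknown` EP spans from the smallest cfrom to the largest cto."""
    spans = predicate_spans(mrs_text)
    if not spans:
        return False
    lo = min(start for _, start, _ in spans)
    hi = max(end for _, _, end in spans)
    return any(pred == "unknown" and (start, end) == (lo, hi)
               for pred, start, end in spans)


# The "The sky" MRS from this comment, copied as-is.
SKY = """[ LTOP: h0
  INDEX: e2 [ e SF: prop-or-ques ]
  RELS: <
    [ unknown<0:7> LBL: h1 ARG0: e2 ARG: x4 [ x PERS: 3 NUM: sg ] ]
    [ _the_q<0:3> LBL: h5 ARG0: x4 RSTR: h6 BODY: h7 ]
    [ _sky_n_1<4:7> LBL: h8 ARG0: x4 ] >
  HCONS: < h0 qeq h1 h6 qeq h8 > ICONS: < > ]"""

print(predicate_spans(SKY))
print(unknown_covers_input(SKY))
```

The variable properties in brackets (such as [ x PERS: 3 NUM: sg ]) are skipped because they have no <from:to> span directly after the opening bracket.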

danflick commented 5 years ago

Mike is right - the lexicon did not yet contain an entry for the verb-particle `crop out', only `crop up' and the transitive `crop' as in |They cropped his hair|. Lacking that verb entry, the parser found an alternative analysis of the input as just an NP fragment, and the resulting MRS for such fragments includes that `unknown' predicate as a placeholder for the (presumably discourse-supplied) predicate of which the NP is an argument. I will add the missing entry to the lexicon to correct the behavior for this example, but you'll likely still see more fragment analyses than you might wish for as you parse interesting new text, with each such analysis introducing this `unknown' predicate.
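
When parsing new text in bulk, one rough way to set such fragment analyses aside for review is to look for the `unknown' predicate in each top-ranked MRS. The sketch below assumes results are already available as (sentence, MRS text) pairs; split_fragments and UNKNOWN_EP are hypothetical names, and the second MRS string is a hand-written placeholder, not real ERG output.

```python
import re

# An EP whose predicate is exactly `unknown`, followed by its <from:to> span.
UNKNOWN_EP = re.compile(r"\[\s*unknown<\d+:\d+>")


def split_fragments(results):
    """Partition (sentence, mrs_text) pairs into (full, fragments) sentence lists.

    A result counts as a fragment analysis if its MRS contains an `unknown` EP.
    """
    full, fragments = [], []
    for sentence, mrs_text in results:
        bucket = fragments if UNKNOWN_EP.search(mrs_text) else full
        bucket.append(sentence)
    return full, fragments


results = [
    # Abbreviated from the "The sky" example earlier in this thread.
    ("The sky", "[ unknown<0:7> LBL: h1 ARG0: e2 ARG: x4 ]"),
    # Hand-written placeholder for a non-fragment parse (not actual ERG output).
    ("It rains", "[ _rain_v_1<3:8> LBL: h1 ARG0: e2 ]"),
]

full, fragments = split_fragments(results)
print("full:", full)
print("fragments:", fragments)
```

Sentences that land in the fragments list are candidates for a missing lexical entry or a wrong parse, as with `crop out' here.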