delph-in / erg

English Resource Grammar
MIT License

*top* used as argument value for unknowns predicted as adjectives #4

Closed: goodmami closed this issue 5 years ago

goodmami commented 6 years ago

When an unknown is predicted to be an adjective, the adjective's ARG1 is given the value *top* instead of, say, a u or i variable.

goodmami@tpy:~/grammars$ ace -g erg-1214-x86-64-0.9.25.dat -Tq <<< "Garbage harbors vermin."
[ LTOP: h0 INDEX: e2 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] RELS: < [ unknown<0:23> LBL: h1 ARG0: e2 ARG: x4 [ x PERS: 3 NUM: pl IND: + ] ]  [ udef_q<0:15> LBL: h5 ARG0: x4 RSTR: h6 BODY: h7 ]  [ compound<0:15> LBL: h8 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x4 ARG2: x10 [ x PERS: 3 NUM: sg IND: - ] ]  [ udef_q<0:7> LBL: h11 ARG0: x10 RSTR: h12 BODY: h13 ]  [ _garbage_n_1<0:7> LBL: h14 ARG0: x10 ]  [ _harbor_n_1<8:15> LBL: h8 ARG0: x4 ]  [ _vermin/JJ_u_unknown<16:23> LBL: h1 ARG0: e15 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: *top* ] > HCONS: < h0 qeq h1 h6 qeq h8 h12 qeq h14 > ]

This appears to be the case with ACE and not with the LKB, so it might be an ACE bug, or perhaps this is another situation where the LKB is making up for a deficiency in the grammar.
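To check other sentences for the same symptom, a small script can scan ACE's one-line MRS output for role values of *top*. The following is only a sketch: it relies on plain regular expressions over the text format shown above, not on a real MRS parser, and the function name `find_top_roles` and the abbreviated `SAMPLE` string are made up for illustration.

```python
import re

# A predication starts with "[ pred<from:to>"; variable property lists such as
# "[ e SF: prop ... ]" do not carry a <from:to> span, so they are not matched.
PRED = re.compile(r"\[ (\S+?)<(\d+):(\d+)>")
# A role whose value is the literal *top*, e.g. "ARG1: *top*".
TOP_ROLE = re.compile(r"\b([A-Z][A-Z0-9-]*): \*top\*")


def find_top_roles(mrs):
    """Return (predicate, role) pairs for every role valued *top*."""
    preds = [(m.start(), m.group(1)) for m in PRED.finditer(mrs)]
    hits = []
    for m in TOP_ROLE.finditer(mrs):
        owner = None
        # The owning predication is the last one that opens before the role.
        for start, name in preds:
            if start < m.start():
                owner = name
            else:
                break
        hits.append((owner, m.group(1)))
    return hits


# Abbreviated from the output above (hypothetical shortening, for illustration).
SAMPLE = (
    "[ LTOP: h0 INDEX: e2 RELS: < [ _harbor_n_1<8:15> LBL: h8 ARG0: x4 ]  "
    "[ _vermin/JJ_u_unknown<16:23> LBL: h1 ARG0: e15 [ e SF: prop ] "
    "ARG1: *top* ] > HCONS: < h0 qeq h1 > ]"
)

print(find_top_roles(SAMPLE))  # [('_vermin/JJ_u_unknown', 'ARG1')]
```

Piping each line of `ace -Tq` output through such a function would flag the affected predications without loading the MRS into any library.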

I found at least 3 sentences in the English side of the Tanaka corpus that caused this problem:

danflick commented 6 years ago

Mike, I found the bug at the source of this misbehavior: when the attribute ARG1 is introduced in the ERG's fundamentals.tdl, its value is wrongly left as *top*, so if nothing ever constrains it further, that's how it will emerge in the parse's feature structure, and the VPM statements also do not force it to be anything else. All of the other role attributes are rightly constrained to be of maximal type `semarg', and ARG1 should be too, both in 1214 and in trunk. I remember why the value was made *top* once upon a time: we were experimenting with making the ARG1 of degree specifiers of quantifiers (as in "nearly every") be a string provided by the quantifier. We repented of this idea, but apparently never went back to constrain ARG1's maximal value to be `semarg'. I also went through the treebanks for the trunk ERG, and did not find any other instances of *top* in MRSs besides the ARG1 ones.

As a patch, one can modify the fundamentals.tdl file in the definition of `basic_arg01_relation' to constrain the value of ARG1 to be `semarg'. It seems unlikely that we will change the 1214 code base, which has now long been frozen, but I have made the correction in the trunk version of the ERG.
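The mechanism can be illustrated with a toy model of typed feature values. This is a sketch under stated assumptions, not the ERG's actual hierarchy or the LKB/ACE unifier: only *top*, semarg and the string experiment come from the comment above, while the subtypes `event` and `entity` and the function names are hypothetical. It does not model the VPM step.

```python
# Toy type hierarchy, child -> parent. Only *top* and semarg are named in the
# thread; "event" and "entity" are hypothetical subtypes for illustration.
PARENT = {
    "semarg": "*top*",
    "string": "*top*",
    "event": "semarg",
    "entity": "semarg",
}


def ancestors(t):
    """Return t followed by all of its supertypes up to the root."""
    chain = [t]
    while t in PARENT:
        t = PARENT[t]
        chain.append(t)
    return chain


def unify(a, b):
    """Return the more specific of two compatible types (tree-shaped hierarchy)."""
    if b in ancestors(a):
        return a
    if a in ancestors(b):
        return b
    raise ValueError(f"{a} and {b} are incompatible")


def resolve(declared, constraints=()):
    """Start from the declared maximal type and apply each further constraint."""
    value = declared
    for c in constraints:
        value = unify(value, c)
    return value


# ARG1 declared as *top*: with no further constraint, *top* surfaces as-is.
print(resolve("*top*"))              # *top*
# The same declaration is what once allowed a string-valued ARG1.
print(resolve("*top*", ["string"]))  # string
# With ARG1 declared as semarg, an unconstrained value stays within semarg.
print(resolve("semarg"))             # semarg
```

In this toy setup, declaring ARG1 as semarg also makes a string value fail to unify, which matches the intent of abandoning the "nearly every" experiment.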

goodmami commented 6 years ago

Thanks, Dan, for investigating and fixing. No worries about changing 1214; I'm already using a modified version of 1214 (with SEM-I additions for JaEn), so I can modify fundamentals.tdl locally and put a note in the appendix of my thesis.

danflick commented 5 years ago

There are now no more *top* values in MRSs for the gold profiles for the 2018 version, so let's hope that now all semantic arguments are at least constrained to "u".