delph-in / erg

English Resource Grammar
MIT License

"That will be all, thank you." generates invalid MRS #50

Closed EricZinda closed 6 months ago

EricZinda commented 6 months ago

In ERG 2020, the phrase "That will be all, thank you." generates 15 parses (below). The MRS in parses 6, 7, and 10 is malformed: there are too few holes for the handles that must be assigned when building a well-formed MRS. I suspect the problem is that implicit_conj should have the same handle as the predications it is joining in ARG1 and ARG2:

Parse 0: implicit_conj(e2,e4,e5), generic_entity(x7), _that_q_dem(x7,h9,h10), _be_v_id(e4,x7,x11), _all_q(x11,h13,h14), generic_entity(x11), pronoun_q(x17,h18,h19), pron(x17), _thank+you_v_1(e5,x17)

Parse 1: implicit_conj(e2,e4,e5), generic_entity(x7), _that_q_dem(x7,h9,h10), _be_v_id(e4,x7,x11), _all_q(x11,h13,h14), generic_entity(x11), pronoun_q(x17,h18,h19), pron(x17), _thank_v_1(e5,x17,x21), pron(x21), pronoun_q(x21,h24,h25)

Parse 2: generic_entity(x3), _that_q_dem(x3,h6,h7), implicit_conj(e2,e8,e9), _be_v_id(e8,x3,x11), _all_q(x11,h13,h14), generic_entity(x11), _thank_v_1(e9,x3,x16), pron(x16), pronoun_q(x16,h19,h20)

Parse 3: generic_entity(x3), _that_q_dem(x3,h6,h7), _be_v_nv(e8,x3,h9), _all_q(x11,h12,h13), generic_entity(x11), _thank_v_1(e15,x11,x16), pron(x16), pronoun_q(x16,h19,h20)

Parse 4: generic_entity(x5), _that_q_dem(x5,h7,h8), _be_v_id(e10,x5,x11), _all_q(x11,h13,h14), generic_entity(x11), pronoun_q(x3,h17,h18), pron(x3), _thank_v_1(e2,x3,x20,h21), pron(x20), pronoun_q(x20,h24,h25)

Parse 5: _be_v_do(e2,x4,h5), _all_a_1(i7,e8), _thank_v_for(e8,i9,x10,i11), pron(x10), pronoun_q(x10,h14,h15)

Parse 6: implicit_conj(e2,e4,e5), _be_v_id(e4,i7,x8), _all_q(x8,h10,h11), generic_entity(x8), _thank_v_1(e5,i7,x13), pron(x13), pronoun_q(x13,h16,h17)
GRAMMAR ERROR: Holes = 5 and Floaters = 6

Parse 7: implicit_conj(e2,e4,e5), _be_v_id(e4,x7,x8), _all_q(x8,h10,h11), generic_entity(x8), _thank_v_for(e5,x7,x13,i14), pron(x13), pronoun_q(x13,h17,h18)
GRAMMAR ERROR: Holes = 5 and Floaters = 6

Parse 8: _be_v_do(e2,x4,h5), _all_a_1(i7,e8), _thank_v_1(e8,i9,x10), pron(x10), pronoun_q(x10,h13,h14)

Parse 9: _be_v_do(e2,x4,h5), _all_a_1(i7,e8), _thank_v_for(e8,i9,x10,i11), pron(x10), pronoun_q(x10,h14,h15)

Parse 10: implicit_conj(e2,e4,e5), _be_v_id(e4,i7,x8), _all_q(x8,h10,h11), generic_entity(x8), _thank_v_1(e5,i7,x13), pron(x13), pronoun_q(x13,h16,h17)
GRAMMAR ERROR: Holes = 5 and Floaters = 6

Parse 11: _be_v_do(e2,x4,h5), _all_a_1(i7,e8), _thank_v_1(e8,i9,x10), pron(x10), pronoun_q(x10,h13,h14)

Parse 12: _be_v_do(e2,x4,h5), udef_q(x7,h8,h9), nominalization(x7,h11), _all_a_1(i13,e14), _thank_v_1(e14,x7,x15), pron(x15), pronoun_q(x15,h18,h19)

Parse 13: _be_v_do(e2,x4,h5), udef_q(x7,h8,h9), nominalization(x7,h11), _all_a_1(i13,e14), _thank_v_1(e14,x7,x15), pron(x15), pronoun_q(x15,h18,h19)

Parse 14: _be_v_nv(e4,i5,h6), _all_q(x8,h9,h10), generic_entity(x8), _thank_v_1(e12,x8,x13), pron(x13), pronoun_q(x13,h16,h17)
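The "Holes = 5" figure reported for parses 6, 7, and 10 can be reproduced from the flat listing alone, assuming a hole is any handle-valued argument (an `h…` variable) plus one more for the top handle. That definition is an assumption about the check, not something the post spells out. The listing omits labels, so the floater count cannot be recovered this way. A minimal sketch follows; `count_holes` is a hypothetical helper, not part of any DELPH-IN tool:

```python
import re

def count_holes(flat_mrs: str) -> int:
    """Count distinct handle-valued arguments in a flat listing, plus one for the top handle."""
    # Grab the argument list of each predication, e.g. "x8,h10,h11".
    arg_lists = re.findall(r"\(([^()]*)\)", flat_mrs)
    variables = [v.strip() for arg_list in arg_lists for v in arg_list.split(",")]
    # Assumption: every argument of the form h<digits> is a hole.
    holes = {v for v in variables if re.fullmatch(r"h\d+", v)}
    # Assumption: the top handle, which the listing does not show, adds one more hole.
    return len(holes) + 1

parse_6 = (
    "implicit_conj(e2,e4,e5), _be_v_id(e4,i7,x8), _all_q(x8,h10,h11), "
    "generic_entity(x8), _thank_v_1(e5,i7,x13), pron(x13), pronoun_q(x13,h16,h17)"
)
print(count_holes(parse_6))  # 5
```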

arademaker commented 6 months ago

I don't know if @danflick is paying attention to the issues here. We also have http://delphinqa.ling.washington.edu but I agree with you that this can be the best place to open issues for questions related to ERG.

@EricZinda,

I suspect the problem is that implicit_conj should have the same handle as the predications it is joining in ARG1 and ARG2

This is the first MRS I got with the trunk version of ERG. I just removed the properties of the variables e8 and e9 to simplify the presentation:

[ LTOP: h0
INDEX: e2 [ e SF: prop TENSE: fut MOOD: indicative PROG: - PERF: - ]
RELS: < [ generic_entity<0:4>   LBL: h4 ARG0: x3 [ x PERS: 3 NUM: sg GEND: n ] ]
 [ _that_q_dem<0:4>     LBL: h5 ARG0: x3 RSTR: h6 BODY: h7 ]
 [ implicit_conj<10:28>     LBL: h1 ARG0: e2 ARG1: e8  ARG2: e9 ]
 [ _be_v_id<10:12>      LBL: h1 ARG0: e8 ARG1: x3 ARG2: x10 [ x PERS: 3 ] ]
 [ _all_q<13:16>        LBL: h11 ARG0: x10 RSTR: h12 BODY: h13 ]
 [ generic_entity<13:16>    LBL: h14 ARG0: x10 ]
 [ _thank_v_1<18:23>        LBL: h1 ARG0: e9 ARG1: x3 ARG2: x15 [ x PERS: 2 IND: + PT: std ] ]
 [ pron<24:27>          LBL: h16 ARG0: x15 ]
 [ pronoun_q<24:27>     LBL: h17 ARG0: x15 RSTR: h18 BODY: h19 ] >
HCONS: < h0 qeq h1 h6 qeq h4 h12 qeq h14 h18 qeq h16 >
ICONS: < > ]

So implicit_conj shares the label h1 with _be_v_id and _thank_v_1!
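The label sharing can also be read off mechanically. The sketch below transcribes the RELS list above by hand into plain tuples (no PyDelphin needed), groups predicates by label, and compares the number of holes with the number of distinct labels. It assumes that the hole/floater check in the original report amounts to comparing those two counts; that is a guess about the check, not something confirmed in this thread.

```python
from collections import defaultdict

# Transcribed from the MRS above: (predicate, label, handle-valued arguments).
RELS = [
    ("generic_entity", "h4", []),
    ("_that_q_dem", "h5", ["h6", "h7"]),
    ("implicit_conj", "h1", []),
    ("_be_v_id", "h1", []),
    ("_all_q", "h11", ["h12", "h13"]),
    ("generic_entity", "h14", []),
    ("_thank_v_1", "h1", []),
    ("pron", "h16", []),
    ("pronoun_q", "h17", ["h18", "h19"]),
]
TOP = "h0"

# Group predicates by the label they carry.
by_label = defaultdict(list)
for pred, label, _ in RELS:
    by_label[label].append(pred)

# Holes: the top handle plus every RSTR/BODY value.
holes = {TOP} | {h for _, _, handle_args in RELS for h in handle_args}
shared = {lbl: preds for lbl, preds in by_label.items() if len(preds) > 1}

print(shared)                     # {'h1': ['implicit_conj', '_be_v_id', '_thank_v_1']}
print(len(holes), len(by_label))  # 7 7
```

Under that assumption, the three predications sharing h1 count as a single label, so the seven labels match the seven holes.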

I got 11 analyses. All of them passed the qeq-constraint checks I implemented in https://github.com/arademaker/delphin/blob/main/Mrs/Resolver.lean#L42-L52, which are based on the LKB code for HCONS validation.
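For readers unfamiliar with this kind of validation, here is a rough illustration of some sanity checks one might run on the HCONS list of the MRS above. It is not the Resolver.lean or LKB logic, which this thread does not reproduce. The three conditions (the left side of a qeq is a hole, the right side is a label, and no hole is constrained twice) are assumptions chosen for illustration, and `qeq_problems` is a hypothetical helper.

```python
# Data transcribed by hand from the MRS above.
TOP = "h0"
LABELS = {"h1", "h4", "h5", "h11", "h14", "h16", "h17"}
ARG_HOLES = {"h6", "h7", "h12", "h13", "h18", "h19"}  # RSTR and BODY values
HCONS = [("h0", "h1"), ("h6", "h4"), ("h12", "h14"), ("h18", "h16")]

def qeq_problems(top, labels, arg_holes, hcons):
    """Return human-readable problems; an empty list means these checks passed."""
    problems = []
    seen_hi = set()
    for hi, lo in hcons:
        # Assumed condition 1: the left side is the top handle or an argument hole.
        if hi != top and hi not in arg_holes:
            problems.append(f"{hi} qeq {lo}: {hi} is not a hole")
        # Assumed condition 2: the right side is the label of some predication.
        if lo not in labels:
            problems.append(f"{hi} qeq {lo}: {lo} is not the label of any predication")
        # Assumed condition 3: each hole is constrained by at most one qeq.
        if hi in seen_hi:
            problems.append(f"{hi} appears on the left of more than one qeq")
        seen_hi.add(hi)
    return problems

print(qeq_problems(TOP, LABELS, ARG_HOLES, HCONS))  # []
```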

For this same first MRS, Utool found 6 possible solutions:

[Image: Utool output for this MRS]

But I haven't checked the other 10 readings with Utool. My code for resolving MRS scopes is still incomplete; I hope to finish it soon. In the meantime, I checked all 11 MRSs using PyDelphin:

>>> from delphin import ace, mrs
>>> response = ace.parse("../erg.dat", 'That will be all, thank you.')
NOTE: parsed 1 / 1 sentences, avg 12048k, time 0.58325s
>>> for r in response.results():
...     m = r.mrs()
...     (mrs.is_well_formed(m), mrs.plausibly_scopes(m))
... 
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)
(True, True)

I suggest you try the ERG trunk version; see the link at https://github.com/delph-in/docs/wiki/ErgTop

arademaker commented 6 months ago

Oh... indeed, your results are still relevant if @danflick wants to fix anything in the ERG 2020 version.

EricZinda commented 6 months ago

Thanks for checking this phrase in the ERG trunk version, @arademaker! I'll have to do a build with the trunk version and see if it works for all my tests. Do you know when the next official build (i.e. the successor to 2020) is targeted for?

arademaker commented 6 months ago

I don't know. I hope @danflick can answer that. By the way, it would also be nice if @danflick could confirm whether any change between version 2020 and the current trunk explains the change in behavior you reported here.

danflick commented 6 months ago

Sorry for the slow response. First, I agree that this is a good forum for reporting issues with the ERG, including bugs and feature requests, and we can hope for better response times in the future. As for the invalid MRSs from what I assume was the 2020 version, I do think, as @arademaker noted, that the grammar bug(s) got fixed in later updates. I am happy to announce that the latest stable version ERG 2023 is available via SVN, and I have belatedly edited the ErgTop page to communicate its availability. I still hope to soon complete the setup on GitHub so that this and future releases can be picked up via this more modern vehicle, and will announce that once it's ready, but please use the SVN channel for now.

arademaker commented 6 months ago

So, I will close this issue. It was solved in the latest release.