delph-in / pydelphin

Python libraries for DELPH-IN
https://pydelphin.readthedocs.io/
MIT License

intrinsic variables #340

Closed arademaker closed 2 years ago

arademaker commented 2 years ago

Given the sentence:

live so as to annul some previous behavior;

Why am I getting _17 as the id for the third predication? This MRS has has_intrinsic_variable_property as false precisely because unknown and _so+as+to_x share the same ARG0.

I just need to confirm whether _17 was created to generate unique identifiers for each predication. Is that the case?

> print(simplemrs.encode(rs[r], indent=True))
[ TOP: h0
  INDEX: e2 [ e SF: prop-or-ques ]
  RELS: < [ unknown<0:43> LBL: h4 ARG: e5 [ e SF: prop ] ARG0: e2 ]
          [ _live_a_1<0:4> LBL: h4 ARG0: e5 ARG1: u6 ]
          [ _so+as+to_x<5:10> LBL: h1 ARG0: e2 ARG1: h7 ARG2: h8 ]
          [ _annul_v_1<14:19> LBL: h9 ARG0: e10 [ e SF: prop-or-ques TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: i11 ARG2: x12 [ x PERS: 3 NUM: sg ] ]
          [ _some_q<20:24> LBL: h13 ARG0: x12 RSTR: h14 BODY: h15 ]
          [ _previous_a_1<25:33> LBL: h16 ARG0: e17 [ e SF: prop TENSE: untensed MOOD: indicative PROG: bool PERF: - ] ARG1: x12 ]
          [ _behavior_n_1<34:42> LBL: h16 ARG0: x12 ] >
  HCONS: < h0 qeq h1 h7 qeq h4 h8 qeq h9 h14 qeq h16 > ]

> d = dmrs.from_mrs(rs[r])
> print([n.id for n in d.nodes])
[10000, 10001, 10002, 10003, 10004, 10005, 10006]

> print([(n.id,n.args) for n in rs[r].predications])
[('e2', {'ARG0': 'e2', 'ARG': 'e5'}), ('e5', {'ARG0': 'e5', 'ARG1': 'u6'}), ('_17', {'ARG0': 'e2', 'ARG1': 'h7', 'ARG2': 'h8'}), ('e10', {'ARG0': 'e10', 'ARG1': 'i11', 'ARG2': 'x12'}), ('q12', {'ARG0': 'x12', 'RSTR': 'h14', 'BODY': 'h15'}), ('e17', {'ARG0': 'e17', 'ARG1': 'x12'}), ('x12', {'ARG0': 'x12'})]

Finally, the documentation says:

Each quantifier should have an ARG0 that is the intrinsic variable of exactly one non-quantifier EP, but this function does not check for that.

Why is the function not checking for that? Is this restriction part of the definition of the intrinsic variable property? Would it be useful functionality to add to the library?

arademaker commented 2 years ago

OK, I found the answer to my first question:

I just need to confirm whether _17 was created to generate unique identifiers for each predication. Is that the case?

I found _uniquify_ids in the Mrs module.

arademaker commented 2 years ago

Is making the unique identifiers a shortcut for transforming an MRS that does not have the intrinsic variable property (IVP) into one that does? I guess it depends on how those ids are used to generate the DMRS, right? After all, the IVP is defined for the transformation to DMRS, right?

goodmami commented 2 years ago

It looks like you found the answer to your first question. The EP.id property is not an intrinsic variable but a unique identifier.

>>> from delphin.codecs import simplemrs
>>> m = simplemrs.decode('''
... [ TOP: h0
...   INDEX: e2 [ e SF: prop-or-ques ]
...   RELS: < [ unknown<0:43> LBL: h4 ARG: e5 [ e SF: prop ] ARG0: e2 ]
...           [ _live_a_1<0:4> LBL: h4 ARG0: e5 ARG1: u6 ]
...           [ _so+as+to_x<5:10> LBL: h1 ARG0: e2 ARG1: h7 ARG2: h8 ]
...           [ _annul_v_1<14:19> LBL: h9 ARG0: e10 [ e SF: prop-or-ques TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: i11 ARG2: x12 [ x PERS: 3 NUM: sg ] ]
...           [ _some_q<20:24> LBL: h13 ARG0: x12 RSTR: h14 BODY: h15 ]
...           [ _previous_a_1<25:33> LBL: h16 ARG0: e17 [ e SF: prop TENSE: untensed MOOD: indicative PROG: bool PERF: - ] ARG1: x12 ]
...           [ _behavior_n_1<34:42> LBL: h16 ARG0: x12 ] >
...   HCONS: < h0 qeq h1 h7 qeq h4 h8 qeq h9 h14 qeq h16 > ]''')
>>> m.predications[2]
<EP object (h1:_so+as+to_x(ARG0 e2, ARG1 h7, ARG2 h8)) at 140326842476736>
>>> m.predications[2].id
'_17'
>>> m.predications[2].iv
'e2'

So, yes, MRSs that don't have unique intrinsic variables on non-quantifier EPs will have EPs with ids like _N. But note that quantifier EP ids are also different from their bound variables, even on MRSs that have the intrinsic variable property:

>>> m.predications[4].id
'q12'
>>> m.predications[4].iv
'x12'
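
As an illustration only, the id pattern shown in this thread can be reproduced with a small pure-Python sketch. This is not pydelphin's code: the function name assign_ids, the (predicate, ARG0, is_quantifier) triples, and the choice to seed the fallback counter with the largest variable number are all assumptions, picked so that the output matches the ids printed above.

```python
import re

def assign_ids(eps):
    """Hypothetical helper: give each EP a unique id.

    eps is a list of (predicate, arg0, is_quantifier) triples.
    """
    # Assumption: fallback ids start at the largest variable number in use.
    next_n = max(
        (int(re.search(r'\d+$', iv).group()) for _, iv, _ in eps),
        default=0,
    )
    seen = set()
    ids = []
    for _, iv, is_q in eps:
        # Quantifiers get a 'q' prefix so they differ from the EP they bind.
        candidate = 'q' + iv[1:] if is_q else iv
        if candidate in seen:
            # The ARG0 is already taken by another EP: fall back to _N.
            candidate = '_{}'.format(next_n)
            next_n += 1
        seen.add(candidate)
        ids.append(candidate)
    return ids

eps = [
    ('unknown', 'e2', False),
    ('_live_a_1', 'e5', False),
    ('_so+as+to_x', 'e2', False),
    ('_annul_v_1', 'e10', False),
    ('_some_q', 'x12', True),
    ('_previous_a_1', 'e17', False),
    ('_behavior_n_1', 'x12', False),
]
ids = assign_ids(eps)
print(ids)
```

Under these assumptions the third EP falls back to _17 because e2 was already claimed by unknown, while the quantifier becomes q12 rather than colliding with _behavior_n_1.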

The purpose of the unique ids is to support mappings in the representation. For instance, MRS.arguments uses the EP ids as the keys:

>>> m.arguments()['_17']
[('ARG1', 'h7'), ('ARG2', 'h8')]
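
To see why such a mapping needs unique keys, here is a minimal sketch in plain Python (no pydelphin calls). The tuples are a hypothetical stand-in for two EPs of the MRS above, and the argument list given for unknown is an assumption about what the mapping would hold for it.

```python
# (id, intrinsic variable, non-ARG0 arguments) for two EPs sharing ARG0 e2
eps = [
    ('e2', 'e2', [('ARG', 'e5')]),                     # unknown
    ('_17', 'e2', [('ARG1', 'h7'), ('ARG2', 'h8')]),   # _so+as+to_x
]

# Keyed by intrinsic variable: the second EP silently overwrites the first.
by_iv = {iv: args for _, iv, args in eps}

# Keyed by unique id: both EPs keep their own entry.
by_id = {ep_id: args for ep_id, _, args in eps}

print(len(by_iv), len(by_id))
```

With the intrinsic variable as the key, the arguments of unknown are lost; with the unique ids, both entries survive.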

Is making the unique identifiers a shortcut for transforming an MRS that does not have the intrinsic variable property (IVP) into one that does? I guess it depends on how those ids are used to generate the DMRS, right? After all, the IVP is defined for the transformation to DMRS, right?

I don't recall exactly, but I think the uniqueness of ids may help in MRS-to-DMRS conversion when it cannot rely on unique intrinsic variables.

Why is the function not checking for that? Is this restriction part of the definition of the intrinsic variable property? Would it be useful functionality to add to the library?

Hmm, I don't recall. Either it was too expensive to compute for relatively little gain, or we're already checking it elsewhere, or something else.
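
For readers wondering what the extra quantifier check might look like, the following is a rough sketch, not an existing pydelphin function. The name check_ivs and the (predicate, ARG0, is_quantifier) triples are hypothetical, and the caller is assumed to know which EPs are quantifiers. It reports two things separately: whether non-quantifier EPs have unique ARG0s, and whether each quantifier's ARG0 is the intrinsic variable of exactly one non-quantifier EP.

```python
from collections import Counter

def check_ivs(eps):
    """Hypothetical check over (predicate, arg0, is_quantifier) triples.

    Returns a pair (unique_ivs, quantifiers_bound).
    """
    # Count how often each ARG0 appears on a non-quantifier EP.
    non_q = Counter(iv for _, iv, is_q in eps if not is_q)
    # Every non-quantifier EP must have an ARG0 no other one shares.
    unique_ivs = all(n == 1 for n in non_q.values())
    # Every quantifier must bind the ARG0 of exactly one non-quantifier EP.
    quantifiers_bound = all(
        non_q[iv] == 1 for _, iv, is_q in eps if is_q
    )
    return unique_ivs, quantifiers_bound

eps = [
    ('unknown', 'e2', False),
    ('_live_a_1', 'e5', False),
    ('_so+as+to_x', 'e2', False),
    ('_annul_v_1', 'e10', False),
    ('_some_q', 'x12', True),
    ('_previous_a_1', 'e17', False),
    ('_behavior_n_1', 'x12', False),
]
result = check_ivs(eps)
print(result)
```

For the MRS in this issue the first value is False, since unknown and _so+as+to_x share e2, while the second is True, since _some_q binds only _behavior_n_1. The cost is a single pass over the EPs, although a real implementation would work on EP objects rather than tuples.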

oepen commented 2 years ago

I don't recall exactly, but I think the uniqueness of ids may help in MRS-to-DMRS conversion when it cannot rely on unique intrinsic variables.

i suspect you may have picked this up from the EDS universe at some point, to allow conversion of MRSs that do not present unique intrinsic variables; see eds-uniq-ids():

http://svn.delph-in.net/lkb/trunk/src/mrs/dependencies.lisp

historically, the EDS design has emphasized robustness more than DMRS.

goodmami commented 2 years ago

@oepen yes, good catch. While for some cases having unique IDs separate from the intrinsic variables was the obvious solution, in the case of conversion to dependency representations I recall taking inspiration from EDS.

goodmami commented 2 years ago

I will close this issue as it looks like the question was answered.