delph-in / pydelphin

Python libraries for DELPH-IN
https://pydelphin.readthedocs.io/
MIT License

Why do EDS nodes have types? #311

Closed arademaker closed 4 years ago

arademaker commented 4 years ago
>>> from delphin import ace
>>> grm = 'erg.dat'
>>> response = ace.parse(grm, 'The dog barks.')
NOTE: parsed 1 / 1 sentences, avg 1733k, time 1.20498s
>>> m = response.result(0).mrs()
>>> from delphin import dmrs, eds
>>> from delphin.codecs import simplemrs, mrx
>>> from delphin.codecs import eds as edsnative
>>> e = eds.from_mrs(m)
>>> [x.type for x in e.nodes]
[None, 'x', 'e']

According to the discussion in https://delphinqa.ling.washington.edu/t/node-identifiers-of-quantifiers-in-eds/528, nodes in EDS are identifiers, and no semantics should be associated with the prefixes (x, e, i, ...) or with the numbers. Right?

oepen commented 4 years ago

i was not party to the discourse thread you reference, but saying that ‘nodes are identifiers’ sounds misleading to me. EDS nodes are abstract, structured entities. when referring to them, e.g. in serialization, we can associate a unique identifier to each node. except for unique reference, there is no content in the identifiers; for three nodes, they could be anything like {‘42’, ‘47’, ‘11’}, {‘X’, ‘Y’, ‘Z’}, or {‘x1’, ‘x2’, ‘e1’}. uniquely substituting identifiers does not change the graph, or anything.
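The point about substituting identifiers can be illustrated with a minimal sketch. It assumes a toy representation (a plain dict, with hypothetical helpers `rename` and `shape`), not pydelphin's EDS class, and it relies on the predicates in this small example being unique:

```python
# Toy graph: identifier -> (predicate, {role: target identifier}).
# Hypothetical stand-in for illustration only, not pydelphin's EDS class.
graph = {
    "_1": ("_the_q", {"BV": "x3"}),
    "x3": ("_dog_n_1", {}),
    "e2": ("_bark_v_1", {"ARG1": "x3"}),
}


def rename(graph, mapping):
    """Return a copy of graph with every identifier replaced via mapping."""
    return {
        mapping[node_id]: (
            pred,
            {role: mapping[tgt] for role, tgt in edges.items()},
        )
        for node_id, (pred, edges) in graph.items()
    }


def shape(graph):
    """Describe the graph without identifiers.

    Returns the sorted predicates and the sorted edges, where each edge is
    (source predicate, role, target predicate). This comparison only works
    here because every predicate in the toy graph is unique.
    """
    preds = sorted(pred for pred, _ in graph.values())
    edges = sorted(
        (pred, role, graph[tgt][0])
        for pred, edges in graph.values()
        for role, tgt in edges.items()
    )
    return preds, edges


# Swap in arbitrary identifiers, as in the {'42', '47', '11'} example.
renamed = rename(graph, {"_1": "42", "x3": "47", "e2": "11"})
print(shape(graph) == shape(renamed))  # prints True
```

The identifiers change, but the predicates and the labeled edges between them do not, which is the sense in which the graph stays the same.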

EDS nodes bear content: (virtually) all are decorated with a predicate (the node label, in MRP terms) and a link (anchor in MRP) into the underlying surface string (typically a contiguous character range for most DELPH-IN grammars; at times called characterization). some nodes carry additional properties, e.g. a ‘type’ distinguishing nodes corresponding to eventualities from ones corresponding to instances; a constant argument (‘CARG’, on named entities); or properties like ‘TENSE’ or ‘NUM’, which spell out node-local morpho-semantic information.
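As a rough sketch of the node content listed above, the following uses a hypothetical `ToyNode` dataclass (an assumption for illustration, not pydelphin's `Node`) filled in with the values for the `_dog_n_1` node of 'The dog barks.':

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ToyNode:
    """Hypothetical container mirroring the node content described above."""

    predicate: str                      # the node label
    span: Tuple[int, int]               # character range into the surface string
    type: Optional[str] = None          # e.g. 'x' (instance) or 'e' (eventuality)
    carg: Optional[str] = None          # constant argument, e.g. on named entities
    properties: Dict[str, str] = field(default_factory=dict)

    def surface(self, text: str) -> str:
        """Return the substring of text that this node is anchored to."""
        start, end = self.span
        return text[start:end]


sentence = "The dog barks."
dog = ToyNode(
    "_dog_n_1",
    (4, 7),
    type="x",
    properties={"PERS": "3", "NUM": "sg", "IND": "+"},
)
print(dog.surface(sentence))  # prints dog
```

Note that none of this content depends on what the node happens to be called; the identifier is kept outside the dataclass on purpose.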

one can inspect the node properties by mousing over the node identifiers in the EDS visualization of the on-line ERG demonstrator. i further would like to assume that the above is consistent with the EDS specification.

goodmami commented 4 years ago

Thanks for the info, @oepen! Note that you don't need to log into the Discourse site to read the linked thread. Alexandre is referring to the identifiers only.

@arademaker, to digest what Stephan said a bit and relate it to the situation that surprised you: you are correct that the node identifiers bear no meaning except in their role as identifiers (for linking nodes, etc.). The type you're seeing does not come from the identifier but from the type property of the node. You can see this if you serialize the EDS:

>>> print(edsnative.encode(e, indent=True))
{e2:
 _1:_the_q<0:3>[BV x3]
 x3:_dog_n_1<4:7>{x PERS 3, NUM sg, IND +}[]
 e2:_bark_v_1<8:14>{e SF prop, TENSE pres, MOOD indicative, PROG -, PERF -}[ARG1 x3]
}

Note the property list on, e.g., x3, which is {x PERS 3, NUM sg, IND +}. The first symbol, x, is the node's type. The quantifier has no property list, so its type is None, not the _ that begins its identifier _1.
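To make the distinction concrete, here is a minimal sketch that reads the type from the property list of a serialized node line and ignores the identifier entirely. The `NODE_LINE` pattern and `node_type` helper are hypothetical and only handle lines shaped like the ones printed above; they are not pydelphin's parser.

```python
import re

# Hypothetical pattern for one node line of the native EDS serialization:
#   identifier:predicate<start:end>{optional property list}[edges]
NODE_LINE = re.compile(
    r"^\s*(?P<id>[^\s:]+):(?P<pred>[^<\s]+)<(?P<start>\d+):(?P<end>\d+)>"
    r"(?:\{(?P<props>[^}]*)\})?"
)


def node_type(line):
    """Return (identifier, type), taking the type from the property list only."""
    m = NODE_LINE.match(line)
    if m is None:
        raise ValueError(f"not a node line: {line!r}")
    props = m.group("props")
    if props is None or not props.strip():
        # No property list at all, so there is no type to report.
        return m.group("id"), None
    # The first whitespace-delimited symbol inside the braces is the type.
    return m.group("id"), props.split()[0]


lines = [
    " _1:_the_q<0:3>[BV x3]",
    " x3:_dog_n_1<4:7>{x PERS 3, NUM sg, IND +}[]",
    " e2:_bark_v_1<8:14>{e SF prop, TENSE pres, MOOD indicative, PROG -, PERF -}[ARG1 x3]",
]
for line in lines:
    print(node_type(line))
# prints:
# ('_1', None)
# ('x3', 'x')
# ('e2', 'e')
```

The types match the [None, 'x', 'e'] from the original session, and they would stay the same if the identifiers were replaced with, say, 42, 47, and 11.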

arademaker commented 4 years ago

Thank you, @oepen, for the detailed explanation and for correcting me. Sorry for the mistake. I was actually talking about the node identifiers and not about the abstract entities, the nodes themselves.

Thank you, @goodmami. It is clear now that the types of variables in MRS are projected onto the EDS nodes as properties.

I need to read the paper describing EDS and the transformation from MRS to EDS.