delph-in / pydelphin

Python libraries for DELPH-IN
https://pydelphin.readthedocs.io/
MIT License

Exporting EDS, MRS and DM from profiles #299

Closed arademaker closed 4 years ago

arademaker commented 4 years ago

Does PyDelphin have functions to export EDS, MRS, and DM from profiles?

goodmami commented 4 years ago
arademaker commented 4 years ago

So I am missing something. Stephan said in http://lists.delph-in.net/archives/developers/2020/003077.html that the Lisp code in the LOGON distribution needs the grammar to export the EDS. But you didn't use the grammar in the above commands, right?

For DM, I will read #122, thank you. For labeled trees do we also need some grammar files, right?

goodmami commented 4 years ago

> the Lisp code in the LOGON distribution needs the grammar to export the EDS

Stephan made use of a number of ERG features to produce his EDSs from MRS, and I think the patterns for these were defined as part of the grammar. It doesn't actually need things like the type hierarchy or lexicon for this, just the list of patterns in lkb/eds.lsp to help compute the intuitive "representative" nodes. For instance, it blocks the (now outdated) discourse predicates like parg_d from becoming a representative.

PyDelphin's MRS-to-EDS conversion does not require a grammar or such a list. I was able to very nearly match the outputs of the LKB code looking only at graph properties and variable types. I resolved most or all differences with the LKB by admitting one grammar-specific feature: the TENSE morphosemantic property. I surveyed the large- and medium-sized grammars (see this message) and found that most grammars do not use the property, but the ERG, Jacy, gg, and SRG do. Since most people just use the ERG, I don't expect this to cause many issues in practice, even if it goes against the first point of PyDelphin's development philosophy.
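As a rough illustration of that grammar-free conversion applied to a profile, here is a minimal sketch. It assumes PyDelphin v1.x is installed and that the profile's `result` file stores its MRSs in the SimpleMRS format. The function name `export_eds` is hypothetical, and the imports are deferred into the function body so the definition loads even where PyDelphin is not available.

```python
def export_eds(profile_path):
    """Yield one native-format EDS string per row of the profile's result file.

    Sketch only: assumes PyDelphin v1.x and SimpleMRS strings in the 'mrs' field.
    """
    # Deferred imports: PyDelphin is only needed when the generator is consumed.
    from delphin import itsdb, eds
    from delphin.codecs import simplemrs
    from delphin.codecs import eds as edscodec

    ts = itsdb.TestSuite(profile_path)
    for row in ts['result']:
        m = simplemrs.decode(row['mrs'])   # parse the stored MRS string
        e = eds.from_mrs(m)                # no grammar files are consulted here
        yield edscodec.encode(e, indent=True)
```

Something like `for s in export_eds('path/to/profile'): print(s)` would then print the converted structures, where the path is a placeholder for a real profile directory.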


DM, however, requires the lexicon and maybe other things for conversion. I'm not sure if PyDelphin's current level of TDL parsing is sufficient for this, but I hope so.

> For labeled trees do we also need some grammar files, right?

Yes, and not just a superficial reading of the grammar, because it relies on unification to determine the appropriate labels. For this reason, PyDelphin always passes --report-labels to ACE when it parses:

https://github.com/delph-in/pydelphin/blob/b94007341dbd0fd35f82c543d45cdab5dc04cc7f/delphin/ace.py#L99-L100

Otherwise, there's not currently a good way of getting labeled trees from derivations with PyDelphin.

arademaker commented 4 years ago

> I don't recall if LOGON's export prints other stuff, like item ids or derivations.

What would be the item ids?

> Stephan made use of a number of ERG features to produce his EDSs from MRS, and I think the patterns for these were defined as part of the grammar. It doesn't actually need things like the type hierarchy or lexicon for this, just the list of patterns in lkb/eds.lsp to help compute the intuitive "representative" nodes. For instance, it blocks the (now outdated) discourse predicates like parg_d from becoming a representative.

I am afraid I didn't understand this paragraph. Sorry, but can you elaborate on it a little bit more?

Regarding the TENSE morphosemantic property, if I understood your comment, the weak point for PyDelphin is that it relies on this property in functions like scope.representatives. Am I right?

goodmami commented 4 years ago

> What would be the item ids?

The i-id fields in the item file.

> can you elaborate on it a little bit more?

The dependency representations EDS and DMRS (and DM by extension) rely on selecting for each scopal argument (which is effectively a hyperedge) one representative EP from the set of EPs in the scope. The LKB code for EDS uses some special patterns for this which may include specific ERG predicate names, etc. PyDelphin's code achieves (very nearly) the same result without the grammar-specific patterns, but it had to compromise on the TENSE property for a class of issues that couldn't be resolved looking at graph properties alone. This means that, in principle, PyDelphin's code can more readily convert to EDS or DMRS for non-ERG grammars and still produce the expected representations. The TENSE property is one of the last comparisons made, so even in grammars that don't have the property, PyDelphin should do something reasonable.
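To make the idea of a representative concrete, here is a simplified sketch in plain Python. It is not PyDelphin's actual implementation: the `EP` class, the `representatives` function, and the tie-break ordering are all hypothetical stand-ins. It only illustrates the two ingredients mentioned above: a graph-property filter (an EP that takes no other EP in its scope as an argument is a candidate) followed by a late TENSE comparison.

```python
from dataclasses import dataclass, field


@dataclass
class EP:
    """Toy elementary predication (hypothetical, for illustration only)."""
    pred: str
    label: str                 # scope handle shared by EPs in the same scope
    iv: str                    # intrinsic variable, e.g. 'e2' or 'x3'
    args: tuple = ()           # variables taken as non-scopal arguments
    props: dict = field(default_factory=dict)


def representatives(eps, label):
    """Return candidate representatives of the scope `label`, best first."""
    scope = [ep for ep in eps if ep.label == label]
    in_scope_ivs = {ep.iv for ep in scope}
    # Graph property: keep EPs that take no other in-scope EP as an argument.
    candidates = [
        ep for ep in scope
        if not any(a in in_scope_ivs and a != ep.iv for a in ep.args)
    ]
    if not candidates:         # e.g. a cycle of arguments; fall back to all EPs
        candidates = scope

    def priority(ep):
        # Illustrative tie-break only: tensed EPs first. Without TENSE,
        # every EP ranks equally and the input order is preserved.
        tensed = ep.props.get('TENSE', 'untensed') != 'untensed'
        return 0 if tensed else 1

    return sorted(candidates, key=priority)


# "very old book": each modifier takes the next EP's variable as its argument.
very_old_book = [
    EP('_very_x_deg', 'h1', 'e1', ('e2',)),
    EP('_old_a_1', 'h1', 'e2', ('x3',)),
    EP('_book_n_of', 'h1', 'x3'),
]
print([ep.pred for ep in representatives(very_old_book, 'h1')])
# → ['_book_n_of']
```

Because the sort is stable and TENSE is consulted only as a tie-break, a grammar that never sets TENSE simply falls back to the order of the EPs, which is one way to read "do something reasonable".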

goodmami commented 4 years ago

Closing as I think the question has been answered. Please follow #122 if you're interested in converting to DM.

arademaker commented 4 years ago

Thank you very much for the explanation.