goodmami / mrs-to-penman

Utilities for converting MRS data to the PENMAN serialization of DMRS
MIT License

update to last Pydelphin #4

Open arademaker opened 1 year ago

arademaker commented 1 year ago

This is related to #2

https://github.com/goodmami/mrs-to-penman/blob/5939e421cd789a11338ad0aa045d9830e49c33be/mrs_to_penman.py#L73-L101

Hi @goodmami, it looks like the code for read_profile can be replaced by:

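# Imports assume the PyDelphin 1.x module layout.
from delphin import itsdb, tsql
from delphin.codecs import simplemrs


def read_profile(f):
    """Yield (i-id, i-input, [mrs, ...]) for each item in the profile at f."""
    p = itsdb.TestSuite(f)
    cur_id, cur_input, mrss = None, None, []

    # Each row holds (i-id, i-input, mrs); an item with several results
    # shows up as several consecutive rows sharing the same i-id.
    for r in tsql.select('i-id i-input mrs', p):
        mrs = simplemrs.decode(r[2])

        # First row: start the first group.
        if cur_id is None:
            cur_id = r[0]
            cur_input = r[1]

        if cur_id == r[0]:
            mrss.append(mrs)
        else:
            # The i-id changed: emit the finished group and start a new one.
            yield (cur_id, cur_input, mrss)
            cur_id, cur_input, mrss = r[0], r[1], [mrs]

    # Emit the last group, if any rows were read.
    if mrss:
        yield (cur_id, cur_input, mrss)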

Does this make sense? I couldn't find any reference to the p-results relation in the current version of the ERG gold profiles.
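The grouping step in read_profile above (collecting consecutive rows that share an i-id) can also be sketched with the standard library's itertools.groupby. This is only an illustration: group_rows is a hypothetical helper, and the rows are plain tuples with placeholder strings standing in for real profile rows and decoded MRS objects.

```python
from itertools import groupby
from operator import itemgetter


def group_rows(rows):
    """Yield (i_id, i_input, mrs_list) for each run of rows sharing an i-id.

    `rows` is any iterable of (i-id, i-input, mrs) tuples. Like the loop in
    read_profile, this only merges *consecutive* rows with the same i-id.
    """
    for i_id, run in groupby(rows, key=itemgetter(0)):
        run = list(run)
        # The input string is taken from the first row of the run.
        yield (i_id, run[0][1], [row[2] for row in run])


# Hypothetical rows: item 10 has two results, item 20 has one.
rows = [
    (10, "It rained.", "mrs-10-a"),
    (10, "It rained.", "mrs-10-b"),
    (20, "Dogs bark.", "mrs-20-a"),
]
grouped = list(group_rows(rows))
```

With this shape, read_profile would only need to decode the MRS strings and pass the rows through the helper; whether that reads better than the explicit loop is a matter of taste.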

goodmami commented 1 year ago

@arademaker thanks, but really this repository is not code that I maintain. It is more a record of what was used for a previous experiment. See also https://github.com/goodmami/mrs-to-penman/issues/3#issuecomment-941161505. I should archive the repo to make that clear.

PyDelphin now has a native Penman codec (actually two, one for DMRS and another for EDS). I suggest you use those for the conversion.

arademaker commented 1 year ago

Indeed, I understood that. This repo could be identified as part of https://github.com/shlurbee/dmrs-text-generation-naacl2019 and, as you said, is just code to reproduce the paper. Still, I would appreciate your advice on which part of the code does what.

I am assuming that, besides reading the profiles and transforming the MRSs into DMRS, the code in this repo also deals with the linearization of the PENMAN representation (Figure 2 of the paper, https://aclanthology.org/N19-1235.pdf). Am I right?

But I couldn't identify the code that deals with the quotations and Wikipedia markup mentioned in the appendix of the paper. Maybe you were just reporting what you know people did when preparing the profiles that are part of WikiWoods?

goodmami commented 1 year ago

> This repo could be identified as part of https://github.com/shlurbee/dmrs-text-generation-naacl2019

It is specified in setup.sh, line 12. The requirements.txt has PyDelphin at v0.6.2.

> I am assuming that, besides reading the profiles and transforming the MRSs into DMRS, the code in this repo also deals with the linearization of the PENMAN representation (Figure 2 of the paper, https://aclanthology.org/N19-1235.pdf). Am I right?

Yes. More recent versions of PyDelphin have support for the conversion to PENMAN, but not in the same way as was done for this experiment.

> But I couldn't identify the code that deals with the quotations and Wikipedia markup mentioned in the appendix of the paper.

I think that is here: https://github.com/shlurbee/dmrs-text-generation-naacl2019/blob/master/preprocessing.py