delph-in / pydmrs

A library for manipulating DMRS structures
MIT License
14 stars 6 forks source link

Undirected EQ links #11

Open guyemerson opened 8 years ago

guyemerson commented 8 years ago

Store undirected EQ links in a separate list/dict. When reading from XML, we have to check which links this applies to.

In the longer term, we might be able to justify a directionality.

Example: the dog whose tail wagged barked There is a an EQ link between bark and dog. (We might say that it is from bark to dog, but we won't do this for now.)

guyemerson commented 8 years ago

The XML format may record an undirected EQ link as having NIL as its rargname.

emm68 commented 8 years ago

Can we return the EQ links only when calling get_links and get_neighbouring _nodes, but not when calling get_in/get_out or get_in_node/get_out_nodes? Right now, especially with the nodes, there is no good way of filtering the ones from EQ links and arguably, they're not really 'in' or 'out'.

AlexKuhnle commented 8 years ago

If so, I would rather include them in both, otherwise things like neighbour nodes etc all get changed. But I think the directionality of EQ links Ann always mentions is something that is probably coming relatively soon, and then it makes sense. Would it work for you to just filter the resulting links accordingly?

emm68 commented 8 years ago

They are in both at the moment. Purely theoretical considerations aside, the problem is mostly with get_in/out_nodes because it's not straightforward there to filter out the ones from EQ links afterwards.

On Fri, 18 Mar 2016, 16:35 Alex Kuhnle, notifications@github.com wrote:

If so, I would rather include them in both, otherwise things like neighbour nodes etc all get changed. But I think the directionality of EQ links Ann always mentions is something that is probably coming relatively soon, and then it makes sense. Would it work for you to just filter the resulting links accordingly?

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/delph-in/pydmrs/issues/11#issuecomment-198442742

emm68 commented 8 years ago

Possibly, could we add a parameter in those functions 'include_EQ' that defaults to True?

On Fri, 18 Mar 2016, 16:38 Ewa Muszyńska, ewka.mus@gmail.com wrote:

They are in both at the moment. Purely theoretical considerations aside, the problem is mostly with get_in/out_nodes because it's not straightforward there to filter out the ones from EQ links afterwards.

On Fri, 18 Mar 2016, 16:35 Alex Kuhnle, notifications@github.com wrote:

If so, I would rather include them in both, otherwise things like neighbour nodes etc all get changed. But I think the directionality of EQ links Ann always mentions is something that is probably coming relatively soon, and then it makes sense. Would it work for you to just filter the resulting links accordingly?

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/delph-in/pydmrs/issues/11#issuecomment-198442742

AlexKuhnle commented 8 years ago

Yes, why not, at least until this directionality thing is sorted out.

guyemerson commented 8 years ago

At the moment, we don't treat them differently from anything else, so they'll only be in both directions if links for both directions have been added.

I wouldn't mind that parameter.

emm68 commented 8 years ago

Problem with EQ links. Link(2, 3, 'None', 'EQ') and Link(3, 2, 'None', 'EQ') are not the same links. Moreover, if we do dmrs.iter_outgoing(2) only the first one would be returned. Alternatively, for dmrs.iter_outgoing(3) - only the second one. This doesn't seem right to me.

emm68 commented 8 years ago

Some options are: 1) add extra checks to allow symmetry and ensure that EQ links are included both in outgoing and incoming 2) exclude EQ links from outgoing and incoming lists 3) force some temporary directionality on EQ links, e.g. from smaller nodeid to larger, until we decide what to do with them properly, and include them in the outgoing/incoming appropraitely using this directionality

AlexKuhnle commented 8 years ago

Since we talked about working towards directed EQ links, at least to a certain degree (so our current state, I think) and potentially somebody already tries to direct them (like PyDelphin does), I would argue against excluding them or treating them differently, and also against changing their direction as compared to how they are inputted (changing it according to node id would be equally arbitrary as leaving it as it was, if there is no principle underlying the changing strategy). However, for my matching code, I so far allow reversely directed EQ links to match - maybe something similar could work for you too?

emm68 commented 8 years ago

It's not for any of my applications, but general concern. I noticed this when writing tests. What you're talking about would be option 1: explicitly checking for EQ links to ensure symmetry during iteration. I also think that for now this seems the most reasonable.

On Mon, 9 May 2016 at 15:11 Alex Kuhnle notifications@github.com wrote:

Since we talked about working towards directed EQ links, at least to a certain degree (so our current state, I think) and potentially somebody already tries to direct them (like PyDelphin does), I would argue against excluding them or treating them differently, and also against changing their direction as compared to how they are inputted (changing it according to node id would be equally arbitrary as leaving it as it was, if there is no principle underlying the changing strategy). However, for my matching code, I so far allow reversely directed EQ links to match - maybe something similar could work for you too?

— You are receiving this because you commented.

Reply to this email directly or view it on GitHub https://github.com/delph-in/pydmrs/issues/11#issuecomment-217875364

AlexKuhnle commented 8 years ago

Oh yes, what I meant is that I would leave it to the application code to do so, instead of integrating it into core.py (not sure whether you meant this). I feel this is the better way to do it, if only to preserve the DAG property.

goodmami commented 8 years ago

Hi, to be accurate for comparison, pyDelphin tries to treat them as undirected in a few ways, but fails in others. Comparing Xmrs objects built with different directions will show they are equal:

>>> from delphin.mrs import dmrx
>>> x1 = dmrx.loads_one(
...   '<dmrs-list>'
...   '<dmrs>'
...   '<node nodeid="10000"><realpred lemma="nearly" pos="x" sense="deg" /><sortinfo/></node>'
...   '<node nodeid="10001"><realpred lemma="all" pos="q" /><sortinfo /></node>'
...   '<node nodeid="10002"><realpred lemma="dog" pos="n" sense="1"/><sortinfo cvarsort="x"/></node>'
...   '<link from="10000" to="10001"><post>EQ</post></link>'
...   '<link from="10001" to="10002"><rargname>RSTR</rargname><post>H</post></link>'
...   '</dmrs>'
...   '</dmrs-list>')
>>> x2 = dmrx.loads_one(
...   '<dmrs-list>'
...   '<dmrs>'
...   '<node nodeid="10000"><realpred lemma="nearly" pos="x" sense="deg" /><sortinfo/></node>'
...   '<node nodeid="10001"><realpred lemma="all" pos="q" /><sortinfo /></node>'
...   '<node nodeid="10002"><realpred lemma="dog" pos="n" sense="1"/><sortinfo cvarsort="x"/></node>'
...   '<link from="10001" to="10000"><post>EQ</post></link>'
...   '<link from="10001" to="10002"><rargname>RSTR</rargname><post>H</post></link>'
...   '</dmrs>'
...   '</dmrs-list>')
>>> print(x1==x2)
True

This is True because DMRX deserialization rebuilds the label equalities, resulting in something closer to original MRS. This means, however, that reserialization to DMRX will not necessarily return the same link direction.

PyDelphin fails to treat undirected links as undirected when comparing delphin.mrs.components.Link objects:

>>> from delphin.mrs.components import Link
>>> Link(1, 2, None, 'EQ') == Link(2, 1, None, 'EQ')
False

There a some other ways that I treat them as equal, such as graph traversals using delphin.mrs.path.walk().

Because DMRX serialization necessarily must choose a link direction (one node must be "from" and the other "to"), pyDelphin tries to pick something intuitive by examining the graph structure, whereas the LKB chooses them by CFROM (or maybe nodeid) order. For some reason I chose to have the link go FROM the headiest one to the less heady heads, but I should have done exactly the opposite in order to be consistent with other modifier constellations. I refrain from changing the code now, though, because Ann says there may soon be some developments on this front.

I have a couple questions:

  1. Do you think that Link objects should compare as equal if they only differ by direction and have no rargname (which I take as the indicator of an undirected link)?
  2. Is it preferable that the serialized direction of undirected links is preserved in round-tripping?
guyemerson commented 8 years ago

In the short term... @Ewa - I think (1) and (2) are sensible options, and we could have parameters to allow this behaviour. (3) is weird theoretically and I don't think it helps practically anyway (it still requires us to check both incoming and outgoing sets). @Mike - (1) makes sense (we're not doing this either). I don't know any use case that requires (2), even if it sounds nice.

In the longer term... Ann is promising directed EQ links, so hopefully we won't have to worry about any of this, anyway.

goodmami commented 8 years ago

Thanks, Guy. I agree that ensuring round-trip symmetry is probably unnecessary.

In the longer term... Ann is promising directed EQ links, so hopefully we won't have to worry about any of this, anyway.

I haven't seen the proposal yet, but I imagine we'll need to maintain some compatibility with current-generation DMRSs. Therefore, whatever behavior is deemed desirable should probably be implemented, as long as it's not too much work (it shouldn't be).