CQCL / text_to_discocirc

Apache License 2.0

n type expansion code is incomplete #30

Open JosephNathaniel opened 1 year ago

JosephNathaniel commented 1 year ago

The sentence "I saw the man you love" breaks the current n-type expansion code.

A possibility is that in the noun phrase "the man you love", the head is the noun "man", which is diagrammatically on the right side of the noun phrase (i.e. it is the direct object), rather than on the left side of the noun phrase (i.e. the subject) as in our previous examples of n-expansion (e.g. "I saw Alice who loves Bob"). As such, all the previous examples somehow just miraculously worked, even ones like "Alice who loves Bob hates Claire", in which you'd expect a swap similar to the one in this example.

Below is the diagram of the term directly obtained from the CCG parse:

image

(Another strange thing in this example is that, for some reason, the n-type wire coming out of "loves" is not labelled with a proper coindex, but if you check the .head of the sub-term corresponding to "the man you love", it will correctly tell you the head is man_4. However, this problem does not seem to affect the n-type expansion algorithm.)

When this term goes into the current n-type expansion code, we get the following:

image

The coindices of the types in this term don't match properly, so if you try to draw the coindices as well by calling expr_add_indices_to_types, it will break.

What we would like the diagram to look like after n-type expansion is probably something like this:

image

but this would also raise problems for s-type expansion, since the order of the n wires would be mismatched between the top and the bottom of the diagram.

JosephNathaniel commented 1 year ago

I understand now why there is the asymmetry, i.e. why "I saw the man you love" breaks but "Alice who likes Bob hates Claire" does not. The latter works because of our convention that "higher arguments on the lambda tree correspond to the right-hand side diagrammatically". See the attached file for more details: n-type expansion asymmetry.pdf

The latter sentence comes out to the following if we put it into the current pipeline:

image

A bit dodgy, but morally correct. The fact that not all the nouns are at the top is probably ok, given we assume that all this stuff happens after pulling out (so the nouns are not inside frames).

JosephNathaniel commented 1 year ago

update: partially fixed in c99292d

I implemented a solution where, given an NP that is to be n-expanded, we always swap the head noun to the left, so that the existing code always works. Note this means wire_index is always 0, and so renders some bits of the existing code redundant.
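The swap can be sketched on plain lists, independently of the repository's term classes. This is only an illustration under stated assumptions: `move_head_left` and the wire labels are hypothetical names, not functions or data from text_to_discocirc, and the NP's noun wires are modelled as a list of strings read left to right.

```python
def move_head_left(wires, head):
    """Return (reordered wires, permutation) with `head` moved to position 0.

    permutation[i] is the original position of the wire now at position i.
    The relative order of the non-head wires is preserved.
    (Hypothetical helper for illustration only.)
    """
    head_pos = wires.index(head)  # raises ValueError if head is absent
    perm = [head_pos] + [i for i in range(len(wires)) if i != head_pos]
    return [wires[i] for i in perm], perm


# Illustrative noun wires for "the man you love", read left to right,
# with the head noun assumed to sit on the right.
wires = ["you", "man"]
swapped, perm = move_head_left(wires, "man")
wire_index = swapped.index("man")  # the head is at position 0 after the swap
```

Because the head always ends up at position 0, any code that branches on a non-zero `wire_index` would no longer be reached, which is the redundancy mentioned above.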

So, in the "I saw the man you love" example, we now obtain the following output:

image

Two issues remain:

  1. When we swap the head noun to the left of the NP, we should probably also do the inverse swap at the top of the NP, so that the ordering of the noun wires remains consistent top-to-bottom.
  2. The overall NP still has no coindex (its type is just n['6_2']), which means that bringing indices into this term/diagram still throws an error.
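For the first issue, a minimal sketch of what "the inverse swap" means, again on plain lists rather than on the repository's terms. The names `apply_perm` and `invert_perm` are hypothetical, and the convention assumed here is that `perm[i]` gives the original position of the wire that ends up at position `i`. This does not address the second issue.

```python
def apply_perm(perm, wires):
    """Reorder wires so that position i holds wires[perm[i]]."""
    return [wires[i] for i in perm]


def invert_perm(perm):
    """Return the permutation that undoes `perm`."""
    inverse = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return inverse


# Illustrative wire labels; the head "c" starts on the right.
original = ["a", "b", "c"]
to_left = [2, 0, 1]                      # moves the head to position 0
bottom = apply_perm(to_left, original)   # order used by the expansion code
top = apply_perm(invert_perm(to_left), bottom)  # order restored at the top
```

Applying the swap at one end of the NP and its inverse at the other leaves the outside ordering of the noun wires unchanged, which is the consistency property item 1 asks for.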