LIGHT on extracted-adj-phrase

olzama commented 3 years ago

The extracted adjunct phrase currently constrains the head daughter to be LIGHT + and leaves LIGHT underspecified on the mother. This is incorrect, probably in part because of the underspecification, but first of all because it prevents, e.g., SOV grammars with obligatory question fronting from licensing even simple question sentences with transitive verbs and extracted adjuncts.

Addressing this issue may affect a number of wh-ques regression tests. The feature being there currently ensures some desired behavior. (But I am not sure this could be pushed down to customization, at least not with the mother remaining underspecified wrt LIGHT?).

emilymbender commented 2 years ago

[ LIGHT + ] on the daughter means that we block adjunct extraction from applying higher in the tree. This is probably to reduce ambiguity (otherwise there would be many possible extraction sites). Leaving it underspecified on the mother strikes me as less problematic: my guess is that most things that check the LIGHT value on a daughter are looking for compatibility with [ LIGHT + ], but it would be worth looking to see if anything wants [ LIGHT - ].

Regarding SOV + extracted adjuncts being blocked ... I assume this is because the adjuncts themselves also require [ COMPS < > ] on the constituent they modify?

I'd guess the first thing to try would be to remove [ LIGHT + ] from the daughter of this rule to see how much it increases ambiguity in existing tests.
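A minimal TDL sketch of that experiment. The rule name extracted-adj-phrase comes from the title of this issue. The supertype name and the exact feature path to LIGHT are assumptions made for illustration and are not copied from matrix.tdl. The two definitions are alternatives; a grammar would load only one of them.

```tdl
; ASSUMPTION: the supertype name and the path HEAD-DTR.SYNSEM.LIGHT are
; illustrative; check them against the actual rule definition.

; (a) Behavior described in this issue: the head daughter must be [ LIGHT + ],
;     and nothing is said about LIGHT on the mother.
extracted-adj-phrase := basic-extracted-adj-phrase &
  [ HEAD-DTR.SYNSEM.LIGHT + ].

; (b) Experiment: the same rule without the daughter constraint, so that
;     extraction can apply at any height in the tree.
; extracted-adj-phrase := basic-extracted-adj-phrase.
```

Comparing the number of readings per item in the existing wh- regression tests under (a) and (b) would show how much ambiguity the constraint is currently suppressing.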

olzama commented 2 years ago

If I take out the constraint, a bunch of tests start failing, not only because of added ambiguity but also because some ungrammatical items start to parse. However, all of these tests are wh- tests (with the caveat that some tests are currently being skipped, for a number of unrelated reasons; there is some documentation of that if you run ./rtest.py --list --skipped --verbose).

So, maybe the only problematic area here is the wh-analyses.

(1) Sentences like "Where dogs sleep?" start parsing, but this is most likely due to an inadequate analysis of the English auxiliary + extraction. So, in this sense, maybe not a very worrisome regression, or at least a special one.

(2) The additional ambiguity that arises looks, in particular, like this (the additional trees are with the red annotation):

The example in (2) is also from English and is a complex sentence, but other languages (which do not have auxiliary inversion) still rely on adjunct extraction for sentences meaning "Where does someone go" and similar (not necessarily complex ones), and so will see this kind of regression too.

emilymbender commented 2 years ago

Yeah, the issue with (1) seems like a separate issue --- and not so much about extraction as about questions. English main clause wh questions should require [ INV + ] except where the wh word is the main clause subject.

For (2), this is roughly the expected outcome, I think, because [ LIGHT + ] was ensuring low application of the rule. But what does the rule or the wh adverb say about the valence features of the head-daughter/modified constituent? Is there any reason not to have where and when set up to modify, say, specifically S? That is, force high extraction?

olzama commented 2 years ago

Is there any reason not to have where and when set up to modify, say, specifically S? That is, force high extraction?

If I remember right, this will break multiple fronting, but in principle we could then customize this separately.

So, you are suggesting to add [ SYNSEM.LOCAL.CAT.HEAD.MOD < [ LOCAL.CAT.VAL [ SUBJ < >, COMPS < >] ] > ] to the adverb?
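Spelled out as a lexical type, that constraint might look roughly like the sketch below. The type names wh-adverb-lex and adverb-lex are hypothetical placeholders; only the MOD constraint itself is taken from the line above.

```tdl
; ASSUMPTION: wh-adverb-lex and adverb-lex are made-up names for illustration.
; The MOD value requires the modified constituent to have empty SUBJ and
; COMPS lists, i.e. to be a saturated S, which forces high attachment.
wh-adverb-lex := adverb-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.MOD < [ LOCAL.CAT.VAL [ SUBJ < >,
                                                  COMPS < > ] ] > ].
```

Because the constraint sits on the adverb and not on the extraction rule, it could in principle be switched on or off per grammar in customization, which is relevant to the multiple-fronting concern.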

emilymbender commented 2 years ago

That's the idea -- I'd be curious to see the result. What is the current constraint on the VAL of the MOD of the adverbs though? Perhaps the other direction is to keep [ LIGHT + ] on the rule but then make a version of the adverb that is only for head-filler use and doesn't constrain its MOD's VAL?

olzama commented 2 years ago

Here's a simpler example from Russian, the glosses are where Ivan goes (meaning, "Where does Ivan go?"):

So, the adverb gets an S either way... That is actually the case with the English examples above as well; it's just that that example is confusing, I think (sorry).

olzama commented 2 years ago

What is the current constraint on the VAL of the MOD of the adverbs though?

I don't see any, actually.

olzama commented 2 years ago

make a version of the adverb that is only for head-filler use

Meaning, it modifies the... mother of the head-filler? (There is something like that in the ERG, but something very special, I think, having to do with the anti-synsem thing, or something?)

Because if you meant, it can only be a daughter of the head-filler, then that won't help because it is a daughter of the head-filler in all these examples.

I am so rusty with all this :D

emilymbender commented 2 years ago

-- Can only be daughter of head filler, but on this branch, we're talking about putting [ LIGHT + ] back on.

-- If the adverb doesn't care about the VAL of its MOD, I don't actually understand why [ LIGHT + ] was causing problems. Can you (or @lizcconrad !) look into the causes of parse failure for the relevant example in the original system? [ LIGHT + ] should be insisting on low application of extracted-adj, but something else is blocking that low application, and I think we need to know what that something else is.

olzama commented 2 years ago

From page 354 of my dissertation:

olzama commented 2 years ago

I commented in the other bug: https://github.com/delph-in/matrix/issues/591

delph-in / matrix

LIGHT on extracted-adj-phrase #590