opencog / link-grammar

The CMU Link Grammar natural language parser
GNU Lesser General Public License v2.1

phantom infinitives #1234

Open linas opened 3 years ago

linas commented 3 years ago

Failures to parse:

I will, provided he goes to the store.
I'm going to, provided they have ice cream
I'm going to, if she goes first.
I'm going to, after he leaves.

Works:

I'm going to walk there, provided they have ice cream

Reported by Stephen Frechette 14 June 2021 via email

linas commented 3 years ago

More examples:

I will, if he goes to the store
I will, after he goes to the store
I will, when he goes to the store

This works:

I will stop provided he does too.

Diagnosis: these are all phantom-infinitive sentences of the form "I will [do something], if/when/after/provided he does", where the bracketed verb is left unsaid. This is a special case of the long-standing issue #224.

More examples:

promises, hopes, vows:

I'm going to, after he leaves
I'm hoping to, after he leaves
I hope to, after he leaves
He's sure to, after she leaves
He's bound to, after she leaves
He vowed to, after she leaves
He promised to, after she leaves
he can't wait to, but only after she leaves

will/must/might/could/should/would:

He must, but only after the ceremony
He might, but only after the ceremony
He could, but only after the ceremony
He should, but only after the ceremony

stephenfrechette commented 3 years ago

There seem to me to be two cases where this happens: with hanging auxiliary verbs, or with hanging "to" particles.
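As a rough illustration of those two cases, here is a minimal Python sketch that looks at the word immediately before the first comma and reports whether it is a hanging "to" or a hanging auxiliary. The function name `classify_hanging` and the `MODALS` word list are assumptions made for this example; nothing here is taken from the LG dictionaries or tokenizer.

```python
import re

# Assumed word list, for illustration only; not taken from the LG dictionaries.
MODALS = {"will", "would", "must", "might", "may", "can", "could", "should", "shall"}


def classify_hanging(sentence):
    """Return 'to', 'auxiliary', or None for the word right before the first comma."""
    head, sep, _ = sentence.partition(",")
    if not sep:
        return None  # no comma, so there is no clause boundary to inspect
    words = re.findall(r"[A-Za-z']+", head.lower())
    if not words:
        return None
    last = words[-1]
    if last == "to":
        return "to"  # hanging "to" particle: "I'm going to, after he leaves."
    if last in MODALS:
        return "auxiliary"  # hanging auxiliary: "I will, if he goes to the store"
    return None


if __name__ == "__main__":
    for s in ["I will, provided he goes to the store.",
              "I'm going to, after he leaves.",
              "I'm going to walk there, provided they have ice cream"]:
        print(classify_hanging(s), "<-", s)
```

This only checks surface word order, so it would mislabel sentences where "to" or a modal happens to precede a comma for other reasons; it is meant to make the two categories concrete, not to serve as a detector.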

linas commented 3 years ago

There are two possible fixes. The first is to invent a new phantom-word mechanism, that would result in a parse such as this:

                                        +------------Xc------------+
    +------->WV------------>+----MVs----+----CV-->+                |
    +->Wd--+-Sp*i+-----I----+       +-Xd+-Cs+--Ss-+--MVa-+         |
    |      |     |          |       |   |   |     |      |         |
LEFT-WALL I.p will.v [context-verb] , if.r she goes.v first.a RIGHT-WALL

where [context-verb] or [phantom infinitive] is inserted during tokenization and used explicitly in the parse. Doing this would solve both this issue and the problems outlined in #224. However, LG does not currently do anything like this, and adding such alternatives would require a large change to the tokenizer.
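To make the idea of a tokenization alternative concrete, here is a minimal Python sketch, assuming a pre-split token list. It returns the original tokens plus, when a trigger word is directly followed by a comma, a second token list with a `[context-verb]` placeholder inserted. The function `token_alternatives` and the `TRIGGERS` set are hypothetical; this is not how the LG tokenizer works today, and a real implementation would have to live in the tokenizer's own alternatives machinery.

```python
PHANTOM = "[context-verb]"

# Assumed trigger words, for illustration only; not drawn from the LG dictionaries.
TRIGGERS = {"to", "will", "would", "must", "might", "may", "can", "could", "should", "shall"}


def token_alternatives(tokens):
    """Return the original token list plus, where applicable, one with a phantom inserted."""
    alternatives = [list(tokens)]
    for i in range(len(tokens) - 1):
        # A trigger word directly followed by a comma is treated as "hanging".
        if tokens[i].lower() in TRIGGERS and tokens[i + 1] == ",":
            with_phantom = tokens[: i + 1] + [PHANTOM] + tokens[i + 1:]
            alternatives.append(with_phantom)
            break  # this sketch only handles the first hanging word
    return alternatives


if __name__ == "__main__":
    for alt in token_alternatives(["I", "will", ",", "if", "she", "goes", "first"]):
        print(" ".join(alt))
```

Keeping the unmodified token list as the first alternative means the parser could still try the sentence without the phantom word, so sentences that already parse would not be affected.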

The other possibility is to have

                         +------------Xc------------+
    +---->WV---->+--XXX--+----CV-->+                |
    +->Wd--+-Sp*i+   +-Xd+-Cs+--Ss-+--MVa-+         |
    |      |     |   |   |   |     |      |         |
LEFT-WALL I.p will.v , if.r she goes.v first.a RIGHT-WALL

but it is not clear what XXX should be. Do we need to invent a new link type? Can some existing link type be pressed into service for this task?

Inventing a new link type (or recycling an existing one) is much easier than redesigning the parser. On the other hand, dealing with phantom words by making the implicit reference explicit seems like a better way of handling semantics.

stephenfrechette commented 3 years ago

What about the option of having a new variant of the "S" linkage in the case of auxiliaries, and a new variant of the "TO" linkage in the case of hanging "to" particles?

linas commented 3 years ago

S connects subjects to verbs. I can't use S for XXX in the diagram above.

stephenfrechette commented 3 years ago

In my suggestion, in the example above, the MVs connection would link to the "will" (basically, the auxiliary becomes the main verb). I do not know whether you need any XXX connection for the construction with "to".

linas commented 3 years ago

Yes, perhaps XXX could be some variant of the MV link. The general discussion of what to do about phantom words is taking place in #224.