CCG categories with coindexed subparts

nschneid commented 5 years ago

E.g. control verbs should coindex the subject NP of the complement with the subject of the matrix verb: something like (S\NP.1)/(S[inf]\NP.1) for "want" and (S[inf]\NP.1)/(S[b]\NP.1) for "to" in "I want to eat cake". Is it possible to obtain this information, perhaps using GraphParser?

If so, then we could tell annotators not to align anything coindexed with part of an outer argument: in the example, the semantic relation corresponding to the expected subject of the embedded verb would account for the NP.1 argument, so "want" would not align to any of the semantic args of "eat".
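To make the notation concrete, here is a minimal sketch of how the coindexed subparts could be read off a category string. It assumes indices are written as `.N` suffixes on atoms, as in the examples above; the function name `coindexed_groups` is made up for illustration and is not part of GraphParser.

```python
import re
from collections import defaultdict

# An atom such as NP, S[inf], or NP[expl], followed by ".<index>".
INDEXED_ATOM = re.compile(r'([A-Za-z]+(?:\[[a-z]+\])?)\.(\d+)')

def coindexed_groups(category):
    """Map each index that occurs more than once to the atoms carrying it."""
    groups = defaultdict(list)
    for atom, index in INDEXED_ATOM.findall(category):
        groups[int(index)].append(atom)
    return {i: atoms for i, atoms in groups.items() if len(atoms) > 1}

want = r'(S\NP.1)/(S[inf]\NP.1)'
print(coindexed_groups(want))  # {1: ['NP', 'NP']}
```

Any index that shows up in both the complement and the outer argument marks a slot that annotators would leave unaligned under the proposal above.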

ablodge commented 5 years ago

Look in https://github.com/sivareddyg/graph-parser/blob/master/src/in/sivareddy/graphparser/ccg/CategoryIndex.java

ablodge commented 5 years ago

From: Reddy, S., Lapata, M., & Steedman, M. (2014). Large-scale semantic parsing without question-answer pairs. Transactions of the Association for Computational Linguistics, 2(1), 377-392.

ablodge commented 5 years ago

In Python:

import re

def add_indices(word, pos):
  if re.match('(VB.*|IN|TO|POS)', pos):
    return r'S\NP.1/NP.2'
  elif re.match('(NN|NNS)', pos):
    return 'NP'
  elif re.match('(NNP.*|PRP.*)', pos):
    return 'NP'
  elif word in ['not', "n't"]:  # checked before RB, which would otherwise shadow it
    return r'(S.1\NP.1)/(S.1\NP.1)'
  elif re.match('RB.*', pos):
    return r'S.1\S.1'
  elif re.match('JJ.*', pos):
    return 'NP.1/NP.1'
  elif re.match('(be|is|was|were|am|are)', word):
    return r'S\NP.1/NP.2'
  elif word == 'the':
    return 'NP/N'
  elif pos == 'CD':
    return 'N.1/N.1'
  elif word == 'no':
    return '(NP/N)'
  elif re.match('(WDT|WP.*|WRB)', pos):
    # Relative-clause alternative: (NP.1\NP.1)/(S[dcl]\NP.1)
    return r'S[wq].1/(S[dcl].1\NP)'

ablodge commented 5 years ago

Most cases can be handled by looking for modifiers: quickly : (S.1\NP.2)/(S.1\NP.2)
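A rough sketch of the modifier heuristic, assuming a modifier is any category of the form X/X or X\X (result and argument are the same string) and that indices are assigned to the atoms of X from left to right. The helper names are hypothetical.

```python
import itertools
import re

ATOM = re.compile(r'[A-Z]+(?:\[[a-z]+\])?')

def split_top_level(category):
    """Split at the rightmost slash outside parentheses: (result, slash, argument)."""
    depth = 0
    for i in range(len(category) - 1, -1, -1):
        ch = category[i]
        if ch == ')':
            depth += 1
        elif ch == '(':
            depth -= 1
        elif ch in '/\\' and depth == 0:
            return category[:i], ch, category[i + 1:]
    return None  # atomic category, nothing to split

def index_modifier(category):
    """Return an indexed X/X category, or None if this is not a modifier."""
    parts = split_top_level(category)
    if parts is None:
        return None
    result, slash, argument = parts
    if result != argument:
        return None
    counter = itertools.count(1)
    indexed = ATOM.sub(lambda m: f'{m.group(0)}.{next(counter)}', result)
    return indexed + slash + indexed

print(index_modifier(r'(S\NP)/(S\NP)'))  # (S.1\NP.2)/(S.1\NP.2)
```

Transitive verbs such as `S\NP/NP` fall through and return `None`, so they still need their own rule.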

Here are some difficult cases for (object/subject) control:

a. "Hasina wants the military to arrest real criminals..."
   want : (S[dcl].1\NP.2)/(S[to].1\NP.3)/NP.3
similar to
b. "Hasina persuaded the military to arrest real criminals..."
   persuade : (S[dcl].1\NP.2)/(S[to].1\NP.3)/NP.3
vs.
c. "Hasina wants to arrest real criminals..."
   want : (S[dcl].1\NP.2)/(S[to].1\NP.2)
similar to
d. "Hasina promised John to arrest real criminals..."
   promised : (S[dcl].1\NP.2)/(S[to].1\NP.2)/NP.3

Also modals:

e. "The ruling authorities actually should pay attention"
   should : (S[dcl].1\NP.2)/(S[b].1\NP.2)
Same pattern of control as (c).
   (r / recommend-01 :ARG1 (p / pay-01 :ARG0 a) :ARG2 (a / authority))

There isn't an obvious way to distinguish (d) from (a) or (b). So, I will need a list of English control verbs and modals.
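One way the verb list could be used, as a sketch only: the set below is a hypothetical placeholder containing just "promise" (the case discussed above), and the function name is made up. Everything not on the list defaults to object control when an NP object is present.

```python
# Hypothetical lexicon: verbs that take an NP object but whose subject
# still controls the infinitival complement.
SUBJECT_CONTROL_WITH_OBJECT = {'promise'}

def control_category(lemma, has_object):
    """Pick the indexed category for a verb taking a to-infinitive."""
    if not has_object:
        # (c) "wants to arrest": matrix subject is the embedded subject.
        return r'(S[dcl].1\NP.2)/(S[to].1\NP.2)'
    if lemma in SUBJECT_CONTROL_WITH_OBJECT:
        # (d) "promised John to arrest": subject control despite the object.
        return r'(S[dcl].1\NP.2)/(S[to].1\NP.2)/NP.3'
    # (a), (b) "wants/persuaded the military to arrest": object control.
    return r'(S[dcl].1\NP.2)/(S[to].1\NP.3)/NP.3'
```

For example, `control_category('persuade', True)` gives the object-control pattern from (b), while `control_category('promise', True)` gives the pattern from (d).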

nschneid commented 5 years ago

Another thought: have you looked at Boxer? EasyCCG apparently gives "Boxer compatible Prolog output", whatever that means.

On Sat, Nov 10, 2018, 7:34 PM ablodge <notifications@github.com> wrote:

Most cases can be handled by looking for modifiers: quickly : (S.1\NP.2)/(S.1\NP.2)

Here are some difficult cases for (object/subject) control:

"Hasina wants the military to arrest real criminals..."
want : (S[dcl].1\NP.2)/(S[to].1\NP.3)/NP.3
similar to
"Hasina persuaded the military to arrest real criminals..."
persuade : (S[dcl].1\NP.2)/(S[to].1\NP.3)/NP.3
vs.
"Hasina wants to arrest real criminals..."
want : (S[dcl].1\NP.2)/(S[to].1\NP.2)
similar to
"Hasina promised to everyone to arrest real criminals..."
promised : (S[dcl].1\NP.2)/(S[to].1\NP.2)/PP


ablodge commented 5 years ago

I'll take a look. Just for reference, the attested modal/control verbs in the data are below. All of the following take a VP as an argument and output a VP with different features. Auxiliaries, modals, and control verbs all follow this pattern.

S[adj].1\NP.2/(S[to].1\NP.2) : ready JJ
S[adj].1\NP.2/(S[to].1\NP.2) : sure JJ
S[adj].1\NP.2/(S[to].1\NP.2)/(S[adj]\NP) : too RB
S[b].1\NP.2/(S[adj].1\NP.2) : be VB
S[b].1\NP.2/(S[adj].1\NP.2) : become VB
S[b].1\NP.2/(S[adj].1\NP.2) : go VB
S[b].1\NP.2/(S[adj].1\NP.2)/NP : make VB
S[b].1\NP.2/(S[pss].1\NP.2) : be VB
S[b].1\NP.2/(S[pt].1\NP.2) : have VB
S[b].1\NP.2/(S[to].1\NP.2) : claim VB
S[b].1\NP.2/(S[to].1\NP.2) : dare VBP
S[b].1\NP.2/(S[to].1\NP.2) : have VB
S[b].1\NP.2/(S[to].1\NP.2) : try VB
S[b].1\NP.2/(S[to].1\NP.2)/NP : require VB
S[b].1\NP.2/(S[to].1\NP.2)/NP : want VB
S[dcl].1\NP.2/(S[adj].1\NP.2) : 's POS
S[dcl].1\NP.2/(S[adj].1\NP.2) : 's VBZ
S[dcl].1\NP.2/(S[adj].1\NP.2) : are VBP
S[dcl].1\NP.2/(S[adj].1\NP.2) : came VBD
S[dcl].1\NP.2/(S[adj].1\NP.2) : got VBD
S[dcl].1\NP.2/(S[adj].1\NP.2) : is VBZ
S[dcl].1\NP.2/(S[adj].1\NP.2) : was VBD
S[dcl].1\NP.2/(S[b].1\NP.2) : 'd MD
S[dcl].1\NP.2/(S[b].1\NP.2) : 'll MD
S[dcl].1\NP.2/(S[b].1\NP.2) : can MD
S[dcl].1\NP.2/(S[b].1\NP.2) : could MD
S[dcl].1\NP.2/(S[b].1\NP.2) : did VBD
S[dcl].1\NP.2/(S[b].1\NP.2) : do VBP
S[dcl].1\NP.2/(S[b].1\NP.2) : might MD
S[dcl].1\NP.2/(S[b].1\NP.2) : must MD
S[dcl].1\NP.2/(S[b].1\NP.2) : shall MD
S[dcl].1\NP.2/(S[b].1\NP.2) : should MD
S[dcl].1\NP.2/(S[b].1\NP.2) : will MD
S[dcl].1\NP.2/(S[b].1\NP.2) : would MD
S[dcl].1\NP.2/(S[b].1\NP.2)/NP : can MD
S[dcl].1\NP.2/(S[b].1\NP.2)/NP : do VBP
S[dcl].1\NP.2/(S[b].1\NP.2)/NP : let VB
S[dcl].1\NP.2/(S[b].1\NP.2)/NP : made VBD
S[dcl].1\NP.2/(S[b].1\NP.2)/NP : make VBP
S[dcl].1\NP.2/(S[ng].1\NP.2) : 'm VBP
S[dcl].1\NP.2/(S[ng].1\NP.2) : 's VBZ
S[dcl].1\NP.2/(S[ng].1\NP.2) : are VBP
S[dcl].1\NP.2/(S[ng].1\NP.2) : began VBD
S[dcl].1\NP.2/(S[ng].1\NP.2) : is VBZ
S[dcl].1\NP.2/(S[ng].1\NP.2) : stopped VBD
S[dcl].1\NP.2/(S[ng].1\NP.2) : was VBD
S[dcl].1\NP.2/(S[ng].1\NP.2) : were VBD
S[dcl].1\NP.2/(S[ng].1\NP.2)/NP : Am VBP
S[dcl].1\NP.2/(S[ng].1\NP.2)/NP : are VBP
S[dcl].1\NP.2/(S[pss].1\NP.2) : am VBP
S[dcl].1\NP.2/(S[pss].1\NP.2) : are VBP
S[dcl].1\NP.2/(S[pss].1\NP.2) : be VB
S[dcl].1\NP.2/(S[pss].1\NP.2) : is VBZ
S[dcl].1\NP.2/(S[pss].1\NP.2) : was VBD
S[dcl].1\NP.2/(S[pss].1\NP.2) : were VBD
S[dcl].1\NP.2/(S[pt].1\NP.2) : felt VBD
S[dcl].1\NP.2/(S[pt].1\NP.2) : had VBD
S[dcl].1\NP.2/(S[pt].1\NP.2) : has VBZ
S[dcl].1\NP.2/(S[pt].1\NP.2) : have VBP
S[dcl].1\NP.2/(S[to].1\NP.2) : appear VBP
S[dcl].1\NP.2/(S[to].1\NP.2) : continues VBZ
S[dcl].1\NP.2/(S[to].1\NP.2) : decided VBD
S[dcl].1\NP.2/(S[to].1\NP.2) : have VBP
S[dcl].1\NP.2/(S[to].1\NP.2) : needs VBZ
S[dcl].1\NP.2/(S[to].1\NP.2) : seemed VBD
S[dcl].1\NP.2/(S[to].1\NP.2) : want VBP
S[dcl].1\NP.2/(S[to].1\NP.2) : was VBD
S[dcl].1\NP.2/(S[to].1\NP.2) : wish VBP
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : cause VBP
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : caused VBD
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : costs VBZ
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : emailed VBD
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : have VBP
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : helps VBZ
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : wants VBZ
S[dcl].1\NP[expl].2/(S[to].1\NP.2)/(S[adj]\NP) : was VBD
S[dcl].1\NP[thr].2/(S[pt].1\NP.2) : has VBZ
S[ng].1\NP.2/(S[pss].1\NP.2) : being VBG
S[ng].1\NP.2/(S[to].1\NP.2) : going VBG
S[ng].1\NP.2/(S[to].1\NP.2) : trying VBG
S[ng].1\NP.2/(S[to].1\NP.2)/NP : for IN
S[pss].1\NP.2/(S[to].1\NP.2) : invented VBN
S[pt].1\NP.2/(S[adj].1\NP.2) : become VBN
S[pt].1\NP.2/(S[adj].1\NP.2) : been VBN
S[pt].1\NP.2/(S[ng].1\NP.2) : been VBN
S[pt].1\NP.2/(S[pss].1\NP.2) : been VBN
S[to].1\NP.2/(S[b].1\NP.2) : to TO
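A small sketch of how a listing in this `category : word POS` format could be loaded and filtered. The four sample lines are copied from the list above; the function names are hypothetical. Entries whose category ends in `/NP` are the ones that take an extra NP object, which are the candidates for object control.

```python
SAMPLE = r"""
S[b].1\NP.2/(S[to].1\NP.2) : try VB
S[dcl].1\NP.2/(S[b].1\NP.2) : should MD
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : wants VBZ
S[dcl].1\NP.2/(S[to].1\NP.2)/NP : costs VBZ
"""

def parse_entries(text):
    """Turn 'category : word POS' lines into (category, word, pos) tuples."""
    entries = []
    for line in text.strip().splitlines():
        category, _, rest = line.rpartition(' : ')
        word, pos = rest.split()
        entries.append((category, word, pos))
    return entries

def takes_np_object(entries):
    """Words whose category has a trailing /NP argument."""
    return [word for category, word, _ in entries if category.endswith('/NP')]

print(takes_np_object(parse_entries(SAMPLE)))  # ['wants', 'costs']
```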

ablodge commented 5 years ago

Boxer is here

ablodge commented 5 years ago

I think a sensible rule is:

S[?]\NP/(S[?]\NP) should have indices S[?].1\NP.2/(S[?].1\NP.2). From what I can see, this handles auxiliaries, raising, and subject control the way that it should.
S[?]\NP/(S[?]\NP)/NP should have indices S[?].1\NP.2/(S[?].1\NP.3)/NP.3. This will handle object control. I think cases like 'promise' in (d) above will be rare enough not to worry about. (Noticeable exception: "costs" S[dcl].1\NP.2/(S[to].1\NP.3)/NP.4, as in "The pill costs 5 cents to produce.")
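The two patterns could be applied to unindexed categories roughly as follows. This is a sketch under the assumption that categories are written without outer parentheses, exactly as in the rule; `add_control_indices` is a made-up name, and exceptions such as "promise" and "costs" are not handled.

```python
import re

FEATURE = r'(?:\[[a-z]+\])?'  # optional feature such as [dcl] or [to]

# S[?]\NP/(S[?]\NP), optionally followed by /NP
VP_TAKING = re.compile(
    r'^S(' + FEATURE + r')\\NP/\(S(' + FEATURE + r')\\NP\)(/NP)?$')

def add_control_indices(category):
    """Index a VP-taking category per the rule above, or return None."""
    m = VP_TAKING.match(category)
    if m is None:
        return None
    result_feat, comp_feat, obj = m.groups()
    if obj:
        # Object control: the NP object is the embedded subject.
        return r'S{}.1\NP.2/(S{}.1\NP.3)/NP.3'.format(result_feat, comp_feat)
    # Auxiliaries, raising, subject control: the matrix subject is shared.
    return r'S{}.1\NP.2/(S{}.1\NP.2)'.format(result_feat, comp_feat)
```

For instance, `add_control_indices(r'S[dcl]\NP/(S[b]\NP)')` yields the modal pattern, and any category that does not match either shape returns `None`.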

nschneid commented 5 years ago

S[dcl].1\NP.2/(S[to].1\NP.2)/NP : emailed VBD
S[pss].1\NP.2/(S[to].1\NP.2) : invented VBN

What are the examples for these? Are you sure the infinitivals aren't purpose adjuncts?

nschneid commented 5 years ago

(Noticeable exception: "costs" S[dcl].1\NP.2/(S[to].1\NP.3)/NP.4, as in "The pill costs 5 cents to produce.")

There's a family of verbs licensing infinitivals that involve some sort of resource requirement: in addition to "cost", there's "take", "require", etc., where the subject is what the resource pays for.

Are you sure you want to coindex the two Ses? Wouldn't that equate the costing and producing predicates?

nschneid commented 5 years ago

Maybe a better category for "costs" is S[dcl]\NP.1/(S[to]\NP.2/NP.1)/NP.3 ?

ablodge commented 5 years ago

Are you sure you want to coindex the two Ses? Wouldn't that equate the costing and producing predicates?

Okay, to simplify the indexing, I think I should just index NPs.
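Dropping the S indices from an already indexed category is a one-line substitution. A minimal sketch, assuming the `.N` suffix notation used throughout this thread; the function name is hypothetical.

```python
import re

# An S atom with an optional feature, followed by ".<index>".
S_INDEX = re.compile(r'(S(?:\[[a-z]+\])?)\.\d+')

def drop_s_indices(category):
    """Remove indices from S atoms, leaving NP indices untouched."""
    return S_INDEX.sub(r'\1', category)

print(drop_s_indices(r'S[dcl].1\NP.2/(S[to].1\NP.3)/NP.4'))
```

The NP indices are left as they are, so the coindexation that matters for control is preserved.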

ablodge commented 5 years ago

S[dcl].1\NP.2/(S[to].1\NP.2)/NP : emailed VBD
S[pss].1\NP.2/(S[to].1\NP.2) : invented VBN

What are the examples for these? Are you sure the infinitivals aren't purpose adjuncts?

You're right, they are adjuncts in the examples.

ablodge / amr-ccg-alignment

CCG categories with coindexed subparts #10