propbank / propbank-frames

Lexicon of frame files used by Propbank annotation. A searchable, readable version of the latest release is here: http://propbank.github.io/v3.4.0/frames/
Creative Commons Attribution Share Alike 4.0 International

missing light verb senses #6

Open arademaker opened 4 years ago

arademaker commented 4 years ago

In the EWT corpus, we have many light verbs:

% find . -name '*.prop' -exec awk '$6 ~ /\.LV/ {print $6}' {} \; | sort | uniq -c | sort -nr
 244 have.LV
 151 make.LV
 104 take.LV
  75 do.LV
  66 give.LV
  27 get.LV
  11 pay.LV
   2 perform.LV
   2 keep.LV
   2 hold.LV
   1 strike.LV
   1 set.LV
   1 put.LV
   1 open.LV
   1 go.LV
   1 extend.LV
   1 come.LV

But in the frame files, I found only three cases of senses with the suffix .LV defined: keep, make and take:

(("keep.01" "keep.02" "keep.03" "keep.04" "keep.06" "keep.LV" "keep_up.05"
  "keep_up.10" "keep_on.08")
 ("make.01" "make.02" "make.05" "make.06" "make.18" "make.19" "make.27"
  "make.LV" "make_up.07" "make_up.08" "make_up.09" "make_up.10" "make_up.11"
  "make_up.16" "make_out.12" "make_out.13" "make_out.15" "make_out.23"
  "make_out.26" "make_it.14" "make_off.17" "make_over.22" "make_believe.24"
  "make_do.25" "making.03")
 ("take.01" "take.02" "take.03" "take.04" "take.14" "take.15" "take.10"
  "take.16" "take.17" "take.25" "take.LV" "take.32" "take.34" "take_away.05"
  "take_in.06" "take_in.23" "take_off.07" "take_off.08" "take_off.19"
  "take_off.18" "take_off.33" "take_on.09" "take_on.21" "take_out.11"
  "take_out.26" "take_out.27" "take_out.28" "take_over.12" "take_up.13"
  "take_up.30" "take_up.31" "take_aback.20" "take_down.22" "take_hold.24"
  "taking.29"))

Also note that for the lemma "have", the light verb sense should be have.06, right?

I am assuming the frame files are outdated, am I right?
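The comparison above can be sketched in Python. This is only a rough sketch under two assumptions: that the roleset id sits in the sixth whitespace-separated field of each .prop line (as the awk command implies), and that frame files list their senses as `roleset` elements with an `id` attribute. The function names are made up for illustration and are not part of any PropBank tooling.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def attested_lv_counts(prop_lines):
    """Count roleset ids containing '.LV' in field 6 of .prop lines.

    Mirrors the awk filter `$6 ~ /\\.LV/`; the field position is an
    assumption taken from that command.
    """
    counts = Counter()
    for line in prop_lines:
        fields = line.split()
        if len(fields) >= 6 and ".LV" in fields[5]:
            counts[fields[5]] += 1
    return counts


def defined_roleset_ids(frame_xml):
    """Collect the id of every <roleset> element in one frame file's XML text."""
    root = ET.fromstring(frame_xml)
    return {roleset.get("id") for roleset in root.iter("roleset")}


def missing_lv_senses(counts, defined):
    """Return attested .LV senses that have no roleset entry, sorted by name."""
    return sorted(sense for sense in counts if sense not in defined)
```

To run it over a checkout, one would read every .prop file into `attested_lv_counts`, take the union of `defined_roleset_ids` over all frame files, and pass both to `missing_lv_senses`.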

timjogorman commented 4 years ago

Yes and no!

Light verbs are treated as an "open class", so during annotation, any given predicate can be labeled as a light verb -- annotators always have ".LV" as a sense option. This was done so that we didn't add any bias by pre-defining which predicates can have a "light verb" function. So for the sake of annotation, the "weird" thing is that we have keep.LV, make.LV and take.LV; to a certain extent those are relics of the days before such an open-class approach, and I vaguely recall they were left in there because having the explicit lexical entry means we provide examples. So from the perspective of "how are light verbs annotated", not having LV entries for all those predicates is quite correct.

All that being said, that "open class" assumption is probably quite annoying from the perspective of actually doing roleset sense disambiguation. While open class makes sense for annotation, I presume that during predicate sense disambiguation we don't want to have to consider, for every monosemous word like "titrate", whether it has a possible second sense "titrate.LV". I'll ponder whether we can do anything to make that case easier.

(@cbonial, feel free to chime in if you have thoughts on this, being the actual PropBank LV expert)
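One possible way to ease the disambiguation case, sketched in Python: offer ".LV" as a candidate sense only for lemmas where it is actually attested in the annotated data, so a monosemous verb keeps a single candidate. This is an illustration of the idea, not something the frame files or any PropBank tool currently do; `candidate_senses` and its arguments are hypothetical names.

```python
def candidate_senses(lemma, defined, attested_lv):
    """List candidate roleset ids for a lemma during sense disambiguation.

    defined:     roleset ids found in the frame files
    attested_lv: .LV senses observed in annotated data
    The lemma.LV sense is added only when it is attested.
    """
    # Match on the part before the final dot, so "keep_up.05" is not
    # treated as a sense of "keep".
    senses = sorted(s for s in defined if s.rsplit(".", 1)[0] == lemma)
    lv_sense = f"{lemma}.LV"
    if lv_sense in attested_lv and lv_sense not in senses:
        senses.append(lv_sense)
    return senses
```

With this, a lemma such as "titrate" would never get a "titrate.LV" candidate unless an annotator had actually used it, while "have" would get "have.LV" from the EWT counts.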

MarthaSPalmer commented 4 years ago

Just having Alexander's list of currently annotated .LV’s somewhere visible, along with this explanation, could be quite helpful.

Martha

cbonial commented 4 years ago

Tim, you've summarized this quite nicely. The remaining .LV senses are relics: they replaced what used to be numbered light verb senses, from the time before we had a comprehensive light verb approach that shifts the predicate-argument structure of these cases to the eventive or stative noun. However, I think it might be useful to both annotators and users to add .LV and examples to every frame file that has an attested .LV usage, although this is easy for me to say, since I am no longer there and wouldn't have to actually do it!

arademaker commented 4 years ago

Hi all, thank you for the explanations. Is there an article that describes these decisions on light verbs and the open-class approach?

In particular, what is the problem with: "This was done so that we didn't add any bias by pre-defining which predicates can have a 'light verb' function"?

cbonial commented 4 years ago

Here is an LREC paper on the PropBank LV annotation: https://www.aclweb.org/anthology/L16-1628.pdf

arademaker commented 4 years ago

Hi @MarthaSPalmer and @cbonial , I wonder if the example in

https://github.com/propbank/propbank-frames/blob/master/frames/get.xml#L40

would be a case of get.LV annotation instead of get.01 sense...