google / sling

SLING - A natural language frame semantics parser
Apache License 2.0

Can a single span invoke 2 different frames? (CASPAR) #191

Closed jackxpeng closed 6 years ago

jackxpeng commented 6 years ago

Can a single span invoke 2 different frames? I'm confused. Here "price" invokes both #11 /saft/other and #14 /pb/price-01? This doesn't look right to me.

{=#1
  :/s/document
  /s/document/text: "Tell me price of the car"
  /s/document/tokens: [{=#2
    :/s/token
    /s/token/index: 0
    /s/token/text: "Tell"
    /s/token/start: 0
    /s/token/length: 4
    /s/token/break: 0
  }, {=#3
    :/s/token
    /s/token/index: 1
    /s/token/text: "me"
    /s/token/start: 5
    /s/token/length: 2
  }, {=#4
    :/s/token
    /s/token/index: 2
    /s/token/text: "price"
    /s/token/start: 8
    /s/token/length: 5
  }, {=#5
    :/s/token
    /s/token/index: 3
    /s/token/text: "of"
    /s/token/start: 14
    /s/token/length: 2
  }, {=#6
    :/s/token
    /s/token/index: 4
    /s/token/text: "the"
    /s/token/start: 17
    /s/token/length: 3
  }, {=#7
    :/s/token
    /s/token/index: 5
    /s/token/text: "car"
    /s/token/start: 21
    /s/token/length: 3
  }]
  /s/document/mention: {=#8
    :/s/phrase
    /s/phrase/begin: 0
    /s/phrase/evokes: {=#9
      :/pb/tell-01
      /pb/arg2: {=#10
        :/saft/other
      }
      /pb/arg1: {=#11
        :/saft/other
      }
    }
  }
  /s/document/mention: {=#12
    :/s/phrase
    /s/phrase/begin: 1
    /s/phrase/evokes: #10
  }
  /s/document/mention: {=#13
    :/s/phrase
    /s/phrase/begin: 2
    /s/phrase/evokes: #11
    /s/phrase/evokes: {=#14
      :/pb/price-01
      /pb/arg1: {=#15
        :/saft/consumer_good
      }
    }
  }
  /s/document/mention: {=#16
    :/s/phrase
    /s/phrase/begin: 5
    /s/phrase/evokes: #15
  }
}
ringgaard commented 6 years ago

A mention, i.e. a span, can evoke multiple frames, but in this case it seems to be a parse error.
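
To make the point concrete, here is a minimal sketch that models the mentions from the parse above with plain Python dataclasses and lists the mentions that evoke more than one frame. The `Frame` and `Mention` classes and the `multi_frame_mentions` helper are hypothetical names made up for illustration; this is not the SLING Python API.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    frame_id: int      # the "#N" index from the frame dump
    frame_type: str    # e.g. "/pb/price-01"


@dataclass
class Mention:
    begin: int                                   # index of the first token
    length: int = 1                              # number of tokens in the span
    evokes: list = field(default_factory=list)   # frames evoked by this span


def multi_frame_mentions(mentions):
    """Return the mentions that evoke more than one frame."""
    return [m for m in mentions if len(m.evokes) > 1]


# Frames and mentions transcribed by hand from the parse of
# "Tell me price of the car" shown above (illustrative model only).
tell = Frame(9, "/pb/tell-01")
me = Frame(10, "/saft/other")
other = Frame(11, "/saft/other")
price = Frame(14, "/pb/price-01")
car = Frame(15, "/saft/consumer_good")

mentions = [
    Mention(begin=0, evokes=[tell]),
    Mention(begin=1, evokes=[me]),
    Mention(begin=2, evokes=[other, price]),   # "price" evokes #11 and #14
    Mention(begin=5, evokes=[car]),
]

for m in multi_frame_mentions(mentions):
    print(m.begin, [f.frame_type for f in m.evokes])
# prints: 2 ['/saft/other', '/pb/price-01']
```

The check only shows that a span carries several frames; whether that is a valid analysis or a parse error, as in this case, still has to be judged by hand.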

jackxpeng commented 6 years ago

Got it. Do you have an example sentence that has a mention that evokes multiple frames in a correct parse? I'm very curious.

Another surprise to me is that price-01 can have multiple meanings. http://verbs.colorado.edu/propbank/framesets-english-aliases/price.html says it can have 3 different meanings, all under price-01. I had thought PropBank was very precise.

Example: Arg 1 only
        The price of gasoline has soared in recent months.

        Rel: price
        Arg1: of gasoline
ringgaard commented 6 years ago

A sentence like "The bombing of the city continued." will evoke both a /pb/bomb-01 frame and a /saft/event frame for "bombing", which is correct according to the training data. However, this is really just an artifact of the way the training data has been generated from OntoNotes, where the SRL annotations produce the /pb/bomb-01 frame and the entity annotations produce the /saft/event frame. The two types of annotations overlap because "bombing" is a nominalization of the verb "bomb", so this example is somewhat artificial.
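
As a rough sketch of how such an overlap can arise, the snippet below merges two independent annotation layers by token span, so a span annotated in both layers ends up with two frame types. The layer format, the token indices, and the `merge_layers` helper are assumptions made for illustration; the actual OntoNotes conversion used for SLING is not shown in this thread and may work differently.

```python
from collections import defaultdict

tokens = ["The", "bombing", "of", "the", "city", "continued", "."]

# Each layer maps a (begin, end) token span to a frame type.
# Only the two annotations named in the comment above are included.
srl_layer = [((1, 2), "/pb/bomb-01")]      # from the SRL annotations
entity_layer = [((1, 2), "/saft/event")]   # from the entity annotations


def merge_layers(*layers):
    """Group frame types from all layers by the span they annotate."""
    merged = defaultdict(list)
    for layer in layers:
        for span, frame_type in layer:
            merged[span].append(frame_type)
    return dict(merged)


merged = merge_layers(srl_layer, entity_layer)
for (begin, end), frames in sorted(merged.items()):
    print(" ".join(tokens[begin:end]), frames)
# prints: bombing ['/pb/bomb-01', '/saft/event']
```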

In general, it is useful to have a formalism where a word or phrase can evoke multiple frames, e.g. for metonymy, where the same word can evoke multiple meanings. In the sentence "The White House will be announcing the decision around noon today.", [The White House] can refer to both the organization (US executive branch) and the location (1600 Penn Ave). The SLING parser supports making these distinctions, but the conversion of the OntoNotes corpus we currently use for training does not annotate them systematically, so the SEMPAR parser will not get these correct. We hope to have better training data in the future so we can produce more semantically concise annotations.

Many "old-school" NLP tasks have been defined as span labeling tasks, but this breaks down when you want to make these finer distinctions. This is the reason that we use the more general semantic frame formalism in SLING. Wanting to make these kinds of distinctions might seem like unneeded nitpicking, but it becomes important if you want to produce proper annotations that are useful for semantic analysis. However, we are currently "slaves" of the annotated corpora we use for training, so while the basic SLING parser supports learning this, the current SEMPAR parser is not able to do this properly.

With respect to PropBank predicates like price-01, role sets often lump together different senses. It is hard to agree on the exact set of senses for a word, so PropBank usually puts together senses that have the same predicate/argument structure. This makes it easy to annotate with PropBank frames (compared to other frame inventories like FrameNet), but it unfortunately also means that even if we get the "correct" PropBank-based analysis of a sentence, it is not always super useful for semantic analysis.

jackxpeng commented 6 years ago

Thank you for such a detailed response, Michael.