simongray / StatementAnnotator

Custom annotator for Stanford CoreNLP that annotates sentences with the underlying statements contained within them.

No conjunctions #44

Closed simongray closed 8 years ago

simongray commented 8 years ago

(subsumes issue #43)

Diagnosing the problem

At this point, the most accurate statements seem to be those without conjunctions in their components. For example, take the sentence:

The app allows you to check the latest PM2.5 index inside your flat and automate the purifiers.

In this case, the nested statement ("to check the latest PM2.5 index inside your flat and automate the purifiers") would comprise:

The problem with this setup is that it is unclear which part of the DirectObject "check" and "automate" each refer to, and also unclear which of the verbs (or both) is related to the IndirectObject. It also produces odd results, such as the missing "and" in the DirectObject text.

The improvement

When components do not contain conjunctions, the following nested statements are found instead:

  1. [{Verb: "check"}, {DirectObject: "the latest PM2.5 index"}, {IndirectObject: "inside your flat"}]
  2. [{Verb: "automate"}, {DirectObject: "the purifiers"}]

Which results in the following full statements:

  1. [{Subject: "the app"}, {Verb: "allows"}, {DirectObject: "you"}, [{Verb: "check"}, {DirectObject: "the latest PM2.5 index"}, {IndirectObject: "inside your flat"}]]
  2. [{Subject: "the app"}, {Verb: "allows"}, {DirectObject: "you"}, [{Verb: "automate"}, {DirectObject: "the purifiers"}]]
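
To illustrate how the two full statements above relate to the nested ones, here is a minimal Java sketch that pairs the shared outer components with each nested statement in turn. The names `FullStatements`, `FullStatement` and `expand` are hypothetical and not part of the annotator; components are plain strings here purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class FullStatements {

    /** Hypothetical container: the shared outer components plus exactly one nested statement. */
    static final class FullStatement {
        final List<String> outer;
        final List<String> nested;

        FullStatement(List<String> outer, List<String> nested) {
            this.outer = outer;
            this.nested = nested;
        }

        @Override
        public String toString() {
            return outer + " + " + nested;
        }
    }

    /** Produces one full statement per nested statement, each reusing the same outer components. */
    static List<FullStatement> expand(List<String> outer, List<List<String>> nestedStatements) {
        List<FullStatement> result = new ArrayList<>();
        for (List<String> nested : nestedStatements) {
            result.add(new FullStatement(outer, nested));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> outer = List.of("Subject: the app", "Verb: allows", "DirectObject: you");
        List<List<String>> nested = List.of(
                List.of("Verb: check", "DirectObject: the latest PM2.5 index", "IndirectObject: inside your flat"),
                List.of("Verb: automate", "DirectObject: the purifiers"));
        for (FullStatement statement : expand(outer, nested)) {
            System.out.println(statement);
        }
    }
}
```

The point of the sketch is only that a conjunction yields several full statements that share one outer part, rather than one statement with merged components.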

How to do it

When creating statements, check each component once for connections to every other component -- component.connectedTo(othercomponent) -- and use this set of connected components as a statement.
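
One possible reading of this approach, sketched in Java: treat `connectedTo` as an edge test and collect every set of components that are reachable from one another. This is an assumption about the intent, not the annotator's actual code. The `Component` class below is a hypothetical stand-in whose `connectedTo` just consults explicit links added through a made-up `link` helper; in the real annotator the check would presumably rely on the dependency parse.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class StatementGrouping {

    /** Hypothetical stand-in for the annotator's component type. */
    static final class Component {
        final String label;
        final String text;
        // Assumed representation: explicit undirected links between components.
        private final Set<Component> links = new HashSet<>();

        Component(String label, String text) {
            this.label = label;
            this.text = text;
        }

        /** Records an undirected connection (hypothetical helper for this sketch). */
        void link(Component other) {
            links.add(other);
            other.links.add(this);
        }

        boolean connectedTo(Component other) {
            return links.contains(other);
        }

        @Override
        public String toString() {
            return "{" + label + ": \"" + text + "\"}";
        }
    }

    /** Groups components into statements; each component is expanded exactly once. */
    static List<Set<Component>> group(List<Component> components) {
        List<Set<Component>> statements = new ArrayList<>();
        Set<Component> visited = new HashSet<>();
        for (Component start : components) {
            if (!visited.add(start)) {
                continue; // already part of an earlier statement
            }
            Set<Component> statement = new LinkedHashSet<>();
            Deque<Component> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                Component current = queue.remove();
                statement.add(current);
                // Check the current component against every other component.
                for (Component other : components) {
                    if (current.connectedTo(other) && visited.add(other)) {
                        queue.add(other);
                    }
                }
            }
            statements.add(statement);
        }
        return statements;
    }

    public static void main(String[] args) {
        Component check = new Component("Verb", "check");
        Component index = new Component("DirectObject", "the latest PM2.5 index");
        Component flat = new Component("IndirectObject", "inside your flat");
        Component automate = new Component("Verb", "automate");
        Component purifiers = new Component("DirectObject", "the purifiers");
        check.link(index);
        check.link(flat);
        automate.link(purifiers);
        for (Set<Component> statement : group(List.of(check, index, flat, automate, purifiers))) {
            System.out.println(statement);
        }
    }
}
```

With the links chosen in `main`, "check" and "automate" end up in separate groups because nothing connects them, which mirrors the two nested statements listed earlier.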