simongray / StatementAnnotator

Custom annotator for Stanford CoreNLP that annotates sentences with the underlying statements contained within them.

How to deal with parentheses? #38

Open simongray opened 8 years ago

simongray commented 8 years ago

I honestly have no idea, will need to think about it.

simongray commented 8 years ago

Here's an example:

// TODO: double subjects (confusion caused by parentheses)
String example = "I sure was (I come from Copenhagen, Denmark).";

creating this output:

I sure was (I come from Copenhagen, Denmark).
    |_ statement: {Statement: "I sure was -LRB- I come from Copenhagen, Denmark -RRB-", components: 4}
        |_ component: {Subject: "I"}
        |_ component: {IndirectObject: "from Copenhagen, Denmark"}
        |_ component: {Verb: "sure"}
        |_ component: {Subject: "I"}

The dependency graph is not very helpful in this case, unfortunately:

[sure/RB
  nsubj>I/PRP
  dep>[was/VBD
       dep>[come/VBP
            punct>-LRB-/-LRB-
            nsubj>I/PRP
            nmod:from>[Copenhagen/NNP case>from/IN punct>,/, appos>Denmark/NNP]
            punct>-RRB-/-RRB-]]
  punct>./.]

simongray commented 8 years ago

It seems like I could create a custom annotator after the tokenisation stage which

simongray commented 8 years ago

Perhaps the above is too complex for the time I have left, and simply preprocessing the text to remove parentheses and emoticons is the better solution.
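
A minimal sketch of what such a preprocessing step could look like, assuming it runs on the raw text before it is handed to the CoreNLP pipeline. The class name `ParenthesesPreprocessor`, the method `strip`, and the emoticon pattern are hypothetical and not part of StatementAnnotator. The pattern only covers a handful of common Western-style emoticons. Emoticons are removed first so that a stray `:)` is not mistaken for a closing parenthesis.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of StatementAnnotator: strips emoticons and
// parenthesised asides from raw text before it reaches the parser.
public class ParenthesesPreprocessor {
    // Assumed, deliberately small emoticon pattern: eyes, optional nose, mouth.
    // It must be preceded by whitespace (or the start of the text) and must not
    // be followed by a word character, to limit false positives.
    private static final Pattern EMOTICON =
            Pattern.compile("(?<!\\S)[:;=][-o^']?[)(\\]\\[DPp/\\\\|](?!\\w)");

    // An innermost parenthesised span, plus any whitespace directly before it.
    private static final Pattern PARENTHETICAL = Pattern.compile("\\s*\\([^()]*\\)");

    private static final Pattern EXTRA_SPACE = Pattern.compile("\\s{2,}");

    public static String strip(String text) {
        // 1. Remove emoticons first, since many of them contain parentheses.
        String result = EMOTICON.matcher(text).replaceAll("");

        // 2. Remove parenthesised spans, innermost first, until none are left.
        String previous;
        do {
            previous = result;
            result = PARENTHETICAL.matcher(result).replaceAll("");
        } while (!result.equals(previous));

        // 3. Collapse whitespace left behind by the removals.
        return EXTRA_SPACE.matcher(result).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // prints "I sure was."
        System.out.println(strip("I sure was (I come from Copenhagen, Denmark)."));
    }
}
```

The trade-off is that any statement inside the parentheses ("I come from Copenhagen, Denmark") is discarded rather than annotated, and unbalanced parentheses are left untouched.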