Open GoogleCodeExporter opened 9 years ago
Hi
JATE uses OpenNLP 1.51, and verb phrases are a little tricky to handle. You are
right to look at "B-NP" and "I-NP" in the "chunkNP" method of the
"NounPhraseExtractorOpenNLP" class, but I think you need to write a separate
method that implements a slightly different process.
Example:
Tokens = They have replaceable teeth .
Chunker output = B-NP,B-VP,B-NP,I-NP,O
Tokens = Humans kill around 26 to 73 million sharks every year ...
Chunker output = B-NP,B-VP,B-ADVP,B-NP,B-PP,B-NP,I-NP,I-NP,B-NP
As you can see, B-VP marks the beginning of a VP, but no "I-VP" tags follow it
to mark the "inside" of the VP. Instead, the following tokens are tagged as noun
phrases or adverbial/prepositional phrases. So your code needs to handle these cases.
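A minimal sketch of what such a separate method could look like, working directly on the token and tag arrays shown above. The class name "VerbPhraseSketch" and the method "extractVerbPhrases" are hypothetical and not part of JATE or OpenNLP. The rule that a verb phrase runs from a B-VP through any following I-VP, NP, ADVP and PP tokens until an "O" tag or the next B-VP is only an assumption for illustration; you may want a stricter rule.

```java
import java.util.ArrayList;
import java.util.List;

public class VerbPhraseSketch {

    /**
     * Hypothetical helper (not part of JATE or OpenNLP): collects one
     * candidate verb phrase per "B-VP" tag in the chunker output.
     */
    public static List<String> extractVerbPhrases(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> phrases = new ArrayList<>();
        int i = 0;
        while (i < tags.length) {
            if (!tags[i].equals("B-VP")) {
                i++;
                continue;
            }
            // Start a new phrase at the verb itself.
            StringBuilder phrase = new StringBuilder(tokens[i]);
            int j = i + 1;
            // Assumption: keep absorbing tokens while they continue the VP
            // or belong to an NP, ADVP or PP chunk that follows it.
            while (j < tags.length && isContinuation(tags[j])) {
                phrase.append(' ').append(tokens[j]);
                j++;
            }
            phrases.add(phrase.toString());
            i = j; // resume at the first token that was not absorbed
        }
        return phrases;
    }

    // Assumed continuation rule: I-VP, or any B-/I- tag of an NP, ADVP or PP chunk.
    private static boolean isContinuation(String tag) {
        return tag.equals("I-VP")
                || tag.endsWith("-NP")
                || tag.endsWith("-ADVP")
                || tag.endsWith("-PP");
    }
}
```

With the first example above (tokens "They have replaceable teeth ." and tags B-NP,B-VP,B-NP,I-NP,O), this returns the single phrase "have replaceable teeth". The tags would come from the same chunker call that "chunkNP" already makes.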
This will be added in the next version of this tool.
Original comment by ziqizhan...@googlemail.com
on 25 Sep 2013 at 1:39
Original issue reported on code.google.com by
mihail.m...@gmail.com
on 19 Sep 2013 at 4:55