Closed: mlalma closed this issue 10 years ago
We investigated and found that the CoreNLP version was handling unknown words differently from the version on the web. We made a new model, but because of internal code changes, it is not compatible with the currently released code. If you download our github code from here:
https://github.com/stanfordnlp/CoreNLP
you can get the fixed model here:
http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar
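For reference, the build-and-run steps might look roughly like this. This is only a sketch: the ant target, output directory, and jar names are assumptions and may differ depending on the state of the repository at the time.

```shell
# Sketch: build CoreNLP from source and pair it with the updated models jar.
# Assumes git, ant, wget, and a JDK are installed; directory/jar names are illustrative.
git clone https://github.com/stanfordnlp/CoreNLP.git
cd CoreNLP
ant                                  # compile the classes (see build.xml for available targets)

# Fetch the updated sentiment models jar:
wget http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar

# Compile and run the test program with the freshly built classes
# ahead of the models jar on the classpath:
javac -cp "classes:stanford-corenlp-models-current.jar" SentimentTestAppStanfordNLP.java
java  -cp ".:classes:stanford-corenlp-models-current.jar" SentimentTestAppStanfordNLP
```

The important point is that the released 3.3.0 jar cannot be mixed with the new model; both the code and the models jar must come from the sources above.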
On Tue, Nov 26, 2013 at 2:29 AM, mlalma notifications@github.com wrote:
I wrote some quick test code to try out the new sentiment model and noticed that there is something weird going on when using RNNCoreAnnotations.getPredictedClass().
I don't know if the sentiment analysis model included in 3.3.0 is different from the one on the live demo site (http://nlp.stanford.edu:8080/sentiment/rntnDemo.html), but in any case the short test code is:
```java
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.CoreMap;

import java.util.Properties;

public class SentimentTestAppStanfordNLP {

    private StanfordCoreNLP pipeline;

    public SentimentTestAppStanfordNLP() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        pipeline = new StanfordCoreNLP(props);
    }

    private void checkSentiment(String text) {
        Annotation annotation = pipeline.process(text);
        for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
            Tree tree = sentence.get(SentimentCoreAnnotations.AnnotatedTree.class);
            int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
            System.out.println("Sentiment: " + sentiment + " String: " + sentence.toString());
        }
    }

    private void doMain() throws Exception {
        checkSentiment("Radek is a really good football player");
        checkSentiment("Radek is a good football player");
        checkSentiment("Radek is an OK football player");
        checkSentiment("Radek is a bad football player");
        checkSentiment("Radek is a really bad football player");
        System.out.println("-----------------------------");
        checkSentiment("Mark is a really good football player");
        checkSentiment("Mark is a good football player");
        checkSentiment("Mark is an OK football player");
        checkSentiment("Mark is a bad football player");
        checkSentiment("Mark is a really bad football player");
    }

    public static void main(String[] args) {
        try {
            SentimentTestAppStanfordNLP main = new SentimentTestAppStanfordNLP();
            main.doMain();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
```
The output baffled me: for the "Radek" sentences, RNNCoreAnnotations seemed to give almost random output, whereas the "Mark" sentences came out pretty much as expected (see below). When I test the same sentences on the live demo site, the "Radek" cases are correct, unlike what CoreNLP outputs here.
Sentiment: 0 String: Radek is a really good football player
Sentiment: 1 String: Radek is a good football player
Sentiment: 2 String: Radek is an OK football player
Sentiment: 2 String: Radek is a bad football player
Sentiment: 2 String: Radek is a really bad football player

Sentiment: 3 String: Mark is a really good football player
Sentiment: 3 String: Mark is a good football player
Sentiment: 2 String: Mark is an OK football player
Sentiment: 1 String: Mark is a bad football player
Sentiment: 1 String: Mark is a really bad football player
— Reply to this email directly or view it on GitHub: https://github.com/stanfordnlp/CoreNLP/issues/7
Indeed, compiling CoreNLP from the source repository and using the new (current) model fixes the issue. Thanks a lot!