dasmith / stanford-corenlp-python

Python wrapper for Stanford CoreNLP tools v3.4.1
GNU General Public License v2.0

Sentiment Analysis Confidence Scores #52

Open shawnbeaulieu opened 7 years ago

shawnbeaulieu commented 7 years ago


For sentiment analysis I'm able to obtain the score that corresponds to the class with the highest estimated probability, but I'm unable to produce the estimations themselves (e.g. [very_negative = 0.60, negative = 0.25, neutral = 0.10, positive = 0.025, very_positive = 0.025]). I'd like to filter probabilities below a certain confidence threshold.

Thank you.

vacous commented 6 years ago

Hi, I had the same question, but I think I figured it out with the following setup:

    nlp = StanfordCoreNLP('http://localhost:9000')
    res = nlp.annotate(some_sentence, properties={
        'annotators': 'sentiment',
        'outputFormat': 'json',
        'timeout': 1000,
    })

The result, res, is structured like this:

    res['sentences'] = [result_first_sentence, result_second_sentence, ..., result_last_sentence]

You can look into the result for each sentence, which is a dict. Its 'sentimentDistribution' key should be the one you are looking for. So if you are interested in the sentiment distribution of the first sentence:

    res['sentences'][0]['sentimentDistribution']
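To tie this back to the original question about filtering by a confidence threshold, here is a minimal sketch. It assumes the response has the shape described above (a 'sentences' list whose items carry a 'sentimentDistribution' list of probabilities). The function name confident_sentences and the hand-written example_res dict are hypothetical and only stand in for a real server response.

```python
def confident_sentences(res, threshold=0.5):
    """Return (sentence_index, class_index, probability) for every sentence
    whose most probable sentiment class reaches `threshold`.

    Assumes `res` looks like the JSON response described above.
    """
    kept = []
    for i, sentence in enumerate(res['sentences']):
        dist = sentence['sentimentDistribution']
        # Index of the class with the highest estimated probability.
        top = max(range(len(dist)), key=dist.__getitem__)
        if dist[top] >= threshold:
            kept.append((i, top, dist[top]))
    return kept


# Hand-written stand-in for a server response (not real CoreNLP output).
example_res = {
    'sentences': [
        {'sentimentDistribution': [0.60, 0.25, 0.10, 0.025, 0.025]},
        {'sentimentDistribution': [0.10, 0.30, 0.35, 0.20, 0.05]},
    ]
}

print(confident_sentences(example_res, threshold=0.5))
```

With a threshold of 0.5, only the first example sentence is kept, because the second one's best class reaches only 0.35.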

AlexFine commented 6 years ago

@vacous What does the sentimentDistribution represent? I receive an array of five numbers. Do you know what those five numbers mean?

vacous commented 6 years ago

@AlexFine It represents the probabilities of the five sentiment classes for the sentence, in this order: "--" (very negative), "-" (negative), "0" (neutral), "+" (positive), "++" (very positive). Please see https://nlp.stanford.edu/sentiment/ for more details.
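As a small illustration of that ordering, the five numbers can be paired with readable names. This is only a sketch: it assumes index 0 is "very negative" and index 4 is "very positive", as described above, and the names SENTIMENT_LABELS and label_distribution are hypothetical helpers, not part of any library.

```python
# Assumed order: index 0 = very negative ... index 4 = very positive.
SENTIMENT_LABELS = [
    'very_negative', 'negative', 'neutral', 'positive', 'very_positive',
]


def label_distribution(dist):
    """Pair each probability with its (assumed) sentiment label."""
    if len(dist) != len(SENTIMENT_LABELS):
        raise ValueError('expected %d probabilities, got %d'
                         % (len(SENTIMENT_LABELS), len(dist)))
    return dict(zip(SENTIMENT_LABELS, dist))


labeled = label_distribution([0.60, 0.25, 0.10, 0.025, 0.025])
print(labeled['very_negative'])
```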

AlexFine commented 6 years ago

@vacous thanks