dasmith / stanford-corenlp-python

Python wrapper for Stanford CoreNLP tools v3.4.1
GNU General Public License v2.0
610 stars 229 forks

Sentiment Analysis Confidence Scores #52

Open shawnbeaulieu opened 7 years ago

shawnbeaulieu commented 7 years ago

Hello,

For sentiment analysis I'm able to obtain the score that corresponds to the class with the highest estimated probability, but I'm unable to produce the estimations themselves (e.g. [very_negative = 0.60, negative = 0.25, neutral = 0.10, positive = 0.025, very_positive = 0.025]). I'd like to filter probabilities below a certain confidence threshold.

Thank you.

vacous commented 6 years ago

Hi, I had the same question, but I think I figured it out with these settings:

nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate(some_sentence, properties={
    'annotators': 'sentiment',
    'outputFormat': 'json',
    'timeout': 1000,
})

The result, res, is structured like this:

res['sentences'] = [result_first_sentence, result_second_sentence, ..., result_last_sentence]

The result for each sentence is a dict with the keys:

['index',
 'parse',
 'basicDependencies',
 'enhancedDependencies',
 'enhancedPlusPlusDependencies',
 'sentimentValue',
 'sentiment',
 'sentimentDistribution',
 'sentimentTree',
 'tokens']

'sentimentDistribution' should be the one you are looking for. For example, to get the sentiment distribution of the first sentence:

res['sentences'][0]['sentimentDistribution']
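To connect this back to the original question about filtering by confidence, here is a minimal sketch. It assumes res has the shape described above (a 'sentences' list whose entries carry a 'sentimentDistribution' list of five probabilities). The helper name confident_sentences and the sample_res data are hypothetical, made up for illustration so the snippet runs without a CoreNLP server.

```python
def confident_sentences(res, threshold):
    """Return (sentence_index, distribution) pairs whose highest
    class probability is at least `threshold`."""
    kept = []
    for i, sentence in enumerate(res['sentences']):
        dist = sentence['sentimentDistribution']
        if max(dist) >= threshold:
            kept.append((i, dist))
    return kept


# Hypothetical stand-in for the server response; the numbers are
# illustrative only, not real CoreNLP output.
sample_res = {
    'sentences': [
        {'sentimentDistribution': [0.60, 0.25, 0.10, 0.025, 0.025]},
        {'sentimentDistribution': [0.15, 0.25, 0.30, 0.20, 0.10]},
    ]
}

# Keep only sentences where the model's top class reaches 0.5.
print(confident_sentences(sample_res, 0.5))
```

With a real response, you would pass the res returned by nlp.annotate instead of sample_res.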

AlexFine commented 6 years ago

@vacous What does the sentimentDistribution represent? I receive an array of five numbers. Do you know what those five numbers mean?

vacous commented 6 years ago

@AlexFine It represents the probabilities of the five sentiment classes for the sentence, in this order: "--" (very negative), "-" (negative), "0" (neutral), "+" (positive), "++" (very positive). Please see https://nlp.stanford.edu/sentiment/ for more details.
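A small sketch of how the five numbers could be paired with class names, assuming the ordering given above (very negative first, very positive last). The label strings follow the naming used in the original question. LABELS, label_distribution and top_class are hypothetical helpers, not part of any CoreNLP wrapper.

```python
# Assumed class order: very negative -> very positive.
LABELS = ['very_negative', 'negative', 'neutral', 'positive', 'very_positive']


def label_distribution(dist):
    """Map a five-element sentimentDistribution list to {label: probability}."""
    if len(dist) != len(LABELS):
        raise ValueError('expected %d probabilities, got %d' % (len(LABELS), len(dist)))
    return dict(zip(LABELS, dist))


def top_class(dist):
    """Return (label, probability) for the most probable class."""
    labeled = label_distribution(dist)
    label = max(labeled, key=labeled.get)
    return label, labeled[label]


# Example values taken from the original question.
example = [0.60, 0.25, 0.10, 0.025, 0.025]
print(label_distribution(example))
print(top_class(example))
```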

AlexFine commented 6 years ago

@vacous thanks