cjhutto / vaderSentiment

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
MIT License
4.43k stars 1k forks

Very different compound scores for similar emojis #94

Closed nathan-smit closed 4 years ago

nathan-smit commented 4 years ago

Hey there,

I've just started using this sentiment analysis tool and it's great! I came across a weird case, though: one of the lowest compound scores in my data belonged to the tweet below:

"Thank you I’m a happy client💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗💓💗"

Upon investigating further, it seems that this heart 💗 has a positive compound score whereas this one 💓 has a very negative score, leading to a low overall compound score. Is there a reason for the inconsistent scoring between these fairly similar images? In my application I'm now converting the one heart to the other, which leads to this being the most positive tweet in my dataset.

cjhutto commented 4 years ago

Oh, wow -- great find!
So, VADER scores the sentiment of each emoji by converting the emoji to its official textual description, and then just processing that text as normal... this allows me to easily keep the emoji list up to date with the most modern set by simply scraping the official data source (here) whenever we need to update.

A quick informal inspection shows me that the first heart is a "growing heart" (strong positive sentiment) and the second one is a "beating heart"... and the context-free interpretation of "beating" is not at all positive (e.g., as in "this particular context-free interpretation is taking a beating in terms of sentiment accuracy").
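To make the mechanism concrete, here is a minimal, self-contained sketch of the description-based approach explained above. It is not the code in vaderSentiment.py: the emoji table, the toy lexicon, and every valence number are invented for illustration, and the function names are hypothetical. The normalization follows the usual score / sqrt(score^2 + alpha) shape with alpha = 15.

```python
import math

# Hypothetical emoji-to-description table (the real tool ships its own mapping).
EMOJI_DESCRIPTIONS = {
    "\U0001F497": "growing heart",
    "\U0001F493": "beating heart",
}

# Toy valence lexicon; these numbers are made up for illustration only.
TOY_LEXICON = {"happy": 2.7, "growing": 0.7, "beating": -2.0}


def describe_emojis(text):
    """Replace each known emoji with its space-padded textual description."""
    pieces = []
    for ch in text:
        if ch in EMOJI_DESCRIPTIONS:
            pieces.append(" " + EMOJI_DESCRIPTIONS[ch] + " ")
        else:
            pieces.append(ch)
    return "".join(pieces)


def raw_valence(text):
    """Sum the context-free valence of every word after emoji expansion."""
    words = describe_emojis(text).lower().split()
    return sum(TOY_LEXICON.get(word, 0.0) for word in words)


def compound(score, alpha=15.0):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return score / math.sqrt(score * score + alpha)


tweet = "a happy client" + "\U0001F497\U0001F493" * 3
print(describe_emojis(tweet))
print(compound(raw_valence(tweet)))
```

With these toy numbers, each "beating" outweighs each "growing", so a long run of alternating hearts drags the total below zero even though the sentence itself is positive.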

The quick fix is to add "beating heart" as a special case so that this emoji is correctly interpreted to be positive... see my update on line 79 of the vaderSentiment.py script.
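As a rough illustration of what a special case buys you, the sketch below scores known multi-word phrases first and only then falls back to word-by-word lookup. This is an assumption about the general technique, not a copy of the update in vaderSentiment.py; the table name, the function name, and the valence values are all hypothetical.

```python
# Toy single-word lexicon; values are invented for illustration.
TOY_LEXICON = {"beating": -2.0, "growing": 0.7}

# Hypothetical phrase-level overrides; the valence here is a made-up number.
SPECIAL_CASE_PHRASES = {"beating heart": 3.0}


def score_description(description):
    """Score phrase overrides first, then the remaining words individually."""
    text = description.lower()
    total = 0.0
    for phrase, valence in SPECIAL_CASE_PHRASES.items():
        total += text.count(phrase) * valence
        # Remove the matched phrase so its words are not scored a second time.
        text = text.replace(phrase, " ")
    total += sum(TOY_LEXICON.get(word, 0.0) for word in text.split())
    return total


for sample in ("beating heart", "growing heart", "taking a beating"):
    print(sample, score_description(sample))
```

The point of the design is that "beating" on its own keeps its negative reading, while the exact phrase "beating heart" is treated as a single positive unit.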

udaykumar1506 commented 4 years ago

Hey,

Is this issue fixed? I am still getting a negative score for "Beating Heart" and a neutral score for "Revolving Hearts".

Please help me here, thank you.

cjhutto commented 4 years ago

I've just pushed the updated version to PyPI. You should now be able to run pip install --upgrade vaderSentiment or python -m pip install vaderSentiment --no-cache-dir