cjhutto / vaderSentiment

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
MIT License

Same token in vader_lexicon.txt #95

victorchennn closed 4 years ago

victorchennn commented 4 years ago

Hi,

I found a duplicated token in vader_lexicon.txt: there are two entries for 'ok', one with (ok 1.2 0.4 [1, 2, 1, 1, 1, 1, 2, 1, 1, 1]) and the other with (ok 1.6 1.42829 [0, 0, 1, 1, 1, 4, 3, 4, 1, 1]). What's the difference, and which one should I use?

Thanks!

cjhutto commented 4 years ago

The first ok (line 350) is in the context of emoticon use. The second ok (line 4893) is the typical shortened form of the word "okay". It's up to you, really, but if you aren't doing any contextual disambiguation in your script/analysis, then I would use the second one, as it likely represents the more "general" usage. Also, emoticons are a bit antiquated (they are being replaced by more modern emojis in today's social/digital communications).
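For anyone who wants to check their own copy of the lexicon, here is a minimal sketch of how the duplicates could be listed. It assumes each line is tab-separated as token, mean valence, standard deviation, raw ratings. The helper names `find_duplicates` and `load_last_wins` are hypothetical and not part of vaderSentiment, and the sample text holds only the two entries quoted above. The second helper shows what happens if the file is read into a plain dict keyed by token: the later line overwrites the earlier one.

```python
from collections import defaultdict

# The two 'ok' entries quoted in this thread, in file order
# (tab-separated columns are an assumption about the file layout).
SAMPLE = (
    "ok\t1.2\t0.4\t[1, 2, 1, 1, 1, 1, 2, 1, 1, 1]\n"
    "ok\t1.6\t1.42829\t[0, 0, 1, 1, 1, 4, 3, 4, 1, 1]\n"
)


def find_duplicates(lexicon_text):
    """Return {token: [(line_no, mean, std), ...]} for tokens listed more than once."""
    seen = defaultdict(list)
    for line_no, line in enumerate(lexicon_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        token, mean, std = line.split("\t")[:3]
        seen[token].append((line_no, float(mean), float(std)))
    return {tok: rows for tok, rows in seen.items() if len(rows) > 1}


def load_last_wins(lexicon_text):
    """Load token -> mean valence into a dict; a repeated token keeps its last value."""
    lexicon = {}
    for line in lexicon_text.splitlines():
        if not line.strip():
            continue
        token, mean = line.split("\t")[:2]
        lexicon[token] = float(mean)
    return lexicon


if __name__ == "__main__":
    for token, rows in find_duplicates(SAMPLE).items():
        print(token, rows)
    print(load_last_wins(SAMPLE)["ok"])
```

To run it against the real file, read `vader_lexicon.txt` with `open(path, encoding="utf-8").read()` and pass the text to either helper. If you prefer the emoticon-context score instead, keep the first occurrence by skipping tokens that are already in the dict.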