cjhutto / vaderSentiment

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
MIT License

Duplicates in dictionary: double entries with different sentiment values #122

Open chris31415926535 opened 3 years ago

chris31415926535 commented 3 years ago

Thanks for making VADER. I'm working on another port and am having a blast.

In the most recent version of vader_lexicon.txt, several words/emojis have two entries with different sentiment values. This is a potential source of bugs and of inconsistencies between ports. I've listed them below with their line numbers in vader_lexicon.txt, the word, and the sentiment values.

It looks like the Python version of VADER takes the last value it finds. For example, "lol" has two sentiment values: +2.9 at line 305 and +1.8 at line 4406. To reproduce the output for test sentence 13 from the main Readme (copied below), I need to assign "lol" a sentiment of 1.8.

Today only kinda sux! But I'll get by, lol----------------------- {'pos': 0.317, 'compound': 0.5249, 'neu': 0.556, 'neg': 0.127}
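
For context, here is a minimal sketch of loading a tab-separated lexicon like vader_lexicon.txt into a Python dict (a simplification, not necessarily the library's exact code): because assigning to the same dict key twice keeps the last assignment, the last value in the file silently wins, which matches the behaviour above.

```python
# Minimal sketch (not the library's actual code) of loading a tab-separated
# lexicon into a dict. Assigning to the same key twice keeps the last value,
# which is why "lol" ends up at 1.8 rather than 2.9.
def make_lex_dict(path="vader_lexicon.txt"):
    lex_dict = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            word, measure = line.strip().split("\t")[0:2]
            lex_dict[word] = float(measure)
    return lex_dict
```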

I see three main options:

  1. Leave it as-is. This seems least desirable, since it leads to unpredictable and potentially inconsistent behaviour across instantiations.
  2. Update the dictionary to match the current behaviour by removing the second instance of each of the 14 words below. This would be easy, but the potential downside is that some of the differences are big: e.g. "d:" has one positive and one negative instance, and the larger value for "sob" is more than double the smaller one.
  3. Update the dictionary to match your intuition. A case-by-case approach wouldn't take long since there are only 14 instances, and a standard approach (e.g. averaging the two values, as in the sketch below) would also be simple.
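
If averaging is the preferred route, a one-pass cleanup script would be enough. A rough sketch, assuming the usual four-column layout of vader_lexicon.txt (token, mean sentiment, std. dev., raw ratings); the function name and output path are placeholders:

```python
from collections import defaultdict

# Rough sketch of the averaging approach from option 3: collapse duplicate
# tokens into a single line whose sentiment is the mean of the duplicates,
# keeping the remaining columns (std. dev., raw ratings) from the first
# occurrence.
def dedupe_lexicon(src="vader_lexicon.txt", dst="vader_lexicon_deduped.txt"):
    values = defaultdict(list)
    extras = {}
    order = []
    with open(src, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            parts = line.rstrip("\n").split("\t")
            word = parts[0]
            if word not in extras:
                order.append(word)
                extras[word] = parts[2:]
            values[word].append(float(parts[1]))
    with open(dst, "w", encoding="utf-8") as f:
        for word in order:
            mean = sum(values[word]) / len(values[word])
            f.write("\t".join([word, str(round(mean, 2))] + extras[word]) + "\n")
```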

Obviously it's your call, but I didn't see this in any other Issues or Pull Requests so I wanted to surface it. I'm happy to chat or help in any way I can.

| line number | word | sentiment |
| --- | --- | --- |
| 120 | :-p | 1.2 |
| 124 | :-p | 1.5 |
| 227 | d: | -2.9 |
| 1740 | d: | 1.2 |
| 230 | d= | -3 |
| 1741 | d= | 1.5 |
| 234 | fav | 2.4 |
| 2831 | fav | 2 |
| 301 | lmao | 2 |
| 4399 | lmao | 2.9 |
| 305 | lol | 2.9 |
| 4406 | lol | 1.8 |
| 320 | muah | 2.8 |
| 4730 | muah | 2.3 |
| 342 | o.o | -0.6 |
| 4853 | o.o | -0.8 |
| 352 | ok | 1.6 |
| 4895 | ok | 1.2 |
| 385 | sob | -2.8 |
| 6188 | sob | -1 |
| 411 | x-d | 2.7 |
| 7489 | x-d | 2.6 |
| 412 | x-p | 1.8 |
| 7490 | x-p | 1.7 |
| 413 | xd | 2.7 |
| 7491 | xd | 2.8 |
| 417 | xp | 1.2 |
| 7492 | xp | 1.6 |
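
For anyone who wants to re-check the list, a short script along these lines should enumerate the conflicting entries (the path and function name are just placeholders; printed line numbers are 1-based):

```python
from collections import defaultdict

# Sketch of a check that lists every token appearing more than once in
# vader_lexicon.txt with differing sentiment values.
def find_conflicts(path="vader_lexicon.txt"):
    seen = defaultdict(list)  # token -> [(line_no, sentiment), ...]
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue
            token, measure = line.rstrip("\n").split("\t")[0:2]
            seen[token].append((line_no, float(measure)))
    for token, entries in seen.items():
        if len({value for _, value in entries}) > 1:
            for line_no, value in entries:
                print(line_no, token, value)

find_conflicts()
```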
TjallingO commented 3 years ago

This issue stumped me as well during the development of my own port. There are even more duplicates, like

| line no. | element | sentiment |
| --- | --- | --- |
| 342 | o.o | -0.6 |
| 4853 | o.o | -0.8 |

I worked around the issue by letting subsequent entries replace existing mappings, thus keeping the original lexicon file intact. However, as you mentioned, this does not seem like a sustainable solution. I would really appreciate a follow-up from @cjhutto or any of the other co-authors on what the most appropriate permanent option would be.