cjhutto / vaderSentiment

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
MIT License
4.43k stars 1k forks

Added new lexicons used in CryptoTwitter #81

Open JaimeBadiola opened 5 years ago

JaimeBadiola commented 5 years ago

Updated the lexicon to include common expressions used in CryptoTwitter. Sentiment was scored by 10 independent users who are familiar with the expressions.
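For context, existing rows in vader_lexicon.txt carry a token, a mean rating, a standard deviation, and the ten raw ratings. Below is a minimal sketch of how the two summary columns could be derived from ten ratings. It assumes the population standard deviation, which reproduces the existing `}:(` row quoted later in this thread; the helper name `summarize_ratings` is hypothetical and is not part of vaderSentiment.

```python
from statistics import fmean, pstdev


def summarize_ratings(ratings):
    """Return (mean, population std dev), rounded to 1 and 5 decimals.

    Assumption: the lexicon's third column is a population standard
    deviation; this matches the existing `}:(` row but is not confirmed
    by the project for every entry.
    """
    return round(fmean(ratings), 1), round(pstdev(ratings), 5)


# Ratings copied from the existing `}:(` row of vader_lexicon.txt.
ratings = [-3, -1, -2, -1, -3, -2, -2, -2, -2, -2]
mean, std = summarize_ratings(ratings)

# Build a tab-separated row in the same four-column layout.
row = "\t".join(["}:(", str(mean), str(std), str(ratings)])
print(row)
```

The rows added by this pull request list only the first two numeric columns, so the raw ratings behind them cannot be re-checked this way.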

cjhutto commented 5 years ago

Thanks Jaime! This looks very promising. However, I'm having difficulty reviewing it because, according to the diff analysis on GH, you've changed every single line in each of the three most critical files used for VADER sentiment. Can you point me to the specific changes? (I think they are just some additions, correct?)

dandelionred commented 4 years ago

@cjhutto The files in the pull request use DOS (CRLF) line endings, which is why it looks like absolutely everything has been changed.
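A minimal sketch of the same idea in Python, for anyone who wants to reproduce the comparison without external tools: convert both texts to LF line endings first, then diff them. The function name `normalized_diff` and the sample strings are illustrative assumptions, not part of the pull request.

```python
import difflib


def normalized_diff(old_text, new_text, old_name="original", new_name="modified"):
    """Return a unified diff of two texts after converting CRLF/CR to LF."""

    def to_lf(text):
        # Handle CRLF first so a lone CR replacement does not double up.
        return text.replace("\r\n", "\n").replace("\r", "\n")

    old_lines = to_lf(old_text).splitlines(keepends=True)
    new_lines = to_lf(new_text).splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines, old_name, new_name))


# Illustrative input: the second text uses CRLF and adds one line.
old = "bear\t-1.3\nbull\t1.8\n"
new = "bear\t-1.3\r\nbull\t1.8\r\nhodl\t0\r\n"

for line in normalized_diff(old, new):
    print(line, end="")
```

When reading real files for this, opening them with `newline=""` keeps the original line endings intact so that the conversion step is explicit rather than done silently by Python.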

Here is the diff with line endings converted to unix:

diff -ur vaderSentiment.org/vader_lexicon.txt vaderSentiment.jaime/vader_lexicon.txt
--- vaderSentiment.org/vader_lexicon.txt    2020-08-19 16:07:36.382822907 +0300
+++ vaderSentiment.jaime/vader_lexicon.txt  2020-08-19 16:10:17.397542016 +0300
@@ -7514,4 +7514,26 @@
 }:(    -2.0    0.63246 [-3, -1, -2, -1, -3, -2, -2, -2, -2, -2]
 }:)    0.4 1.42829 [1, 1, -2, 1, 2, -2, 1, -1, 2, 1]
 }:-(   -2.1    0.7 [-2, -1, -2, -2, -2, -4, -2, -2, -2, -2]
-}:-)   0.3 1.61555 [1, 1, -2, 1, -1, -3, 2, 2, 1, 1]
\ No newline at end of file
+}:-)   0.3 1.61555 [1, 1, -2, 1, -1, -3, 2, 2, 1, 1]
+bulls  1.9 1.86
+bull   1.8 1.682
+bullish    2.3 1.5798
+whales -1.1    1.9138
+support    1   1.9322
+resistance 0.3 2.1756
+bear   -1.3    1.8797
+bearish    -1.4    1.2042
+short  -0.8    1.5213
+long   1.3 1.6375
+bounce 1.1 1.6854
+rekt   -2.2    2.4404
+arbitrage  0.4 1.9633
+manipulation   -2.7    1.2721
+bot    -0.9    2.1833
+strategy   1.5 1.9679
+SEC    0   1.4142
+regulations    -1.2    1.6865
+FUD    -1.9    1.912
+ICO    -0.4    2.1705
+CNBC   -2.1    2.0276
+hodl   0   2.357
\ No newline at end of file
diff -ur vaderSentiment.org/vaderSentiment.py vaderSentiment.jaime/vaderSentiment.py
--- vaderSentiment.org/vaderSentiment.py    2020-08-19 16:07:36.378823238 +0300
+++ vaderSentiment.jaime/vaderSentiment.py  2020-08-19 16:10:17.397542016 +0300
@@ -18,7 +18,6 @@
 import json
 from itertools import product
 from inspect import getsourcefile
-from io import open

 # ##Constants##

@@ -72,7 +71,10 @@
                           "back handed": -2, "blow smoke": -2, "blowing smoke": -2,
                           "upper hand": 1, "break a leg": 2,
                           "cooking with gas": 2, "in the black": 2, "in the red": -2,
-                          "on the ball": 2, "under the weather": -2}
+                          "on the ball": 2, "under the weather": -2, "Bull Market": 2.3,
+                          "All time high": 2.3, "Trading analysis": 1, "Short squeeze": 0.6,
+                          "Closing a long": 1.6, "Closing a short": -0.1, "Opening a long": 1.3,
+                          "Opening a short": 0.9, "flip a coin": 0.6}

 # check for special case idioms containing lexicon words
 SPECIAL_CASE_IDIOMS = {"the shit": 3, "the bomb": 3, "bad ass": 1.5, "yeah right": -2,