Open Nek opened 7 years ago
Hi Nek, this is not an issue in this project's code. I've tested it [1], and the problem is in the original Python implementation from Hutto and the NLTK team. I recommend that you point it out to them so we can all benefit from the improvement.
Anyway, thanks a lot for the report! I'll mark it as an improvement to be scheduled. And yes, I also picked VADER because I'm integrating it with another project :-)
Cheers!
[1]
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.sentiment.vader import SentimentIntensityAnalyzer
>>> sid = SentimentIntensityAnalyzer()
>>> ss = sid.polarity_scores("I don't feel so good")
>>> for k in sorted(ss):
...     print('{0}: {1}\t'.format(k, ss[k]), end='')
...
compound: 0.5777 neg: 0.0 neu: 0.445 pos: 0.555 >>>
>>>
Hi, I looked into this issue. The issue arises from this line which is the implementation of this.
If you use a statement like "I don't feel completely good", you get the correct result, i.e. {negative=0.466, neutral=0.534, positive=0.0, compound=-0.3865}. If you have "so" in place of "completely", the valence is multiplied by 1.25; otherwise it is multiplied by -0.74.
I made a few changes here, but they will break tests and also some checkstyle rules.
I basically added a rule that checks whether the trigram matches <negative word> <some word | so | this> <so | this> and, if it does, adjusts the score accordingly. Previously, we were only handling <never> <some word | so | this> <so | this>. After this change you'll get {negative=0.466, neutral=0.534, positive=0.0, compound=-0.3865}.
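To make the rule easier to discuss, here is a minimal Python sketch of the trigram check described above. It is not the actual change and not the NLTK or Hutto code: the function name, the short negation list, and the decision to apply the -0.74 multiplier to the new negated case are all assumptions for illustration. Only the two multipliers (1.25 and -0.74) and the trigram patterns come from this thread.

```python
# Multipliers quoted in this thread.
NEVER_SO_SCALAR = 1.25    # applied when "so"/"this" follows "never"
NEGATION_SCALAR = -0.74   # applied when the word is treated as negated

# Hypothetical, abbreviated negation list; a real lexicon would be larger.
NEGATION_WORDS = {"not", "don't", "doesn't", "isn't", "can't", "won't"}
SO_THIS = {"so", "this"}


def adjust_valence(valence, tokens, i):
    """Adjust the valence of tokens[i] based on the three tokens before it.

    Patterns checked (literal reading of the rule in this thread):
      <never>         <any word> <so | this>  -> existing rule, boost
      <negative word> <any word> <so | this>  -> proposed rule, flip (assumed)
    """
    if i < 3:
        return valence  # not enough preceding context for a trigram
    first, _second, third = (t.lower() for t in tokens[i - 3:i])
    if third not in SO_THIS:
        return valence
    if first == "never":
        return valence * NEVER_SO_SCALAR
    if first in NEGATION_WORDS:
        return valence * NEGATION_SCALAR
    return valence


# The valence 2.0 for "good" is a made-up value for the example.
tokens = ["I", "don't", "feel", "so", "good"]
print(adjust_valence(2.0, tokens, 4))
```

With this sketch, "don't feel so" before "good" flips the sign of the valence, while "never been so" before "good" still boosts it, which is the behaviour the trigram rule is aiming for.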
Hi Animesh, thanks for taking the time to try a fix/enhancement to this issue.
I took a look at your changes. I believe it will be better to recode it without breaking the tests... ;-) otherwise, we will have to guarantee (by extensive testing) that the implementation is indeed better than the original by Hutto!
Cheers, and tell me what you think...
Thanks for picking this project up. It's a good fit for an app prototype I'm building. Right now I'm playing with the library from a Clojure REPL. It works fine, except there are some problems with "so".