Closed: PythEsc closed this issue 7 years ago
For this task, we would need a mapping from emotions to reactions and also from reactions to sentiment.
I've just run some code that calculates the correlation between the results of our emotion mining and the actual users' reactions.
Average absolute difference
| | ANGER | ANTICIPATION | DISGUST | FEAR | JOY | SADNESS | SURPRISE | TRUST |
|---|---|---|---|---|---|---|---|---|
| LOVE | 0.881 | 0.609 | 0.962 | 0.890 | 0.600 | 0.749 | 0.696 | 0.616 |
| WOW | 0.180 | 0.384 | 0.185 | 0.199 | 0.359 | 0.206 | 0.236 | 0.380 |
| HAHA | 0.815 | 1.084 | 0.915 | 0.826 | 1.010 | 0.788 | 0.851 | 1.113 |
| SAD | 0.247 | 0.789 | 0.242 | 0.293 | 0.723 | 0.346 | 0.469 | 0.757 |
| ANGRY | 0.636 | 1.007 | 0.657 | 0.674 | 1.028 | 0.720 | 0.890 | 0.977 |
Unfortunately, one cannot see any good correlation between the emotion mining results and the actual reactions... Either there is still a bug in my code or the emotion mining and the reactions really do not have any correlation.
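Since a question below asks for "pure correlation numbers", a plain per-pair Pearson correlation could complement the AAD values. A minimal sketch, assuming we have one emotion ratio and one reaction ratio per post (the sample lists are hypothetical):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no meaningful correlation
    return cov / (var_x ** 0.5 * var_y ** 0.5)

# hypothetical per-post ratios: share of ANGER words vs. share of ANGRY reactions
anger_word_ratio = [0.10, 0.40, 0.05, 0.30]
angry_reaction_ratio = [0.12, 0.35, 0.02, 0.25]
print(round(pearson(anger_word_ratio, angry_reaction_ratio), 3))
```

A value near +1 would mean the emotion miner tracks the reactions; values near 0 would support the "pretty much random" worry voiced later in the thread.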
I've just added the "help wanted" label because I'd like somebody to review the `emotion_analysis.py` file. I want to be sure those results are not just some dumb bug ;) Whoever does the review can write their comments here and remove the label again.
EDIT: We could try to find a correlation between the emotions and the reactions using the other emotion lexicon that Jerry mentioned. If this still does not help and we cannot find a bug in my evaluation code, we will have to evaluate (somehow) the precision of our emotion miner. In my opinion there should be a visible correlation between the reactions and emotions, but since it is not visible in the current statistics, there are three possible reasons for that:
EDIT2: Looking into the lexicon, I just saw that it is a sentiment lexicon and not an emotion lexicon. Therefore we cannot improve our emotion miner with that one :(
Well, I am a bit confused about the "average absolute difference" / AAD. What are the pure correlation numbers?
e.g. I guess it's good that "angry" has the lowest AAD with "anger". At the same time, shouldn't "sad" have the lowest AAD with "sadness"?
Would it help to look at some handpicked posts, to see what is going on (and why...)? Could it be because not many words are related to an emotion (or sentiment)? I am not sure if I mentioned it before, but there is a recent approach (see attached) that uses pre-trained word embeddings + an emotion lexicon to "annotate" a corpus.
Exploiting a Bootstrapping Approach for Automatic Annotation of Emotions in Texts.pdf
> Well, I am a bit confused about the "average absolute difference" / AAD. What are the pure correlation numbers? e.g. I guess it's good that "angry" has the lowest AAD with "anger". At the same time, "sad" should have the lowest AAD with "sadness"?
Yes, exactly: the lower the value, the better the match. Looking at the row, ANGRY "matches" ANGER since it has the lowest value in the row (which is still quite high). Looking at the column, however, WOW matches ANGER even better (with a value of only 0.180), which does not make much sense?
It's "averaged" because I divided the summed difference by the number of differences that contributed to that sum: whenever the difference between the two ratios (emotion_ratio and reaction_ratio) of a post was non-zero, I increased a counter by one and added the difference to the total. At the end I divided the total difference by the counter.
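The procedure just described could be sketched like this (the ratio lists are hypothetical, one entry per post):

```python
def average_absolute_difference(emotion_ratios, reaction_ratios):
    """AAD as described above: average of |emotion_ratio - reaction_ratio|
    over all posts where that difference is non-zero."""
    total, count = 0.0, 0
    for e, r in zip(emotion_ratios, reaction_ratios):
        diff = abs(e - r)
        if diff != 0:
            total += diff   # add the difference to the running sum
            count += 1      # count only the non-zero differences
    return total / count if count else 0.0

# hypothetical example: diffs are 0, 0.4 and 0.3, averaged over the 2 non-zero ones
print(average_absolute_difference([0.2, 0.5, 0.3], [0.2, 0.1, 0.6]))
```

Note that skipping zero differences biases the average upward: posts where the miner agrees perfectly with the reactions do not pull the score down.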
> Would it help to look at some handpicked posts, to see what is going on (and why...)? Could it be because not many words are related to an emotion (or sentiment)?
Yes, I guess we have to label some data manually, compare the results of our emotion miner with the handcrafted labels, and calculate some performance score to evaluate the miner. I really don't know where the error is located because there are so many possibilities: wrong results because we do not have negation handling, an unbalanced lexicon (e.g. maybe there are a lot more words for one emotion than for another), a lexicon that is simply wrong or does not work for our domain, a wrong evaluation on my side, ...
I have already thought about bootstrapping our emotion labels with a similar approach. Theirs is close to ours: they use CoreNLP and EmoLex. This sounds promising, but I wonder: would the results get better if we also took negation handling into consideration in this approach? I may have to think about that.
But indeed, it would be interesting to see if this bootstrapping approach would improve the results.
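The core idea of the embedding-based bootstrapping could be sketched roughly as follows; the tiny 2-d "embeddings", seed lexicon, and threshold are all made up for illustration (a real run would use pre-trained vectors and EmoLex seeds):

```python
import math

# toy word vectors standing in for pre-trained embeddings
EMBEDDINGS = {
    "delicious": (0.9, 0.1),
    "tasty": (0.85, 0.2),
    "awful": (-0.8, 0.3),
}
SEED_LEXICON = {"delicious": "joy"}  # hypothetical seed entries from the lexicon

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def expand(threshold=0.95):
    """Label unseen words with the emotion of their most similar seed word,
    keeping only matches above the similarity threshold."""
    expanded = dict(SEED_LEXICON)
    for word, vec in EMBEDDINGS.items():
        if word in expanded:
            continue
        best = max(SEED_LEXICON, key=lambda s: cosine(vec, EMBEDDINGS[s]))
        if cosine(vec, EMBEDDINGS[best]) >= threshold:
            expanded[word] = SEED_LEXICON[best]
    return expanded

print(expand())  # "tasty" gets labelled "joy"; "awful" stays unlabelled
```

This is only the lexicon-expansion half of such an approach; the attached paper should be consulted for the actual annotation procedure.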
> This sounds promising, but I wonder: would the results get better if we also took negation handling into consideration in this approach?
Well, at the moment a sentence like "Your salad is really not delicious" would get a positive sentiment/emotion, since "delicious" is associated with positive emotions/sentiment. I think negation handling would slightly improve the results, but I am not sure if that alone is enough. I guess we really have to label some data manually and see if our miner fits those labels.
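A very rough sketch of how window-based negation handling could sit on top of a lexicon lookup; the tiny lexicon, the negation cue list, and the window size are made up for illustration:

```python
NEGATIONS = {"not", "no", "never", "n't"}
# tiny hypothetical lexicon: word -> (emotion, weight)
LEXICON = {"delicious": ("joy", 1.0), "awful": ("disgust", 1.0)}
WINDOW = 3  # invert lexicon hits up to 3 tokens after a negation cue

def score(tokens):
    """Sum lexicon weights per emotion, inverting hits inside a negation window."""
    scores = {}
    negated_until = -1
    for i, tok in enumerate(tokens):
        if tok.lower() in NEGATIONS:
            negated_until = i + WINDOW
            continue
        hit = LEXICON.get(tok.lower())
        if hit:
            emotion, weight = hit
            if i <= negated_until:
                weight = -weight  # inside the window: invert the contribution
            scores[emotion] = scores.get(emotion, 0.0) + weight
    return scores

# "delicious" comes right after "not", so its joy weight is inverted
print(score("your salad is really not delicious".split()))  # {'joy': -1.0}
```

A fixed window is crude (it misses scope boundaries like "not bad, but boring"), but even this would stop the example sentence from scoring as purely positive.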
In general, we do not have any information about the validity of our sentiment/emotion analysis since we have no real labeled data. Maybe the results are pretty much random and hence no better than any baseline. I guess the sentiment analysis has a higher accuracy, since we are using CoreNLP for that task, which does a lot more than a simple dictionary lookup.
Isn't "not" yielding some negativity at least?
EmoLex does not cover this kind of word, and we did not include any negation handling for the emotion mining.
The Stanford CoreNLP sentiment analysis (https://stanfordnlp.github.io/CoreNLP/sentiment.html#description) states on its information page (https://nlp.stanford.edu/sentiment/) that the implementation does consider negation handling. But the result is still unsatisfactory. Maybe it is really necessary to come up with our own implementation, or to use a dedicated negation handling framework, to get better results?
Stanford NLP sentiment analysis is state-of-the-art and, due to the recursive/dependency parse structure they use, it is able to handle negations as well.
Given the nature of social media text, I think you should somehow include negation handling in your (baseline?) model as well.
There should be some literature on this as well: https://www.aclweb.org/anthology/W/W10/W10-3111.pdf http://www.aclweb.org/anthology/W15-2914
and perhaps some "lists": http://ptrckprry.com/course/ssd/data/negative-words.txt
BTW, StanfordNLP concludes that the phrase "Your salad is really not delicious" is "neutral".
> BTW, StanfordNLP concludes that the phrase "Your salad is really not delicious" is "neutral".
Hmm, I tested the same phrase and for me it was "NEGATIVE". Since I am not at home at the moment, I used the online CoreNLP API.
> Stanford NLP sentiment analysis is state-of-the-art and, due to the recursive/dependency parse structure they use, it is able to handle negations as well.
Yeah, that's what I thought. I guess the sentiment results are fine but our emotion mining is not. Either we'll have to find a library that supports emotion extraction in a pipeline, or we'll have to write our own pipeline including negation handling plus the additional steps we are already doing (e.g. lemmatization).
I used this one: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
Ah ok, that's pretty much neutral with a slight tendency towards negative; maybe that's why the site I used recognizes it as negative.
We could use the MPQA Opinion Corpus to evaluate our emotion miner. I haven't read the whole description yet, but it looks promising.
I found out that the different results between the online demo and the version in the CoreNLP package might be caused by an outdated trained model in the package. One guy mentioned that he retrained the model for 24 hours (...) over about 750 epochs and came close to the online demo version.
So yeah, I guess that's what we should do as well. Or we email those guys and ask for their current model.
How are you planning to use the MPQA corpus? Are there baseline results on this?
Bruno implemented a linear model to combine both the NNs and the emotions.