DisasterMasters / TweetAnalysis

Repository for storing the code used to analyse the tweets collected from the Twitter scraper

Coding #9

audrism opened this issue 5 years ago

audrism commented 5 years ago

Sai did batch 3 (see the data channel), coded for emotion + sentiment + 8 emotion types: 180 tweets

Alexa did two classes, 1-8 and 20-29: 130 tweets in the first week

saithat commented 5 years ago

- 2/25: did some more coding; at 220 now.
- 2/27: tweets 501-729, emotion + sentiment.
- 3/1 and 3/4: tweets 1-223 and 501-600 coded for relevance, emotion, sentiment, and content; tweets 601-729 coded for relevance, emotion, and sentiment.
- 3/6 and 3/8: relevance coded for tweets 1-746; emotion + sentiment + content for tweets 1-249 and 501-729.
- 3/11: relevance coded for tweets 1-746; emotion + sentiment + content for tweets 1-320 and 501-729.
- 3/13: relevance coded for tweets 1-912; emotion + sentiment + content for tweets 1-320 and 501-729. Changed some of the content coding to match minor changes made to the categories.
- 3/25: relevance coded for all tweets; emotion + sentiment + content for tweets 1-354 and 501-729.
- 3/27: relevance coded for all tweets; emotion + sentiment for tweets 1-729; content for tweets 1-398 and 501-729.
- 3/29: relevance coded for all tweets; emotion + sentiment + content for tweets 1-729.

atipton commented 5 years ago

Coded around 50 tweets for content analysis on 2/25

atipton commented 5 years ago

Did some manual coding for content, about 200 tweets, on 2/26.

I coded >200 tweets as relevant or irrelevant; most (about 80%) were irrelevant. I then went back and coded only the relevant ones for content.

syd-shelby commented 5 years ago

Coded 550 tweets: first for relevance, then for emotion, sentiment, and opinion. Only about 20% of these have been relevant.

audrism commented 5 years ago

Please do cross-rater reliability for the coding.

Add comments (e.g., "this cat sucks").

Report percentages.

audrism commented 5 years ago

Alexa will use Manny's relevance model to train on her and Manny's data

syd-shelby commented 5 years ago

I took 699 tweets coded by both Faiza and myself and compared how closely we coded relevance, whether or not a tweet contained emotion, and the specific emotion. We agreed on relevance for ~81.3% of the tweets. Of the tweets we agreed were relevant, we agreed on whether they contained emotion for ~85.7%. Of the tweets we agreed were relevant and contained emotion, we agreed on the specific emotion for ~44.3%. This last number is a bit low, but it makes sense for a few reasons. First, it doesn't take multiple labels into account: if I marked a tweet as angry and she marked it as angry and sad, it wouldn't count as agreement. Second, for emotions that are similar in nature, we each tend to favor one. For example, I was more likely to label an emotion as angry, while she was more likely to label one as disappointed.
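For reference, here is a minimal sketch of how these hierarchical agreement percentages could be computed. The file name and column layout are assumptions for illustration, not the repo's actual format:

```python
import pandas as pd

# Hypothetical layout: one row per tweet, each coder's labels in
# separate columns (suffixes _a and _b). All names are assumed.
df = pd.read_csv("coded_tweets.csv")

# Level 1: agreement on relevance across all tweets.
rel_agree = df["relevance_a"] == df["relevance_b"]
print(f"Relevance agreement: {rel_agree.mean():.1%}")

# Level 2: among tweets both coders marked relevant, agreement on
# whether the tweet contains emotion.
both_relevant = df[(df["relevance_a"] == 1) & (df["relevance_b"] == 1)]
emo_agree = both_relevant["has_emotion_a"] == both_relevant["has_emotion_b"]
print(f"Emotion-presence agreement: {emo_agree.mean():.1%}")

# Level 3: among tweets both coders agreed contain emotion, agreement
# on the specific emotion label. Exact match only, so multi-label
# cases like "angry,sad" vs. "angry" count as disagreement, as noted above.
both_emotional = both_relevant[
    (both_relevant["has_emotion_a"] == 1) & (both_relevant["has_emotion_b"] == 1)
]
label_agree = both_emotional["emotion_a"] == both_emotional["emotion_b"]
print(f"Specific-emotion agreement: {label_agree.mean():.1%}")
```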

atipton commented 5 years ago

Finished coding a batch of 1800 tweets for relevance, and also did a little bit (~90 tweets) of content coding for Xiaojing. On Friday (03/01) I worked with Manny on editing code specifically to train on these tweets to find relevance, so I will work on testing that tomorrow.
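The actual training code isn't shown in this thread; as a rough illustration of the kind of relevance classifier involved, a TF-IDF + logistic regression baseline in scikit-learn might look like the sketch below. The file name and column names (`text`, `relevant`) are assumptions:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical input: hand-coded tweets with a binary relevance label.
df = pd.read_csv("coded_tweets.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["relevant"], test_size=0.2, random_state=42
)

# TF-IDF features (unigrams + bigrams) feeding a logistic regression.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(f"Held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.1%}")
```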

atipton commented 5 years ago

Right now I am getting an accuracy of about 80% on relevance training, so I am working with Manny on improving it.
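One caveat worth checking: since roughly 80% of the tweets were coded irrelevant (see the comments above), a classifier that always predicts "irrelevant" would also score around 80% accuracy, so per-class precision and recall are more informative here. A sketch, continuing the hypothetical pipeline above (`model`, `X_train`, etc. refer to that sketch):

```python
from sklearn.dummy import DummyClassifier
from sklearn.metrics import classification_report

# Majority-class baseline: always predicts the most frequent label.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print(classification_report(y_test, baseline.predict(X_test)))

# Compare per-class metrics for the trained relevance model.
print(classification_report(y_test, model.predict(X_test)))
```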