Rostlab / JS16_ProjectD_Group4

Joffrey Baratheon is one of the most loathed characters in TV history. As a matter of fact, people were celebrating his TV death on Twitter. We are interested in learning more about how people feel about different characters by analyzing tweets mentioning GoT characters. In this project you will analyze Twitter feeds across a timeline, look for the names of GoT characters in those feeds, and try to identify whether each tweet is positive or negative. You can then generate a metric that evaluates the accumulated sentiment expressed on Twitter for a given character at a given point in time, and the trend (positive or negative). It will be interesting to intersect the sentiments for characters following the airing of a certain episode (you can easily get the airing date for an episode from the database constructed in Project A).
GNU General Public License v3.0

Number of tweets analyzed #58

Closed gyachdav closed 8 years ago

gyachdav commented 8 years ago

Fellas, as part of the media blitz we're planning, there will be a press release that will throw some big numbers at the readers. Can you provide some impressive statistics about the data your tools processed, e.g. "our crawler fetched 2M tweets and processed 10M sentiment keywords"? Anything that you think might be interesting IS interesting.

julienschmidt commented 8 years ago

I believe this isn't urgent?

Blocking issues: #42 #47

julienschmidt commented 8 years ago

We'll add a JS API function for that. We just need to add up the values from all characters.

julienschmidt commented 8 years ago

gotsentimental.stats() ⇒ Promise.<Object>
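
A minimal sketch of how such a `stats()` function might add up the per-character values. The field names (`total`, `positive`, `negative`) and the `loadCharacterCounts` helper are assumptions made for illustration, not the actual gotsentimental schema or API.

```javascript
// Hypothetical stand-in for a database query returning per-character counts.
function loadCharacterCounts() {
  return Promise.resolve([
    { name: "Jon Snow", total: 1000, positive: 300, negative: 150 },
    { name: "Tywin Lannister", total: 400, positive: 120, negative: 60 },
  ]);
}

// Add up the counts of all characters into a single totals object.
function sumCounts(characters) {
  return characters.reduce(
    (sum, c) => ({
      total: sum.total + c.total,
      positive: sum.positive + c.positive,
      negative: sum.negative + c.negative,
    }),
    { total: 0, positive: 0, negative: 0 }
  );
}

// Promise-returning wrapper, mirroring the `Promise.<Object>` return type above.
function stats() {
  return loadCharacterCounts().then(sumCounts);
}

stats().then((s) => console.log(s));
```

Note that in a sum like this, a tweet mentioning several characters would be counted once per character.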

julienschmidt commented 8 years ago

Currently we have about 1.8 million tweets in our database. For about 44% we were able to determine a sentiment (the sentiment detection still needs a lot of work).

sacdallago commented 8 years ago

If it's computing power you are after, let me know

sacdallago commented 8 years ago

Just saying:

marcusnovotny commented 8 years ago

Filthy casual

julienschmidt commented 8 years ago

cry

sacdallago commented 8 years ago

:D That's cute guys... but that was just one of three machines :laughing:

marcusnovotny commented 8 years ago

@julienschmidt We should change the names of our Stats function. Tywin Lannister doesn't have a lot of friends in the fanbase yet ends up on Spot 2 on Most Popular. Grey Worm is universally liked yet number 1 Hated. The scores come from people enthusiastically tweeting about Tywin's death and episodes when Grey Worm was thought to be in great danger.

We should just call it "Highest Positive Scores" & "Highest Negative Scores" and then provide a small explanation on the About page of what we mean by sentiment scores: that it's hard for us to tell whether someone likes a character just from a tweet, and that we're actually measuring how people express their feelings about the events.

Example:

"I fucking hate Tywin Lannister, his death serves him right": we measure anger.

"I'm extremely glad Tywin Lannister is gone now, he only held his family back": we measure relief.

Yay / Nay?

julienschmidt commented 8 years ago

aggregator/aggregator.js#L208 Fix it

marcusnovotny commented 8 years ago

That's why I proposed the new names, I don't know what else to put?

julienschmidt commented 8 years ago

Both popularity and heat should e.g. value recent tweets more than the total
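
One way to value recent tweets more than older ones is an exponential decay on tweet age. This is only a sketch under assumptions: the half-life of 30 days, the tweet shape (`score`, `time` in milliseconds), and the function names are hypothetical and not taken from aggregator.js.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const HALF_LIFE_DAYS = 30; // hypothetical: a tweet's weight halves every 30 days

// Weight in (0, 1]: 1 for a tweet posted right now, 0.5 after one half-life.
function recencyWeight(tweetTime, now, halfLifeDays = HALF_LIFE_DAYS) {
  const ageDays = Math.max(0, (now - tweetTime) / MS_PER_DAY);
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Sum of sentiment scores, each scaled by how recent the tweet is.
function weightedScore(tweets, now) {
  return tweets.reduce(
    (sum, t) => sum + t.score * recencyWeight(t.time, now),
    0
  );
}
```

With such a weighting, an old spike (like a character's death episode) would fade from the ranking over time instead of dominating it forever.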

marcusnovotny commented 8 years ago

That still doesn't fix my problem that a positive sentiment score on a tweet doesn't equal that character being popular and vice versa

julienschmidt commented 8 years ago

Depends on your definition of "popular". I think a lot of positive mentions is a valid definition.

We could also include how positive / negative sentiment is.
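
To illustrate the difference between counting positive mentions and including how positive they are, here is a small sketch. It assumes each tweet carries a numeric `score` (above 0 positive, below 0 negative, 0 undetermined); that representation and the `summarize` name are assumptions, not the project's actual data model.

```javascript
// Collect both the number of positive/negative tweets and their summed strength.
function summarize(tweets) {
  const result = {
    positiveCount: 0,
    negativeCount: 0,
    positiveSum: 0,
    negativeSum: 0,
  };
  for (const t of tweets) {
    if (t.score > 0) {
      result.positiveCount += 1;
      result.positiveSum += t.score;
    } else if (t.score < 0) {
      result.negativeCount += 1;
      result.negativeSum += -t.score; // store the magnitude as a positive number
    }
    // score === 0: no sentiment determined, so the tweet is ignored here
  }
  return result;
}
```

A ranking by `positiveCount` rewards many mild mentions, while a ranking by `positiveSum` rewards fewer but stronger ones.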

julienschmidt commented 8 years ago

After all our goal is to show such surprises. If they don't like a character, maybe they should tweet about it...

marcusnovotny commented 8 years ago

Well I guess you can see it that way too. When you check why Tywin Lannister is popular and see that the positive spike is from the episode he dies in, it's actually quite funny :smile:

gyachdav commented 8 years ago

Now that all is said and done, can you give us some numbers? Examples:

gyachdav commented 8 years ago

Status?

marcusnovotny commented 8 years ago

Hey Guy!

Our db wasn't completely exhaustive. I'm currently crawling missing characters and am at around 2.75 million tweets, numbers still rising. Of those 2.75 million, we were able to analyze about 1 million: 600k are positive, 420k negative. I'll provide a dump for Project F as soon as I'm done.

Peak season / episode varies for each character. If a character is only active in seasons 2-3, for example, he mostly gets mentioned then. The most extreme peak is for Jon Snow at the season 5 finale, though. Apart from spikes like that, the number of tweets normally rises with each season, since more and more people join Twitter, I suppose.

sacdallago commented 8 years ago

Hey guys! Wasn't there a megadatabase going around somewhere? Something I could use on the www.got.show page?

marcusnovotny commented 8 years ago

Currently crawling the very last characters. Hope to have it up by tonight, tomorrow at the latest.

sacdallago commented 8 years ago

Nice!!

marcusnovotny commented 8 years ago

DB dump coming tomorrow afternoon. Pure excitement :grin:

marcusnovotny commented 8 years ago

Just spotted Lord Varys and Khal Drogo on the blacklist. Currently crawling those, afterwards I'll upload the dump. 2 hours max I guess

marcusnovotny commented 8 years ago

Still crawling Drogo, I fear there's a load of Spanish tweets in there. Let's see how it turns out. Update definitely coming today.

sacdallago commented 8 years ago

:zzz:

marcusnovotny commented 8 years ago

Still on it. Daenerys had a hole in the data. Think I'll drop Drogo, there seem to be a lot of unrelated tweets about him. Will upload afterwards.

sacdallago commented 8 years ago

K :) Just remember to give me the db access later :) And tomorrow we can try it out on www.got.show

sacdallago commented 8 years ago

P.S.: I'll go to bed now coz I'm dead, so don't expect me to answer before tomorrow at some stage :P noonish

marcusnovotny commented 8 years ago

If you don't need it tonight I might crawl Drogo too

sacdallago commented 8 years ago

Nah, take your time! I mean, I'm still waiting for so many other things that I'm an induced procrastinator now.

marcusnovotny commented 8 years ago

That's the spirit.

sacdallago commented 8 years ago

Yeah, and to think that I used to have a life...

marcusnovotny commented 8 years ago

Conceal, don't feel, don't let them know :broken_heart:

julienschmidt commented 8 years ago

Included in the Report