Open julialedur opened 5 years ago
Damn! Just discovered this piece by The Pudding. They've even done a tweet generator using a predictive text algorithm to write tweets from Dwight's mind.
I'm not even assigned to write something here by the annoying bot, just stopped by to say that this is the coolest thing EVER.
@jlstro hahahaha I knew you were gonna say something here. Was just waiting for it 😆 btw I accept suggestions of analyses that you, as a The Office fan, would like to see
Oh no the gif isn't even looping :(
just saw this!!!!!!!!!!!!!!!!!!!!!! WHAT! IS! UP!
You all might like this less scientific analysis of The Office as well
ah the pudding was there. there still is room to do this. i'd like to see who talks the most in each show by season, and when new characters show up and drop out.
imagining a dot plot over time, with scaled dots...
A lot of coding limitations!
It took me a while to figure out how to do the sentiment analysis, now I'm going to move to the chord diagram. However, I still don't know how to measure how many times the characters mentioned each other.
I'm also going to try to replicate the "That's what she said" graph by The Pudding.
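On measuring how many times the characters mention each other: one possible approach is to search every line for each character's first name and tally (speaker, mentioned) pairs. This is only a rough sketch. The `speaker` and `line_text` field names are assumptions about the spreadsheet's columns, and the three rows are made-up toy data, not lines from the data set.

```python
import re
from collections import Counter

# Toy rows standing in for the spreadsheet (hypothetical column names and text).
lines = [
    {"speaker": "Michael", "line_text": "Dwight, come here. Jim, you too."},
    {"speaker": "Jim", "line_text": "Dwight did it. Ask Dwight."},
    {"speaker": "Dwight", "line_text": "Michael, Jim is lying."},
]


def count_mentions(rows, names):
    """Count (speaker, mentioned name) pairs, one count per occurrence."""
    # \b keeps "Jim" from matching inside a longer word such as "Jimmy".
    patterns = {
        name: re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for name in names
    }
    counts = Counter()
    for row in rows:
        for name, pattern in patterns.items():
            if name == row["speaker"]:
                continue  # ignore characters saying their own name
            hits = len(pattern.findall(row["line_text"]))
            if hits:
                counts[(row["speaker"], name)] += hits
    return counts


mentions = count_mentions(lines, ["Michael", "Jim", "Dwight"])
for (speaker, mentioned), n in mentions.most_common():
    print(f"{speaker} -> {mentioned}: {n}")
```

The resulting pair counts are the kind of source/target/weight table a chord or sankey diagram needs. Nicknames and last names ("Halpert", "Mr. Scott") would be missed by this first-name match, so the name list would probably need extending.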
@sarahslo do you mean like a scatter plot with the seasons on the x-axis and the number of lines on the y-axis? I like this idea, if that's what you mean
No.
No.
hi yes, i meant like a scatterplot.
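For the scatterplot idea (seasons on the x-axis, number of lines on the y-axis), the numbers behind it could be built roughly like this. It's a minimal sketch assuming each row has `season` and `speaker` fields, which are guesses at the column names, and the rows are made-up toy data.

```python
from collections import Counter

# Toy rows standing in for the spreadsheet (hypothetical column names and values).
rows = [
    {"season": 1, "speaker": "Michael"},
    {"season": 1, "speaker": "Michael"},
    {"season": 1, "speaker": "Jim"},
    {"season": 3, "speaker": "Michael"},
    {"season": 3, "speaker": "Andy"},
]


def lines_per_season(rows):
    """Number of lines for each (season, speaker) pair."""
    return Counter((row["season"], row["speaker"]) for row in rows)


def first_and_last_season(counts):
    """First and last season in which each speaker has at least one line."""
    spans = {}
    for season, speaker in counts:
        first, last = spans.get(speaker, (season, season))
        spans[speaker] = (min(first, season), max(last, season))
    return spans


counts = lines_per_season(rows)
spans = first_and_last_season(counts)

for (season, speaker), n in sorted(counts.items()):
    print(f"season {season}: {speaker} has {n} lines")
```

Each (season, speaker, count) triple is one dot, with the count available for scaling the dot size, and `spans` shows when a character shows up and drops out.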
ok here are some tips & tricks.
for the positive and negative you want to get the colors super clear, even put some white in between them for neutral.
if you want to colorize the images by sentiment, the best way to do it is to desaturate the photo first (in photoshop it's image>adjust>desaturate)
and then in illustrator you overlay the color by going to transparency>overlay.
you'll get better results.
nice work tho! (that's what she said.)
Pitch
Summary
This week I want to analyze dialogue from The Office. I found a data set containing every line that every character has said in every scene of every episode of the TV show. I want to look at who speaks the most, who each character mentions the most, and also do some text analysis to find repeating patterns in the characters' lines. I was inspired by this project by a former Lede student.
Details
Possible headline(s): "That's what she said" - an analysis of every line of The Office
Data set(s): this spreadsheet, found on Reddit. Original data source is this wonderful website.
Code repository: https://github.com/julialedur/data-studio/tree/master/code/06-the-office-lines
Possible problems/fears/questions:
I will have to refresh my memory on text analysis and maybe regex(?), even though what I'm aiming for here isn't really rocket science. I will also have to learn how to make a sankey graph to show how frequently the characters mention each other.
Work so far
I just changed my mind about my project topic and I have no graphs yet. But I'll come up with something in the next few days.
Checklist
This checklist must be completed before you submit your draft.