jsoma / data-studio-projects


"That's what she said" - an analysis of every line of The Office #288

Open julialedur opened 5 years ago

julialedur commented 5 years ago

Pitch

Summary

This week I want to analyze dialogue from The Office. I found a data set containing every line that every character has said in every scene of every episode of the TV show. I want to look at who speaks the most and who each character mentions the most, and also do some text analysis to find repeating patterns in the characters' lines. I was inspired by this project by a former Lede student.

Details

Possible headline(s): "That's what she said" - an analysis of every line of The Office

Data set(s): this spreadsheet, found on Reddit. Original data source is this wonderful website.

Code repository: https://github.com/julialedur/data-studio/tree/master/code/06-the-office-lines

Possible problems/fears/questions:

I will have to refresh my memory on text analysis and maybe regex(?), even though what I'm aiming for here isn't really rocket science. I will also have to learn how to make a Sankey diagram to show how often the characters mention each other.
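A rough sketch of how the mention counting could work with regex, using only Python's standard library. The sample rows, the `(speaker, text)` layout, and the character list are made up for illustration; the real spreadsheet's columns may look different.

```python
import re
from collections import Counter

# Hypothetical sample rows as (speaker, line_text) pairs.
# The real data set's layout is an assumption here.
lines = [
    ("Michael", "Dwight, get in here. Where is Jim?"),
    ("Dwight", "Michael, Jim put my stapler in jello again."),
    ("Jim", "I have no idea what Dwight is talking about."),
    ("Pam", "Michael, your 2 o'clock is here."),
]

# Hypothetical list of character names to look for.
characters = ["Michael", "Dwight", "Jim", "Pam"]

# One pattern that matches any character name as a whole word.
name_pattern = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in characters) + r")\b"
)


def count_mentions(rows):
    """Return a Counter keyed by (speaker, mentioned_character) pairs."""
    mentions = Counter()
    for speaker, text in rows:
        for name in name_pattern.findall(text):
            if name != speaker:  # skip characters naming themselves
                mentions[(speaker, name)] += 1
    return mentions


mentions = count_mentions(lines)
for (source, target), count in sorted(mentions.items()):
    print(source, "->", target, count)
```

Each `(source, target, count)` triple is roughly the shape of link data that Sankey tools tend to expect, so this could feed the chart directly. Matching on first names only will miss nicknames and last names, so the name list would probably need some manual tuning.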

Work so far

I just changed my mind about my project topic and I have no graphs yet. But I'll come up with something in the next few days.

Checklist

This checklist must be completed before you submit your draft.

julialedur commented 5 years ago

Damn! Just discovered this piece by The Pudding. They've even done a tweet generator using a predictive text algorithm to write tweets from Dwight's mind.

jlstro commented 5 years ago

I'm not even assigned to write something here by the annoying bot, just stopped by to say that this is the coolest thing EVER.

[image]

julialedur commented 5 years ago

@jlstro hahahaha I knew you were gonna say something here. Was just waiting for it 😆 btw I accept suggestions of analyses that you, as a The Office fan, would like to see

jlstro commented 5 years ago

Oh no the gif isn't even looping :(

hakantan commented 5 years ago

just saw this!!!!!!!!!!!!!!!!!!!!!! WHAT! IS! UP!

kevinlitman-navarro commented 5 years ago

You all might like this less scientific analysis of The Office as well

sarahslo commented 5 years ago

ah the pudding was there. there still is room to do this. i'd like to see who talks the most in each show by season, and when new characters show up and drop out.

imagining a dot plot over time, with scaled dots...
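A minimal sketch of the numbers that kind of chart would need: line counts per character per season (for the dot sizes) plus each character's first and last season (for when they show up and drop out). The `(season, speaker)` rows are invented sample data, not the real spreadsheet.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows as (season, speaker) pairs, one per spoken line.
rows = [
    (1, "Michael"), (1, "Michael"), (1, "Dwight"), (1, "Jim"),
    (2, "Michael"), (2, "Jim"), (2, "Jim"), (2, "Andy"),
]


def lines_per_season(rows):
    """Map each season to a Counter of lines spoken per character."""
    counts = defaultdict(Counter)
    for season, speaker in rows:
        counts[season][speaker] += 1
    return counts


def first_and_last_season(counts):
    """Map each character to (first_season, last_season) in which they speak."""
    spans = {}
    for season in sorted(counts):
        for speaker in counts[season]:
            first, _ = spans.get(speaker, (season, season))
            spans[speaker] = (first, season)
    return spans


counts = lines_per_season(rows)
spans = first_and_last_season(counts)

# Character with the most lines in each season.
top_talkers = {season: c.most_common(1)[0][0] for season, c in counts.items()}
```

For the dot plot, `counts[season][speaker]` would set the dot size at each season, and `spans` marks where each character's row of dots starts and stops.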

julialedur commented 5 years ago

Update

[image: lines-character-show]

[image: sentiment-analysis-vertical]

Any changes in direction or topic?

Problems/Questions

Checklist

julialedur commented 5 years ago

Update

[image: thats-what-she-said]

Any changes in direction or topic?

No.

Problems/Questions

No.

Checklist

sarahslo commented 5 years ago

hi yes, i meant like a scatterplot. ok here are some tips & tricks: [screenshot]. for the positive and negative you want to get the colors super clear, even put some white in between them for neutral.
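one way to sketch that color idea in code: a diverging scale that fades from a clear negative color through white at neutral to a clear positive color. this assumes sentiment scores run from -1 to 1, which may not match the scores actually used, and the red and blue endpoints are just example picks.

```python
# Endpoint colors as RGB tuples. These particular shades are illustrative
# choices, not the ones used in the project.
NEGATIVE = (178, 24, 43)    # red
NEUTRAL = (255, 255, 255)   # white
POSITIVE = (33, 102, 172)   # blue


def lerp(start, end, t):
    """Linearly interpolate between two channel values, rounded to an int."""
    return round(start + (end - start) * t)


def sentiment_to_hex(score):
    """Map a sentiment score (assumed range -1..1) to a hex color.

    0 maps to white; the color gets stronger as the score moves
    toward either extreme. Scores outside the range are clamped.
    """
    score = max(-1.0, min(1.0, score))
    end = POSITIVE if score >= 0 else NEGATIVE
    strength = abs(score)
    r, g, b = (lerp(n, e, strength) for n, e in zip(NEUTRAL, end))
    return f"#{r:02x}{g:02x}{b:02x}"


print(sentiment_to_hex(-1), sentiment_to_hex(0), sentiment_to_hex(1))
```

keeping white exactly at zero means near-neutral lines stay pale, so only the strongly positive or negative ones pop.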

if you want to colorize the images by sentiment, the best way to do it is to desaturate the photo first (in photoshop it's image > adjustments > desaturate), and then in illustrator overlay the color by going to transparency > overlay. you'll get better results. nice work tho! (that's what she said.) [screenshot]