Assessment 3 data: @krsty

For the third assessment I am going to use a dataset of curse words used in Tarantino films:

movie	type	word	minutes_in
Reservoir Dogs	word	dick	0.40
Reservoir Dogs	word	dicks	0.43
Reservoir Dogs	word	fucked	0.55
Reservoir Dogs	word	fucking	0.61
Reservoir Dogs	word	bullshit	0.61
Reservoir Dogs	word	fuck	0.66
Reservoir Dogs	word	shit	0.90
Reservoir Dogs	word	fuck	1.43
Reservoir Dogs	word	dicks	1.56
Reservoir Dogs	word	fuck	1.66

I'm using this dataset because it has a lot of data that can be used in different ways, so I can make multiple visualisations from it. I also find it interesting because I love Tarantino films.

Visualisation ideas

The dataset has timestamps of when the words are said, so I can make a visualisation with a timeline. I can also filter on a certain word and show in which movie that word was said the most. Maybe there are several trends to be discovered in Tarantino's use of bad words; at least, that's what I am hoping to find.
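
A minimal sketch of the filtering idea, assuming the data is loaded as tab-separated text with the columns shown in the sample. The helper name `count_word_per_movie` is made up for illustration, and only the ten sample rows are used, so the result covers Reservoir Dogs alone.

```python
import csv
import io
from collections import Counter

# The ten sample rows from above, tab-separated (not the full dataset).
SAMPLE = """movie\ttype\tword\tminutes_in
Reservoir Dogs\tword\tdick\t0.40
Reservoir Dogs\tword\tdicks\t0.43
Reservoir Dogs\tword\tfucked\t0.55
Reservoir Dogs\tword\tfucking\t0.61
Reservoir Dogs\tword\tbullshit\t0.61
Reservoir Dogs\tword\tfuck\t0.66
Reservoir Dogs\tword\tshit\t0.90
Reservoir Dogs\tword\tfuck\t1.43
Reservoir Dogs\tword\tdicks\t1.56
Reservoir Dogs\tword\tfuck\t1.66
"""


def count_word_per_movie(rows, word):
    """Hypothetical helper: map each movie to how often `word` was said in it."""
    return Counter(row["movie"] for row in rows if row["word"] == word)


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
counts = count_word_per_movie(rows, "fuck")

# most_common() lists the movies with the highest count first.
print(counts.most_common())
```

With the full dataset the same counter would hold one entry per movie, which is what a "which movie said it most" chart needs.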

Type of visualisation

For the filtering of words I am thinking of using a line chart. With a line chart it should be easy to see which words were said the most in which movie. I could maybe also use a donut or pie chart for this.
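
For the donut or pie chart option, the slices could be each word's share of all words counted. A rough sketch, assuming the words are taken from the `word` column of the sample above; `word_shares` is a hypothetical name, not part of any library.

```python
from collections import Counter

# The `word` column of the ten sample rows, in order (not the full dataset).
WORDS = [
    "dick", "dicks", "fucked", "fucking", "bullshit",
    "fuck", "shit", "fuck", "dicks", "fuck",
]


def word_shares(words):
    """Hypothetical helper: fraction of all occurrences per word, largest first."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.most_common()}


shares = word_shares(WORDS)
for word, share in shares.items():
    print(f"{word}: {share:.0%}")
```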

For the timeline I'm thinking of using a scatter plot, because it can show three variables at once: time, movie and which word was said.
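
A small sketch of how the rows could be turned into scatter plot points, assuming time goes on the x axis, each movie gets its own row on the y axis, and the word becomes a label or colour. The function name `to_scatter_points` and the point layout are assumptions for illustration; the rows are a few of the sample rows above.

```python
# A few rows copied from the sample above: (movie, word, minutes_in).
ROWS = [
    ("Reservoir Dogs", "dick", 0.40),
    ("Reservoir Dogs", "fucked", 0.55),
    ("Reservoir Dogs", "shit", 0.90),
    ("Reservoir Dogs", "fuck", 1.43),
]


def to_scatter_points(rows):
    """Hypothetical helper: turn rows into points with x, y and a word label."""
    movies = []  # order of first appearance decides each movie's y position
    points = []
    for movie, word, minutes_in in rows:
        if movie not in movies:
            movies.append(movie)
        points.append({"x": minutes_in, "y": movies.index(movie), "word": word})
    return movies, points


movies, points = to_scatter_points(ROWS)
for point in points:
    print(point)
```

The `movies` list doubles as the y axis labels, and the points can be handed to whichever charting library ends up being used.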

