TMDB Movie Plot Keywords Visualization

A project that visualizes the most frequent plot keywords in movies from the TMDB 5000 Movie Dataset on Kaggle.

Deployed on GitHub Pages at https://anqi-lu.github.io/TMDB-keywords/.

The TMDB 5000 Movie Dataset contains information on about 5000 movies from The Movie Database (TMDB), a crowd-sourced movie information database. For each movie it includes genres, country, actors, directors, plot keywords, gross profit, and much more.

Prototyping

The first visualization I made with this dataset was a donut chart showing the number of movies by country, that is, which countries the movies come from. Because TMDB is built entirely by its users, the data may be heavily biased (most movies are from Western countries). I therefore decided not to do much with the “country” attribute and to focus instead on “plot_keywords”, “genres”, and “year”.

The questions about this dataset became:

Which plot keywords are the most common across all movies?
How do the most common keywords vary by genre?
Are there trends in keywords over time?

Based on these questions, I listed the following tasks:

Word cloud of the top 20 keywords for all genres
Genre filter: display the top 20 keywords for each genre
Line chart of count by year
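
The first two tasks both come down to counting keyword occurrences and keeping the most frequent ones. Below is a minimal sketch in plain JavaScript. It assumes each movie has already been parsed into an object with `genres` and `keywords` arrays; that record shape, the `topKeywords` helper, and the sample data are illustrative assumptions, not the project's actual code.

```javascript
// Hypothetical record shape: { year: 2009, genres: ["Action"], keywords: ["alien", "future"] }
// Returns the n most frequent keywords as [{ word, count }],
// optionally restricted to movies of a single genre.
function topKeywords(movies, n = 20, genre = null) {
  const counts = new Map();
  for (const movie of movies) {
    if (genre !== null && !movie.genres.includes(genre)) continue;
    // Count each keyword at most once per movie.
    for (const word of new Set(movie.keywords)) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return Array.from(counts, ([word, count]) => ({ word, count }))
    // Most frequent first; ties broken alphabetically for a stable order.
    .sort((a, b) => b.count - a.count || a.word.localeCompare(b.word))
    .slice(0, n);
}

// Tiny made-up sample, only to show the call shape.
const sampleMovies = [
  { year: 2009, genres: ["Action", "Sci-Fi"], keywords: ["alien", "future"] },
  { year: 2010, genres: ["Drama"], keywords: ["love", "future"] },
  { year: 2012, genres: ["Action"], keywords: ["alien", "hero"] },
];

console.log(topKeywords(sampleMovies, 2));           // all genres
console.log(topKeywords(sampleMovies, 2, "Action")); // one genre
```

The resulting `{ word, count }` pairs are the kind of input a word cloud layout could consume, with `count` driving the word size.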

Sketch

sketch

Sources of inspiration: 60 years of French first names and stream-graph explorer

The sketch introduced these additional tasks:

Select multiple keywords and assign a different color to each line in the line chart
Hover to select a year and show the corresponding tooltip
Zoom by brushing over a range of years

Interactions

Word cloud on the left
- The larger the word, the more frequently it appears
- Clicking a word draws its line on the line chart on the right
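
One way to map frequency to word size is a square-root scale, so that the size grows more slowly than the raw count and very common words do not crowd out the rest. The sketch below is only an assumption about how such a mapping could look; the `makeFontSizeScale` name, the pixel range, and the choice of a square-root scale are illustrative and not taken from the project.

```javascript
// Build a function that maps a keyword count to a font size in pixels.
// minPx and maxPx are hypothetical defaults chosen for illustration.
function makeFontSizeScale(counts, minPx = 12, maxPx = 60) {
  const lo = Math.sqrt(Math.min(...counts));
  const hi = Math.sqrt(Math.max(...counts));
  return (count) => {
    // If every word is equally frequent, use the middle of the range.
    if (hi === lo) return (minPx + maxPx) / 2;
    // t is 0 for the rarest word and 1 for the most frequent one.
    const t = (Math.sqrt(count) - lo) / (hi - lo);
    return minPx + t * (maxPx - minPx);
  };
}

const fontSize = makeFontSizeScale([4, 25, 100]);
console.log(fontSize(4), fontSize(25), fontSize(100)); // 12 30 60
```

In a D3 project, `d3.scaleSqrt()` with a matching domain and range could serve the same purpose.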

Genre checkboxes below the word cloud
- Only one genre can be selected at a time
- The word cloud updates to show the top 10 keywords from the selected genre

Line chart on the right
- x-axis: year, y-axis: frequency
- Draws one line for each selected word, so multiple words can be compared

Hovering over a year on the line chart shows a tooltip with the frequency of each selected word in that year.
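
The data behind the line chart and its tooltip can be sketched as a per-year count for each selected word, plus a lookup for the hovered year. This is a minimal sketch in plain JavaScript, assuming each movie record has a numeric `year` and a `keywords` array; the record shape and the helper names `keywordSeries` and `tooltipRows` are hypothetical, not the project's actual code.

```javascript
// Hypothetical record shape: { year: 2009, keywords: ["alien", "future"] }
// Returns one series per selected word: { word, values: [{ year, count }] },
// with values sorted by year so they can be drawn as a line.
function keywordSeries(movies, selectedWords) {
  return selectedWords.map((word) => {
    const byYear = new Map();
    for (const movie of movies) {
      if (movie.keywords.includes(word)) {
        byYear.set(movie.year, (byYear.get(movie.year) || 0) + 1);
      }
    }
    const values = Array.from(byYear, ([year, count]) => ({ year, count }))
      .sort((a, b) => a.year - b.year);
    return { word, values };
  });
}

// Tooltip content for a hovered year: the count of each selected word
// in that year, or 0 if the word does not appear in that year.
function tooltipRows(series, year) {
  return series.map(({ word, values }) => {
    const hit = values.find((v) => v.year === year);
    return { word, count: hit ? hit.count : 0 };
  });
}

// Tiny made-up sample, only to show the call shape.
const films = [
  { year: 2009, keywords: ["alien", "future"] },
  { year: 2009, keywords: ["alien"] },
  { year: 2012, keywords: ["future"] },
];
const lines = keywordSeries(films, ["alien", "future"]);
console.log(tooltipRows(lines, 2009));
```

Each entry of `lines` corresponds to one line on the chart, and `tooltipRows` returns one row per selected word for the hovered year.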

Future Work

Add animation for smooth transitions
Respond to window resizing

Attribution

This D3.js web project is forked from curran's template project.
