Humorloos / IE683

Write script for transforming Netflix Dataset to integrated schema #17

Closed Humorloos closed 2 years ago

Humorloos commented 2 years ago

Write a script that takes the Netflix dataset as input and outputs a pandas DataFrame containing the columns from the integrated schema plus an additional column named 'source' with the value 'Netflix'.

The columns and data types of the integrated schema are specified here: https://docs.google.com/spreadsheets/d/1yPAutI5u6qsJNLpSABiZa9zT9EYb1cai5we3Rna2Tpk/edit?usp=sharing

Deadline: 10-18
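
A minimal sketch of the transformation described above might look like the following. The column names in `COLUMN_MAPPING` and the function name `transform_netflix` are hypothetical placeholders; the actual column names and data types come from the linked spreadsheet and are not reproduced here.

```python
import pandas as pd

# Hypothetical mapping from raw Netflix column names to integrated-schema names.
# Replace these with the names defined in the schema spreadsheet.
COLUMN_MAPPING = {
    "Title": "title",
    "Release Date": "release_date",
}


def transform_netflix(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a DataFrame with the mapped schema columns plus a 'source' column."""
    # Keep only the mapped columns and rename them to the schema names.
    result = raw[list(COLUMN_MAPPING)].rename(columns=COLUMN_MAPPING)
    # Tag every row with its origin so rows stay traceable after integration.
    result["source"] = "Netflix"
    return result
```

Columns that are not listed in the mapping are dropped, so the output contains only schema columns and 'source'.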

Humorloos commented 2 years ago

TODO: Convert the country availability column to list[str]
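
One way this conversion could be done is sketched below. It assumes that the raw value is a single comma-separated string, which this thread does not confirm; the function name `parse_country_availability` is hypothetical.

```python
def parse_country_availability(value: object) -> list[str]:
    """Split a comma-separated country string into a list of trimmed names."""
    # Missing values (e.g. NaN, which is a float) become an empty list.
    if not isinstance(value, str):
        return []
    # Strip whitespace around each name and skip empty fragments.
    return [country.strip() for country in value.split(",") if country.strip()]
```

On a DataFrame, the function could be applied with `Series.apply`, for example `df["Country Availability"].apply(parse_country_availability)`, where the column name is again an assumption.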