To start getting our hands dirty, we need to collect and prepare the data. A good starting point would be to supplement the Netflix Prize dataset with IMDb metadata: descriptions, directors, actors, etc.
Let's use MovieLens instead of Netflix Prize. User metadata (e.g. occupation, sex and age) seems beneficial for both the models and the final report, but note that it ships with the MovieLens 1M release, not 25M, so the choice of release matters here.
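To make the "supplement with IMDb metadata" step concrete, here is a minimal sketch of the join, assuming the ratings come from MovieLens (whose `links.csv` maps `movieId` to `imdbId`) and the IMDb metadata has already been collected into a table keyed by `imdbId`. The `imdb` table, its column names, the helper `attach_imdb_metadata`, and all values below are made up for illustration; real files would be loaded with `pd.read_csv`.

```python
import pandas as pd

# Toy stand-ins for the real files; every value here is invented.
ratings = pd.DataFrame({
    "userId": [1, 1, 2],
    "movieId": [10, 20, 10],
    "rating": [4.0, 3.5, 5.0],
})

# Same shape as the movieId -> imdbId mapping in MovieLens' links.csv.
links = pd.DataFrame({
    "movieId": [10, 20],
    "imdbId": [111, 222],
})

# Hypothetical IMDb metadata table; the column names are assumptions.
imdb = pd.DataFrame({
    "imdbId": [111],
    "director": ["Director A"],
    "description": ["A placeholder plot summary."],
})


def attach_imdb_metadata(ratings, links, imdb):
    """Left-join ratings with IMDb metadata via the movieId -> imdbId mapping."""
    # validate= raises if a key is duplicated on the right-hand side,
    # which would otherwise silently multiply rating rows.
    with_ids = ratings.merge(links, on="movieId", how="left", validate="many_to_one")
    return with_ids.merge(imdb, on="imdbId", how="left", validate="many_to_one")


enriched = attach_imdb_metadata(ratings, links, imdb)
```

The left joins keep every rating even when a movie has no metadata yet, so the missing values show how much of the catalogue still needs to be collected.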