homomorfism / data-mining-project


Data Collecting #2

Open DanisAlukaev opened 2 years ago

DanisAlukaev commented 2 years ago

To start getting our hands dirty, we need to collect and prepare the data. A good starting point would be to supplement the Netflix Prize dataset with IMDb metadata: descriptions, directors, actors, etc.

DanisAlukaev commented 2 years ago

Let's use MovieLens 25M instead of Netflix Prize. User metadata (e.g. occupation, sex, and age) looks useful for both the models and the final report.
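
A rough sketch of how the join with IMDb metadata could look, assuming we rely on the `links.csv` file shipped with MovieLens (columns `movieId`, `imdbId`, `tmdbId`) to map each movie to its IMDb title id. The metadata table here is a hypothetical stand-in: the `director` column, the helper names `to_tconst` and `enrich_ratings`, and the sample rating rows are only for illustration, not the final pipeline.

```python
import csv
import io

# Tiny in-memory stand-ins for the real files (illustrative rows only).
LINKS_CSV = """movieId,imdbId,tmdbId
1,0114709,862
2,0113497,8844
"""
RATINGS_CSV = """userId,movieId,rating,timestamp
1,1,4.0,1000000000
1,2,3.5,1000000001
2,1,5.0,1000000002
"""
# Hypothetical metadata table keyed by IMDb title id (tconst).
IMDB_TSV = "tconst\tprimaryTitle\tdirector\ntt0114709\tToy Story\tJohn Lasseter\n"


def to_tconst(imdb_id):
    """Turn a numeric IMDb id from links.csv into the 'tt0114709' form."""
    return f"tt{int(imdb_id):07d}"


def enrich_ratings(ratings_file, links_file, imdb_file):
    """Left-join ratings with metadata; missing metadata becomes None."""
    movie_to_tconst = {
        row["movieId"]: to_tconst(row["imdbId"])
        for row in csv.DictReader(links_file)
    }
    metadata = {
        row["tconst"]: row
        for row in csv.DictReader(imdb_file, delimiter="\t")
    }
    enriched = []
    for row in csv.DictReader(ratings_file):
        tconst = movie_to_tconst.get(row["movieId"])
        meta = metadata.get(tconst, {})
        enriched.append({
            **row,
            "tconst": tconst,
            "title": meta.get("primaryTitle"),
            "director": meta.get("director"),
        })
    return enriched


rows = enrich_ratings(
    io.StringIO(RATINGS_CSV), io.StringIO(LINKS_CSV), io.StringIO(IMDB_TSV)
)
for r in rows:
    print(r["userId"], r["movieId"], r["rating"], r["tconst"], r["director"])
```

Ratings whose movie has no metadata row are kept with `None` fields rather than dropped, so we can count the coverage gap before deciding how to handle it. For the full 25M ratings we would likely want pandas or a database instead of plain dictionaries.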