Humorloos / IE683


Prepare Outline #1

Closed Humorloos closed 3 years ago

Humorloos commented 3 years ago

4 pages

  1. Brief description of use case
  2. Explanation how the datasets fulfill the requirements
    2.1. Schema and basic profile of each dataset
      • number of records per class
      • attributes with high percentage of missing values
    2.2. Integrated schema and overlap with input schemata
    2.3. Explanation why enough entities are likely contained in multiple datasets
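
For the basic profiles in 2.1, a minimal sketch of how the two numbers could be computed is shown here. It assumes each dataset has already been loaded as a list of dictionaries; the helper names, the `class` key, the set of values treated as missing, and the 0.5 threshold are all hypothetical and not taken from the project datasets.

```python
from collections import Counter

# Values treated as missing; this tuple is an assumption, adjust per dataset.
MISSING = (None, "", "NA", "N/A", "null")


def records_per_class(records, class_key):
    """Count how many records belong to each class (e.g. movie, actor)."""
    return Counter(r.get(class_key) for r in records)


def missing_share(records):
    """Return the fraction of missing values per attribute."""
    if not records:
        return {}
    attributes = {key for r in records for key in r}
    total = len(records)
    return {
        attr: sum(1 for r in records if r.get(attr) in MISSING) / total
        for attr in attributes
    }


def high_missing(records, threshold=0.5):
    """List attributes whose missing share exceeds the threshold, sorted."""
    shares = missing_share(records)
    return sorted(a for a, s in shares.items() if s > threshold)


# Hypothetical toy records, only for illustration.
sample = [
    {"class": "movie", "title": "A", "revenue": "", "director": "X"},
    {"class": "movie", "title": "B", "revenue": None, "director": "Y"},
    {"class": "actor", "title": "", "revenue": "", "director": ""},
    {"class": "movie", "title": "C", "revenue": "100", "director": ""},
]

print(records_per_class(sample, "class"))
print(high_missing(sample, threshold=0.5))
```

The results of such a profile could then be copied into the dataset profiles sheet, one row per dataset.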

To dos:

The to-dos about schemas and basic profiles of the different datasets all include writing our findings into the outline right away, so that once we have completed all the tasks in the list, the outline can be submitted.

Project Outline Overleaf Project

https://www.overleaf.com/7231126958ypyhswcppvbz

Dataset profiles google sheet:

https://docs.google.com/spreadsheets/d/1pCqePeyC0WAOadHi-ZeJj_InXw7fsYz0F4RTh-uoaLk/edit?usp=sharing

Integrated schema google sheet:

https://docs.google.com/spreadsheets/d/1yPAutI5u6qsJNLpSABiZa9zT9EYb1cai5we3Rna2Tpk/edit?usp=sharing

ashishrana160796 commented 3 years ago

Use case Description

Customer feedback plays a very important role in providing excellent service. However, customer feedback and inputs are susceptible to biases and measurement errors. Modern business processes are aware of these limitations, so processes that enhance customer experience are instead designed around data-driven insights. Netflix was one of the first online content streaming services to exploit such user-level data insights to provide good recommendations. In our project, we will transform data on user interactions with movies on Netflix and supplement it with additional movie-related information such as revenues, actors, directors, ratings, synopses, and genres. This integrated dataset will help provide Netflix users with a better movie recommendation experience. It will also help generate user-level insights for different movies and help de-confound the reasons behind successful movie streaming numbers.

For example, we can determine more accurately whether a user's decision to stream a movie on Netflix depends on movie revenue, Rotten Tomatoes or IMDb ratings, availability on other streaming platforms, etc. Higher movie revenue might mean that many users have already seen the movie in the theater, so it is probably not wise to make it available on the platform immediately by paying for expensive streaming rights. Also, with insights from movie ratings and the number of reviews on platforms like IMDb and Rotten Tomatoes, we can find highly regarded movies that might not have been widely viewed. These insights, at user-level or movie-level granularity, can help explain and improve both user experience and business revenue.

ashishrana160796 commented 3 years ago

Sharing the draw.io file link for future reference: https://stuma365-my.sharepoint.com/:u:/g/personal/asrana_students365_uni-mannheim_de/EdXM0LGCm-JPqI1qg1yaC2EBpbgZkTTVe7rxlAVR00324Q?e=R7x6Tt

Note: This will only work with the UMA Microsoft ID; in case of an issue, feel free to ping me.