jmattthew / HeirloomRottenTomatoes

Google Chrome extension that adds features to the Rotten Tomatoes website
MIT License
6 stars 0 forks

Apply higher weighting for critic agreement with user on contentious films #5

Open Mushin opened 6 years ago

Mushin commented 6 years ago

I was wondering how the critic similarity is calculated: is it simply a comparison of all shared ratings between me and the critic? Secondly, how might this comparison be affected by which films I rate (not just by the individual ratings)?

The assumption I'm making here is that my ratings for more contentious films (i.e. with greater spread in ratings) are more informative for selecting critics with similar tastes to mine. However, if I rate many more universally highly-rated films than contentious ones, the information may be 'drowned out'.

In this case, I thought it could be more useful to first measure the spread of ratings for a film, in order to identify the more contentious films. Then, a weighting could be applied, with critics' agreement with me on contentious films given more weight than agreement on the less contentious films.
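To make the weighting idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function names, the use of the population standard deviation as the measure of spread, the small `floor` that keeps consensus films from being ignored entirely, and the linear 1-to-5 star agreement measure. It is not how the extension currently works.

```python
from statistics import pstdev


def contention_weights(all_ratings, floor=0.1):
    """Map each film to a weight that grows with the spread of its ratings.

    all_ratings: dict of film -> list of star ratings (1-5) from all critics.
    floor: hypothetical minimum weight so consensus films still count a little.
    """
    return {film: floor + pstdev(ratings) for film, ratings in all_ratings.items()}


def weighted_similarity(user, critic, weights):
    """Weighted mean agreement over the films both the user and critic rated.

    Agreement per film is 1 for identical ratings and 0 for ratings
    4 stars apart (an assumed linear measure).
    """
    shared = user.keys() & critic.keys()
    if not shared:
        return None
    total_weight = sum(weights[f] for f in shared)
    agreement = sum(
        weights[f] * (1 - abs(user[f] - critic[f]) / 4) for f in shared
    )
    return agreement / total_weight
```

With an unweighted average, a critic who agrees with me only on a consensus film and a critic who agrees with me only on a divisive film would score the same; with these weights the second critic scores higher.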

This strategy could, in theory, allow a better critic match to be found, regardless of which films are rated.

jmattthew commented 6 years ago

Very interesting idea! I'd certainly like to know if the theory is correct that contentious films make for a better measure for similarity between two people than non-contentious films. That certainly seems probable. It'll take me some time to get to that as it would be somewhat complicated.

FYI, here's how it works now: each film that you and a critic have both rated gets a similarity score based on how far apart your two ratings are. This scoring gives higher weight to 1-star and 5-star ratings, so a 2-star vs. a 4-star is considered more similar than a 3-star vs. a 5-star, even though they're both 2 stars apart. Next, the app takes the average of the scores of those films and assigns that average score to the critic. This is repeated for every critic, until every critic has a similarity score with you.
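As a rough illustration of the scoring described above, here is a sketch. The exact formula the extension uses isn't given in this thread, so the 1.5x multiplier for 1-star and 5-star ratings, the normalisation to a 0-to-1 range, and all function names are assumptions chosen only to reproduce the stated behaviour.

```python
def extremity(rating):
    """Assumed multiplier: 1-star and 5-star ratings count 1.5x."""
    return 1.5 if rating in (1, 5) else 1.0


def film_similarity(a, b):
    """Similarity in [0, 1] for one film rated a and b stars (1-5).

    The gap is scaled up when either rating is an extreme one, so a gap
    involving a 1 or a 5 is penalised more than the same gap in mid-scale.
    """
    penalty = abs(a - b) * max(extremity(a), extremity(b))
    max_penalty = 4 * 1.5  # 1-star vs. 5-star
    return 1 - penalty / max_penalty


def critic_similarity(user, critic):
    """Average per-film similarity over the films both have rated."""
    shared = user.keys() & critic.keys()
    if not shared:
        return None
    return sum(film_similarity(user[f], critic[f]) for f in shared) / len(shared)
```

Under these assumptions, `film_similarity(2, 4)` comes out higher than `film_similarity(3, 5)`, matching the example in the comment.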

Note that the similarity scores of any two critics are based on partially non-overlapping sets of movies. You and critic-A may have rated 10 of the same films, while you and critic-B may have rated 20 of the same films. Meanwhile, critic-A and critic-B may have only rated 5 of the same films. Critics are ranked based on a Bayesian sort of their similarity to you and the count of films that you've both rated.
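One common way to do this kind of sort is a Bayesian average, which pulls a critic's score toward the overall mean when only a few films are shared. The sketch below assumes that approach; the thread doesn't say which formula the extension uses, and `prior_weight` and the function names are hypothetical.

```python
def bayesian_score(mean_similarity, n_shared, prior_mean, prior_weight=5):
    """Shrink a critic's mean similarity toward prior_mean.

    prior_weight acts like a number of "phantom" films scored at the
    prior mean, so critics with few shared films move less far from it.
    """
    return (prior_weight * prior_mean + n_shared * mean_similarity) / (
        prior_weight + n_shared
    )


def rank_critics(critics, prior_weight=5):
    """Rank critic names, best match first.

    critics: dict of name -> (mean_similarity, n_shared), with at least
    one shared film in total.
    """
    total_films = sum(n for _, n in critics.values())
    # Prior: similarity averaged over every shared film across all critics.
    prior_mean = sum(s * n for s, n in critics.values()) / total_films
    return sorted(
        critics,
        key=lambda name: bayesian_score(*critics[name], prior_mean, prior_weight),
        reverse=True,
    )
```

For example, a critic with a 0.95 average over 2 shared films would rank below one with a 0.85 average over 40 shared films, because the first score has much less evidence behind it.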

Mushin commented 6 years ago

I suppose it would be difficult to assess how well different methods for similarity search are working, apart from a subjective assessment. That is, unless you did cross-validation or similar to test the predictions within the dataset.
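A leave-one-out version of that test might look like the sketch below: hide one of my ratings, find the most similar critics using my remaining ratings, predict the hidden rating from theirs, and measure the average error. The names, the choice of `k` nearest critics, and the plain `simple_similarity` stand-in are all assumptions; any other similarity function with the same signature could be passed in and compared.

```python
def simple_similarity(user, critic):
    """Stand-in similarity: mean linear agreement over shared films."""
    shared = user.keys() & critic.keys()
    if not shared:
        return None
    return sum(1 - abs(user[f] - critic[f]) / 4 for f in shared) / len(shared)


def predict(film, user, critics, similarity, k=3):
    """Predict the user's rating for film without using that rating."""
    train = {f: r for f, r in user.items() if f != film}
    scored = []
    for ratings in critics.values():
        if film not in ratings:
            continue
        sim = similarity(train, ratings)
        if sim is not None:
            scored.append((sim, ratings[film]))
    if not scored:
        return None
    # Average the held-out film's rating among the k most similar critics.
    top = sorted(scored, reverse=True)[:k]
    return sum(rating for _, rating in top) / len(top)


def leave_one_out_mae(user, critics, similarity, k=3):
    """Mean absolute error of predictions, holding out one film at a time."""
    errors = []
    for film, actual in user.items():
        guess = predict(film, user, critics, similarity, k)
        if guess is not None:
            errors.append(abs(guess - actual))
    return sum(errors) / len(errors) if errors else None
```

Running `leave_one_out_mae` once per candidate similarity method on the same ratings would give a number to compare instead of a purely subjective impression; a lower error suggests a better method.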