chrismbryant / amazon-confidence-interval

A browser extension which adds Bayesian visualizations to Amazon ratings.
MIT License
Run computations remotely? #3

Closed: musicin3d closed this issue 4 years ago

musicin3d commented 4 years ago

If the statistical calculations are computationally expensive, we could run them on a server. We could even cache the results for frequently accessed products.

This would also let us do the scraping server-side, which means we could instantly deploy updates to the scraping process. I've been working on a scraping library backed by crawler, and I've been refining its API as I work on two other projects that depend on it. It's nearing its first release.

I'd be willing to host the service during alpha. If the costs aren't too bad, I wouldn't mind donating it from then on.

(Out of scope) Bonus: If we want to get fancy, we could fall back to a cheaper calculation directly in the extension if the server is unavailable. We could also have a configuration option that disables the server-side lookup and relies solely on in-extension calculations.
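
A minimal sketch of that fallback idea, written in Python only to show the control flow (the extension itself would use JavaScript). The names `get_interval`, `fetch_remote`, `compute_local`, and `use_server` are hypothetical and do not come from the project; the two callables stand in for a server request and an in-extension calculation.

```python
def get_interval(product_id, fetch_remote, compute_local, use_server=True):
    """Return (result, source), preferring the server when it is enabled.

    fetch_remote and compute_local are hypothetical callables that take a
    product id and return whatever statistics the UI needs.
    """
    if use_server:
        try:
            return fetch_remote(product_id), "server"
        except OSError:
            # Covers ConnectionError and TimeoutError: the server is
            # unavailable, so fall through to the cheaper local path.
            pass
    return compute_local(product_id), "local"
```

Setting `use_server=False` corresponds to the configuration option that skips the server lookup entirely.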

chrismbryant commented 4 years ago

My initial thought is just to compute the "percent positive rating" and its 95% binomial confidence interval on page load by scraping the number of ratings for the product and the percent of ratings that were 4 or 5 stars. This is computationally quite cheap, so I don't see any reason it couldn't be run in-browser (although if we decide to do something really fancy, we might have to move in the direction you described).
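
To illustrate how cheap this is, here is a rough sketch of the calculation. It assumes a Wilson score interval, which is one common choice of binomial confidence interval; the comment above does not say which interval was intended. The function name and parameters are hypothetical, and Python is used only to show the arithmetic.

```python
import math

def percent_positive_interval(num_ratings, positive_fraction, z=1.96):
    """Return (p_hat, low, high) for the share of 4- and 5-star ratings.

    Uses the Wilson score interval (an assumption, not necessarily the
    project's method); z=1.96 corresponds to roughly 95% confidence.
    """
    if num_ratings <= 0:
        raise ValueError("need at least one rating")
    n = num_ratings
    p = positive_fraction
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, center - half_width), min(1.0, center + half_width)

# Example: 100 ratings, 90% of them 4 or 5 stars.
p_hat, low, high = percent_positive_interval(100, 0.90)
```

For 100 ratings at 90% positive, this gives an interval of roughly 0.83 to 0.94. It is a handful of floating-point operations per product, which supports running it in-browser on page load.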

After the statistics are computed, we'd then display a visualization next to the star rating of the percent positive rating and its confidence interval (or maybe even the full probability distribution?). I don't know what that UI would look like or what would be the most useful/fun to see.
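
If the full distribution were shown, one option (an assumption, not something decided in this thread) is the Beta posterior that results from a uniform prior on the true positive rate. The sketch below evaluates that density on a grid using only the standard library; the function names and the grid size are hypothetical.

```python
import math

def beta_posterior_pdf(x, positive, total):
    """Density at x of Beta(positive + 1, total - positive + 1).

    This is the posterior for the positive rate under a uniform prior.
    Only the open interval 0 < x < 1 is supported.
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    a = positive + 1
    b = total - positive + 1
    # Work in log space so large rating counts do not overflow.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def posterior_curve(positive, total, steps=1000):
    """Return (xs, densities) sampled at the midpoints of `steps` bins."""
    xs = [(i + 0.5) / steps for i in range(steps)]
    return xs, [beta_posterior_pdf(x, positive, total) for x in xs]
```

The resulting points could feed a small curve drawn next to the star rating; a narrow peak means many ratings, a wide one means few.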

This isn't quite what Grant had in his 3b1b video (with the +1 positive vote, +1 negative vote), but it's where my mind went.
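
For comparison, the "+1 positive, +1 negative" approach is Laplace's rule of succession: pretend there is one extra positive and one extra negative rating, then take the fraction of positives. A small sketch, with a hypothetical function name and made-up rating counts:

```python
def rule_of_succession(positive, total):
    """Estimate the positive rate as (positive + 1) / (total + 2)."""
    if total < 0 or not 0 <= positive <= total:
        raise ValueError("need 0 <= positive <= total")
    return (positive + 1) / (total + 2)

# Illustrative counts: a perfect score from few ratings is pulled
# toward 50% more strongly than a slightly lower score from many.
few = rule_of_succession(10, 10)      # 11 / 12
some = rule_of_succession(48, 50)     # 49 / 52
many = rule_of_succession(186, 200)   # 187 / 202
```

This value equals the mean of a Beta posterior under a uniform prior, so it gives a single point estimate rather than an interval.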

musicin3d commented 4 years ago

Sounds like a good starting point. I think most people are just going to want to see the confidence score and the percent positive. Once we get this working we can play around with advanced options.

musicin3d commented 4 years ago

I'm closing this, because it's not needed at this point.