Open chessai opened 5 years ago
If it's too slow to recompute the PageRanks every time someone uploads a package, you could implement Fast Incremental and Personalized PageRank instead.
Users could see a number roughly reflecting the community's trust in the package.
While I agree that the 3-star rating isn't a sufficient metric (and curiously there have already been cases of politically motivated downvoting on Hackage, but that's just something you have to live with), I think that claiming PageRank to be a metric of trust is a very misleading premise. While you didn't specify exactly how you'd apply PageRank to the Cabal metadata, PageRank is merely a metric of popularity, and certainly not of "trust". It only reflects trust for dependencies that maintainers voluntarily depend on, not for those that are forced upon you, either due to a lack of alternatives or due to other choices you made which lock you into that dependency. Then there are also the effects of cargo-culting, where people are just not aware of the alternatives, and a PageRank metric might even reinforce this vicious cycle by making people less confident about walking less-travelled roads. And fwiw, I can think of a couple of packages which I certainly wouldn't classify as trustworthy and yet they appear in a majority of install-plans across Hackage.
That being said, I welcome adding a PageRank-like metric as an additional number to look at or that you can sort by, but I don't consider it a replacement for the manual user rating metric.
Pagerank in this case is being described as a sort of fancy-weighted way of summing over transitive reverse dependency counts, right? So the first step would be for somebody to jump in and help finish the long-delayed reverse-dependency code that now exists at https://github.com/haskell/hackage-server/pull/723
This code is entirely feature-complete, but it still appears to consume excessive memory at full hackagedb scale. With that in place, it would be straightforward in code (but perhaps interesting mathematically) to augment the revdep information further with incremental PageRank data.
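To make "transitive reverse dependency counts" concrete, here is a minimal sketch in Python. It is not the code from the linked PR; the function name `transitive_revdep_counts` and the toy package names are made up for illustration, and it assumes the whole dependency graph fits in memory as a plain dict.

```python
from collections import defaultdict

def transitive_revdep_counts(deps):
    """Count, for every package, how many packages depend on it
    directly or indirectly.

    `deps` maps a package name to the list of its direct dependencies
    (hypothetical toy input, not the hackage-server data model).
    """
    # Invert the edges: revdeps[d] is the set of packages that depend on d.
    revdeps = defaultdict(set)
    packages = set(deps)
    for pkg, direct in deps.items():
        for d in direct:
            revdeps[d].add(pkg)
            packages.add(d)

    counts = {}
    for pkg in packages:
        # Depth-first walk over reverse edges, collecting every dependent.
        seen = set()
        stack = [pkg]
        while stack:
            current = stack.pop()
            for dependent in revdeps[current]:
                if dependent not in seen:
                    seen.add(dependent)
                    stack.append(dependent)
        seen.discard(pkg)  # ignore the package itself if a cycle leads back
        counts[pkg] = len(seen)
    return counts

# Made-up package names and edges, purely for illustration.
toy_deps = {
    "app": ["web", "json"],
    "web": ["core", "json"],
    "json": ["core"],
    "core": [],
}
revdep_counts = transitive_revdep_counts(toy_deps)
```

This walks the graph once per package, which is fine for a toy example but is exactly the kind of cost that becomes a problem at full index scale.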
But that said, given the structure of dep-graphs in Haskell, I'd be curious if pagerank actually provided a value-add over revdep counts themselves. However, such a question is best answered empirically, by actually implementing things and seeing what happens :-)
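As a starting point for that kind of experiment, here is a rough sketch of plain PageRank by power iteration over a dependency graph, in Python. Everything in it is an assumption made for illustration: the `pagerank` function, the damping factor of 0.85, the fixed iteration count, the choice to spread the rank of packages without dependencies evenly over all packages, and the toy package names. It is the batch algorithm, not the incremental variant mentioned in this thread.

```python
def pagerank(deps, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a dependency graph.

    `deps` maps a package name to the list of its direct dependencies
    (assumed to contain no duplicates). An edge "p depends on d" is
    treated as p passing rank to d.
    """
    packages = set(deps)
    for direct in deps.values():
        packages.update(direct)
    packages = sorted(packages)
    n = len(packages)
    if n == 0:
        return {}

    rank = {p: 1.0 / n for p in packages}
    for _ in range(iterations):
        # Every package receives the same "teleport" share.
        new_rank = {p: (1.0 - damping) / n for p in packages}
        for p in packages:
            direct = deps.get(p, [])
            if direct:
                # Split this package's rank evenly among its dependencies.
                share = damping * rank[p] / len(direct)
                for d in direct:
                    new_rank[d] += share
            else:
                # No dependencies: spread the rank evenly over all
                # packages so that the total stays at 1.
                share = damping * rank[p] / n
                for q in packages:
                    new_rank[q] += share
        rank = new_rank
    return rank

# Made-up package names and edges, purely for illustration.
toy_deps = {
    "app": ["web", "json"],
    "web": ["core", "json"],
    "json": ["core"],
    "core": [],
}
ranks = pagerank(toy_deps)
ordering = sorted(ranks, key=ranks.get, reverse=True)
```

On a graph this small the ordering by PageRank is the same as the ordering by transitive revdep count, so a toy example cannot settle the question; the two could only be told apart on the real package index.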
The 3-star rating of packages has historically not been very useful. Even the most well-known packages receive only a small number of ratings from users. For example:
Implementing a rating system based on PageRank could help in the following ways:
@taktoa and I discussed this and we both think this is a better approach than rule of succession/Bayesian averaging/anything that relies on explicit user voting. @taktoa please comment if you have anything to add.