alexfernandez / package-quality

Measurements of quality for packages, especially for npm

The rating paradigm #28

Open CGastrell opened 9 years ago

CGastrell commented 9 years ago

Hi all.

I love this idea, how clean and minimalist it is.

I do worry about how new packages are treated by the calculations. At the very beginning of a package's life it can't (or shouldn't) be treated as a low-quality package.

For the sake of clarity, let's assume a package has just been published. No matter how good or well coded it is, it will be a zero-star package, since most of the stats the calculation considers start at zero:

var DOWNLOADS = 0;
var VERSIONS = 1;
var TOTAL_ISSUES = 0; // maybe the package is SO good it doesn't have any issues (just maybe)
var OPEN_ISSUES = 0;  // idem above
var LONG_ISSUES = 0;  // because above

var v = 1 - 1 / VERSIONS;
// 1 / 1 = 1; 1 - 1 = 0; zero rating so far
var d = 1 - 1 / DOWNLOADS;
// in JavaScript 1 / 0 = Infinity, so d = 1 - Infinity = -Infinity; presumably clamped, the rating stays at zero

var rt = 1 - 1 / TOTAL_ISSUES;
// 1 / 0 = Infinity again; rt = -Infinity; zero rating remains
var ro = 1 - 1 / OPEN_ISSUES;
// same again: -Infinity; zero rating remains
var rlo = 1 - 1 / LONG_ISSUES;
// same again: -Infinity; zero rating remains
var r = (rt + ro + rlo) / 3; // -Infinity, i.e. zero again

So, how does a package relate to its quality under these calculations? Well, it doesn't.

Quality shouldn't be confused with popularity or maintenance. These stats only check for "how much" (a quantitative measure), not for "how good" (qualitative).

I think more work needs to go into the ratings and how they are calculated. Things like downloads shouldn't count toward quality unless we can find a relation to other packages that depend on the one being measured. Documentation is by far one of the most important aspects of a package's quality (besides the code itself), and it isn't even mentioned here.

Let's face it: even if someone wrote the perfect package, it would receive a zero-star rating simply because it has no downloads yet, and (rare as that may be) perhaps no issues either.
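A rough sketch of what I mean (the "+ 2" pseudo-count is just something I made up): smooth each factor so a brand-new package starts from a neutral baseline instead of dividing by zero.

// Sketch: smoothed variant of the factors above; the pseudo-count is invented.
function smoothed(count) {
  // count = 0 yields 0.5 (neutral); large counts approach 1
  return 1 - 1 / (count + 2);
}

var d = smoothed(DOWNLOADS);    // 0 downloads -> 0.5, not -Infinity
var v = smoothed(VERSIONS - 1); // a first release scores the neutral baseline, not zero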

I'd love to see this tool go further, into a deeper analysis of the package, and publish an improved version of the ratings.

CGastrell commented 9 years ago

Maybe use git-hours?

alexfernandez commented 9 years ago

There is a huge difference between quality and quality assurance; the industry has largely shifted towards the latter. Keep in mind that package-quality tries to measure quality assurance and not quality.

As an example, imagine one package that has no tests; the quality of the code might be outstanding, but would you trust it? Now imagine two packages, identical except that one has thorough test coverage and the other doesn't; which one would you trust more?

The measurements used in package-quality gauge quality assurance in the same way: a package that is excellent in every respect but has not been battle-tested will probably not be as trustworthy as one that may not be so great but is widely used in the field and keeps its issues in order.

That said, if you know about a robust way to measure quality per se, we accept pull requests :)

anywhichway commented 9 years ago

I agree on the newness issue. Additionally, the current algorithm can be spoofed by artificially incrementing version numbers. Perhaps check whether the ratio of versions to age falls within a normal range?
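Something like this, as a rough sketch (the thresholds are completely invented):

// Sketch of a release-cadence sanity check; thresholds invented for illustration.
// Penalize packages whose version count looks implausible for their age.
function versionPenalty(versionCount, ageInDays) {
  var perMonth = versionCount / (ageInDays / 30);
  if (perMonth > 15) return 0.5; // suspiciously fast: possible version spoofing
  return 1; // normal cadence, no penalty
}

// e.g. 40 versions published in the first week would be flagged:
// versionPenalty(40, 7) -> 0.5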

guitarmanvt commented 8 years ago

As a new package author, I was shocked to find my package rated as "crap", with not one single bit of detail provided about why it got such a low score. My gut-level reaction is to consider package-quality "crap" and to bad-mouth it on Twitter. (Don't worry, I won't.) I hope it's not hard to understand that creating bad blood with new package authors is not helpful for your package quality measurement tool.

My constructive, friendly criticism is this:

  1. Remove the word "crap" from the GitHub README, or don't rate everything <4 as "crap". A little diplomacy here would go a long way.
  2. More importantly, provide concrete, detailed feedback on packagequality.com, so that package authors know why their packages score the way they do (see the sketch after this list for one possible shape).
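For point 2, even a rough per-factor breakdown would help. A hypothetical response shape (field names are mine, not the actual API):

{
  "name": "my-package",
  "quality": 0.7,
  "factors": {
    "versions":  { "score": 0.9, "detail": "12 versions published" },
    "downloads": { "score": 0.4, "detail": "310 downloads in the last year" },
    "issues":    { "score": 0.8, "detail": "2 of 15 issues open longer than a year" }
  }
}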

I am a huge fan of automated tests and quality measurements. I so much want package-quality to succeed. Thanks for listening.

alexfernandez commented 8 years ago

Hi @guitarmanvt, I have added a small clarification in commit 9475551. The idea to provide detailed feedback is great, but right now the internals of the algorithm are not exposed. PRs welcome! :)

guitarmanvt commented 8 years ago

Doh! I didn't realize that the ratings image was from XKCD. In that case, who am I to complain? ;)

Thanks for the clarification, @alexfernandez . It does help.

anywhichway commented 8 years ago

I think the perspective on quality would be substantially improved by integrating data from CI platforms like Travis or code-quality platforms like Codacy or CodeClimate.

The degree to which low-use packages are penalized, coupled with a graphic that says anything below 4 stars is crap, has led me to choose not to add badges for this system, which would otherwise help promote it. I have had several packages hit the top 10% on npm for brief periods, and they continue to see slow but steady growth in use. It seems to me that stars speak for themselves, and any star system that labels everything below the top 20 percent of stars as crap is itself a crap star system, particularly if it does not show what percentage of rated packages falls in each star bracket; at least then viewers could judge for themselves.

Finally, if you are tracking this data over time, you could add a star trending velocity.
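Showing the distribution would be cheap to compute; a rough sketch, assuming the system already has an array of computed 0-5 ratings:

// Sketch: percentage of packages per star bracket (assumes `ratings`
// is an array of 0-5 scores the system has already computed).
function bracketPercentages(ratings) {
  var buckets = [0, 0, 0, 0, 0, 0];
  ratings.forEach(function (r) {
    buckets[Math.min(5, Math.floor(r))] += 1;
  });
  return buckets.map(function (count) {
    return (100 * count / ratings.length).toFixed(1) + '%';
  });
}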

anywhichway commented 8 years ago

Here's a challenge ... submit the package-quality assessment algorithms to npm as a package themselves and see what the rating is ... probably crap. But I don't think what you are trying to do is crap; in fact, I think it is a good idea. That said, as long as the front end labels lots of things that aren't crap as crap, it is not something I would contribute to, other than making suggestions in this issue log.

alexfernandez commented 8 years ago

Hi @anywhichway, don't be offended by the "crap" thing; it's just a webcomic by xkcd which I thought illustrated the point about star systems. As you say, if the valuations of our system are not good then our work is not worth much. That said, any system is going to have its problems.

Your suggestion of showing the percentage of packages in each bracket is a good one. We accept PRs of all kinds.

anywhichway commented 8 years ago

Understood, Alex; however, I think it is probably damaging the growth of your tool.