Enable experimentation - Githubissues

jamestalmage commented 8 years ago

I think it would be very interesting to develop a system where it was easy to run some "what if..." tweaks to the algorithm and see how they impacted results.

Also, develop some reporting tools that allow some analysis of how well the algorithm is working. Are there modules that rank high on popularity, personality, and quality, but suffer on the maintenance score? That might help us discover problems with the maintenance scoring algorithm.
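As a rough illustration of the kind of report meant here, the sketch below filters a list of scored modules for ones that do well on everything except maintenance. The record layout, the field names, the sample values, and the 0.8 / 0.4 thresholds are all assumptions made for the example, not the analyzer's actual data model.

```python
# Hypothetical score records; names and values are invented for illustration.
modules = [
    {"name": "alpha", "popularity": 0.91, "personality": 0.85, "quality": 0.88, "maintenance": 0.22},
    {"name": "beta", "popularity": 0.90, "personality": 0.90, "quality": 0.90, "maintenance": 0.90},
    {"name": "gamma", "popularity": 0.30, "personality": 0.85, "quality": 0.90, "maintenance": 0.20},
]


def maintenance_outliers(modules, high=0.8, low=0.4):
    """Return modules scoring >= high on every other dimension but < low on maintenance."""
    others = ("popularity", "personality", "quality")
    return [
        m for m in modules
        if all(m[k] >= high for k in others) and m["maintenance"] < low
    ]


for m in maintenance_outliers(modules):
    print(m["name"], m["maintenance"])  # prints: alpha 0.22
```

Each module this returns would still need a manual look to decide whether the low maintenance score is deserved.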

satazor commented 8 years ago

> I think it would be very interesting to develop a system where it was easy to run some "what if..." tweaks to the algorithm and see how they impacted results.

That's a good idea. How do you suggest building such a system? Acceptance tests or something more complex?

> Also, develop some reporting tools that allow some analysis of how well the algorithm is working. Are there modules that rank high on popularity, personality, and quality, but suffer on the maintenance score? That might help us discover problems with the maintenance scoring algorithm.

We could do that in Kibana. I have it set up for logging, but we can experiment with it to draw some graphics based on the data.

jamestalmage commented 8 years ago

> How do you suggest building such a system? Acceptance tests or something more complex?

I really don't know. I just see issues like #29 as being really hard to quantify (and the fixes laden with potential for unintended consequences).

> We could do that in Kibana. I have it set up for logging, but we can experiment with it to draw some graphics based on the data.

Graphics are good. It would be nice to dive into that data as well. In my example above (maintenance has a low score, but all other scores are high), you've got a few potential causes:

The module has been abandoned (low maintenance score is deserved)
The module is finished / locked but the maintainer is weary of closing the same feature request issue over and over - so now just leaves those open for a long time (low maintenance score is undeserved).
There is something wrong with the maintenance scoring algorithm (e.g. the project is so large and popular, it's just not possible to get to some issues for a very long time).

It would be nice to create a page that showed exactly how a module was scored (what helped, what hurt), and once you think you have found a pattern where modules are being mis-ranked, search for other modules that are similarly impacted. This sort of tool could help justify changes to the algorithm.
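A score breakdown like that could start from something as small as the sketch below. It assumes the final score is a plain weighted sum of four per-dimension scores in the 0..1 range; the equal weights and the `explain_score` helper are hypothetical and only there to show the "what helped, what hurt" idea, not how the analyzer actually combines scores.

```python
# Hypothetical weights; the real factors and weights are not stated in this thread.
WEIGHTS = {"quality": 0.25, "popularity": 0.25, "personality": 0.25, "maintenance": 0.25}


def explain_score(scores, weights=WEIGHTS):
    """Return (total, rows); each row is (factor, score, contribution, lost).

    'lost' is how much more the factor would have added with a perfect 1.0,
    so the rows that hurt the module most sort to the top.
    """
    rows = []
    for factor, weight in weights.items():
        contribution = weight * scores[factor]
        lost = weight - contribution
        rows.append((factor, scores[factor], contribution, lost))
    rows.sort(key=lambda row: row[3], reverse=True)  # biggest loss first
    total = sum(row[2] for row in rows)
    return total, rows


# Invented example: strong everywhere except maintenance.
total, rows = explain_score(
    {"quality": 0.9, "popularity": 0.8, "personality": 0.85, "maintenance": 0.2}
)
print(f"total score: {total:.4f}")
for factor, score, contribution, lost in rows:
    print(f"{factor:12s} score={score:.2f} added={contribution:.4f} lost={lost:.4f}")
```

In this invented example maintenance comes out as the first row, which is the pattern one would then search for across other modules.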

Similarly, it would be interesting to compare how a pending algorithm change would affect current rankings. Maybe create a "winners" and "losers" list.
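A "winners" and "losers" list might be produced along these lines. This is only a sketch: the two scoring functions, the sample modules, and the helper names are made up, and the candidate simply halves the weight of maintenance to stand in for "a pending algorithm change".

```python
# Invented sample data for illustration.
modules = [
    {"name": "a", "quality": 0.9, "popularity": 0.9, "maintenance": 0.2},
    {"name": "b", "quality": 0.7, "popularity": 0.7, "maintenance": 0.9},
    {"name": "c", "quality": 0.6, "popularity": 0.6, "maintenance": 0.6},
]


def current_score(m):
    # Hypothetical current algorithm: plain average of the three scores.
    return (m["quality"] + m["popularity"] + m["maintenance"]) / 3


def candidate_score(m):
    # Hypothetical tweak: maintenance counts half as much.
    return (m["quality"] + m["popularity"] + 0.5 * m["maintenance"]) / 2.5


def rank(modules, score_fn):
    """Map module name -> 1-based rank under score_fn (highest score first)."""
    ordered = sorted(modules, key=score_fn, reverse=True)
    return {m["name"]: i for i, m in enumerate(ordered, start=1)}


def rank_changes(modules, current_fn, candidate_fn):
    """Return (winners, losers) as lists of (name, places moved), biggest movers first."""
    before = rank(modules, current_fn)
    after = rank(modules, candidate_fn)
    delta = {name: before[name] - after[name] for name in before}  # positive = moved up
    winners = sorted(((n, d) for n, d in delta.items() if d > 0), key=lambda x: -x[1])
    losers = sorted(((n, d) for n, d in delta.items() if d < 0), key=lambda x: x[1])
    return winners, losers


winners, losers = rank_changes(modules, current_score, candidate_score)
print("winners:", winners)  # winners: [('a', 1)]
print("losers:", losers)    # losers: [('b', -1)]
```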

Probably the best thing to do would be to make it data-driven: see which modules users click, and what rank each one had on the page at the time. Then compare algorithm tweaks to see if they would, on average, move the desired results of historical searches higher up the page.
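To make that comparison concrete, here is a minimal sketch that replays a click log against two scoring functions and reports the average position the clicked module would have had under each. The log format, the scoring functions, and the sample data are hypothetical, and it simplifies heavily by re-ordering only the results that were shown, ignoring text relevance.

```python
def current_score(m):
    # Hypothetical current algorithm: plain average of the three scores.
    return (m["quality"] + m["popularity"] + m["maintenance"]) / 3


def candidate_score(m):
    # Hypothetical tweak: maintenance counts half as much.
    return (m["quality"] + m["popularity"] + 0.5 * m["maintenance"]) / 2.5


A = {"name": "a", "quality": 0.9, "popularity": 0.9, "maintenance": 0.2}
B = {"name": "b", "quality": 0.7, "popularity": 0.7, "maintenance": 0.9}
C = {"name": "c", "quality": 0.6, "popularity": 0.6, "maintenance": 0.6}

# Invented log: the modules shown for one search and the one the user clicked.
click_log = [
    {"results": [A, B], "clicked": "a"},
    {"results": [B, C], "clicked": "b"},
]


def mean_clicked_rank(click_log, score_fn):
    """Average 1-based position of the clicked module when results are ordered by score_fn."""
    total = 0
    for entry in click_log:
        ordered = sorted(entry["results"], key=score_fn, reverse=True)
        names = [m["name"] for m in ordered]
        total += names.index(entry["clicked"]) + 1
    return total / len(click_log)


print(mean_clicked_rank(click_log, current_score))    # 1.5
print(mean_clicked_rank(click_log, candidate_score))  # 1.0
```

Lower is better here: a tweak that reduces the mean clicked rank would, on this historical data, have put the modules people chose closer to the top.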

npms-io / npms-analyzer

Enable experimentation #53