zendframework / modules.zendframework.com

Home for ZF2 module distribution
BSD 3-Clause "New" or "Revised" License

Scoring / ranking algorithm #104

Open EvanDotPro opened 11 years ago

EvanDotPro commented 11 years ago

Once we get reviews in place this will be an even lower priority, but it would still be a fun feature to implement. The idea comes from http://knpbundles.com/, though I think we can clean up the implementation.

We'll need some collaboration on how the algorithm could work, but I think instead of a "point" system like KNPBundles.com, we should look at more of a PageRank type result, where the score is on a scale of 0 - 10.

See the FAQ on KNPBundles.com's scoring algorithm. I think they have a very good start on the various metrics that are useful in discovering high quality modules.

Obviously if this were implemented, it would be an optional way of sorting modules, and mostly for fun -- we don't want this to turn into something political or create any unfairness in the modules community. Everyone should really have an equal chance to have their module discovered.

We could also probably start up a conversation with the KNPBundles folks and ask them for any ideas or tips they have for something like this, as I'm sure they've learned a thing or two creating their own implementation!

Hounddog commented 11 years ago

I have done some research, and the guys from KNPBundles have published their ranking algorithm. We should probably just take some of the ideas from there. http://knpbundles.com/about/faq-scoring

EvanDotPro commented 10 years ago

I'm all for not re-inventing the wheel. :)

dzhibas commented 10 years ago

Hi,

Here is the output of a very naive score: https://gist.github.com/dzhibas/8801726

It is based only on these features: normalized(forks) + normalized(subscribers) + normalized(watchers) + 0.1 * 1/(1 + log(1 + days_last_pushed))

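Each feature is normalized to a 0 - 1 scale, so the three normalized terms contribute at most 1 each and the recency term at most 0.1. The maximum score is therefore 3.1 and the minimum approaches 0, so it can easily be rescaled to a score from 1 to 10.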
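To make the formula concrete, here is a minimal Python sketch. It is not the prototype from the gist. It assumes min-max normalization across all modules and a natural logarithm, neither of which is stated above, and the helper names (`normalize`, `recency`, `naive_scores`, `rescale`) are made up for illustration.

```python
import math


def normalize(values):
    """Min-max scale a list of numbers to 0..1 (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All modules have the same value, so the feature carries no signal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def recency(days_last_pushed):
    """Recency term: 0.1 for a push today, decaying as the last push ages."""
    return 0.1 * 1.0 / (1.0 + math.log(1.0 + days_last_pushed))


def naive_scores(modules):
    """Return one raw score per module, in the range (0, 3.1]."""
    forks = normalize([m["forks"] for m in modules])
    subscribers = normalize([m["subscribers"] for m in modules])
    watchers = normalize([m["watchers"] for m in modules])
    return [
        f + s + w + recency(m["days_last_pushed"])
        for f, s, w, m in zip(forks, subscribers, watchers, modules)
    ]


def rescale(score, max_score=3.1, low=1.0, high=10.0):
    """Map a raw score from 0..max_score linearly onto low..high."""
    return low + (high - low) * score / max_score
```

A module that leads on every feature and was pushed today gets the raw maximum of 3.1, which `rescale` maps to 10. A module with the lowest value on every feature and an old last push lands just above 1.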

Prototype script (just to see how it works; it should later be converted to PHP): https://gist.github.com/dzhibas/8772697

Conclusions from this test:

  1. We need more features (such as Travis, README, package, maybe repo size) that contribute to the score.
  2. We need weights on the features, so that some of them contribute more to the final score.
  3. Recalculating the scores should be a cron job that runs at some frequency.
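As a rough illustration of points 1 and 2, a weighted score could look like the Python sketch below. The feature names (`has_travis`, `has_readme`) and all weights are hypothetical placeholders, not values anyone has agreed on. Each feature is assumed to be already scaled to 0 - 1.

```python
# Hypothetical weights: larger values make a feature count more.
WEIGHTS = {
    "forks": 2.0,
    "watchers": 1.0,
    "has_travis": 0.5,
    "has_readme": 0.5,
}


def weighted_score(features, weights=WEIGHTS, scale=10.0):
    """Weighted average of 0..1 features, mapped onto 0..scale.

    Features missing from the input count as 0.
    """
    total_weight = sum(weights.values())
    raw = sum(w * features.get(name, 0.0) for name, w in weights.items())
    return scale * raw / total_weight
```

Dividing by the sum of the weights keeps the result on the same 0 - 10 scale no matter how many features are added or how the weights are tuned, so a cron job could recompute scores without changing how they are displayed.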

regards,