RepoReapers / reaper

Calculate the score of a repository based on best engineering practices.
https://reporeapers.github.io/
Apache License 2.0

Remove Dependency on GHTorrent #20

Open nuthanmunaiah opened 4 years ago

nuthanmunaiah commented 4 years ago

Description

reaper requires the GHTorrent database to be restored to a MySQL/MariaDB instance. Restoring the full GHTorrent database before running reaper is prohibitively time-consuming (the GHTorrent database dump from 2019-06-01 is over 100 GB in size). Removing the dependency on GHTorrent will require reaper to mine GitHub directly for the repository data and metadata that the GHTorrent project has already mined. On the other hand, users will no longer need to restore data and metadata for several million repositories when all they want to do is analyze a few.
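
As a rough illustration of what mining GitHub directly could look like for a single repository, here is a minimal sketch against the GitHub REST API endpoint `GET /repos/{owner}/{repo}`. It is not reaper's implementation: the function names (`build_request`, `extract_metadata`, `fetch_metadata`), the `reaper-sketch` user agent, and the choice of fields are all assumptions made for the example, and the fields reaper actually reads from GHTorrent may differ.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_request(owner, repo, token=None):
    """Build a request for GET /repos/{owner}/{repo} (hypothetical helper)."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "reaper-sketch",  # assumed name; GitHub requires some user agent
    }
    if token:
        # A token is optional, but unauthenticated requests have a much lower rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{API_ROOT}/repos/{owner}/{repo}", headers=headers)


def extract_metadata(payload):
    """Keep a small, assumed subset of the fields in the API response."""
    license_info = payload.get("license") or {}  # "license" is null for unlicensed repos
    return {
        "full_name": payload.get("full_name"),
        "language": payload.get("language"),
        "stars": payload.get("stargazers_count"),
        "forks": payload.get("forks_count"),
        "created_at": payload.get("created_at"),
        "pushed_at": payload.get("pushed_at"),
        "license": license_info.get("spdx_id"),
    }


def fetch_metadata(owner, repo, token=None):
    """Fetch and reduce the metadata for one repository (performs a network call)."""
    request = build_request(owner, repo, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_metadata(json.load(response))
```

Calling `fetch_metadata("RepoReapers", "reaper")` would retrieve metadata for just that repository, which is the trade-off described above: one API call per repository of interest instead of a full database restore, at the cost of being bound by GitHub's rate limits.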

JEIEJEE commented 4 years ago

I think this would be convenient for outliers.