niccokunzmann / first_timer_scraper

Find pull-requests and issues of first time contributors
http://firsttimers.quelltext.eu
GNU Affero General Public License v3.0

Use Github API to scrape the commits. #3

Open niccokunzmann opened 7 years ago

niccokunzmann commented 7 years ago

@Abhi2424shek told me this: I saw you would be cloning all the repos. Won't that be overkill? @niccokunzmann, to get the commit descriptions: when an org registers a repo, use the GitHub API to fetch the last 5000 commits, then take the last commit's timestamp from the API and make another call with the `since` parameter set to the timestamp you got earlier. Then capture the next 5000 and repeat this procedure. You won't need to query the commits again and again from the start. You would only need to query the ones not yet queried, which you can get by passing the timestamp of the latest commit you queried as the `since` parameter.
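The incremental part of this idea could look roughly like the sketch below. It is only an illustration, not code from this repository: `make_github_fetcher`, `fetch_new_commits` and `latest_timestamp` are hypothetical names, and it assumes the "list commits" endpoint (`GET /repos/{owner}/{repo}/commits`) with its `since`, `per_page` and `page` query parameters, returning commits newest first. Commits are deduplicated by `sha` because a commit made exactly at the `since` timestamp may be returned again.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.github.com"


def make_github_fetcher(owner, repo, token=None, per_page=100):
    """Build a fetch_page(since, page) function for one repository (hypothetical helper)."""
    def fetch_page(since, page):
        params = {"per_page": per_page, "page": page}
        if since:
            params["since"] = since  # ISO 8601 timestamp, e.g. 2017-03-03T00:00:00Z
        url = f"{API}/repos/{owner}/{repo}/commits?{urllib.parse.urlencode(params)}"
        request = urllib.request.Request(
            url, headers={"Accept": "application/vnd.github+json"})
        if token:
            request.add_header("Authorization", f"Bearer {token}")
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_page


def fetch_new_commits(fetch_page, since=None, seen=None, per_page=100, max_pages=50):
    """Page through commits newer than `since`, skipping shas already in `seen`."""
    seen = set() if seen is None else seen
    new_commits = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(since, page)
        for commit in batch:
            if commit["sha"] not in seen:
                seen.add(commit["sha"])
                new_commits.append(commit)
        if len(batch) < per_page:  # a short page means there is nothing further
            break
    return new_commits


def latest_timestamp(commits, previous=None):
    """Return the newest committer date, to be used as `since` on the next run."""
    # Assumes all dates share the same UTC ISO 8601 format, so string order is time order.
    dates = [c["commit"]["committer"]["date"] for c in commits]
    if previous:
        dates.append(previous)
    return max(dates) if dates else None
```

On each run you would call `fetch_new_commits(fetcher, since=stamp, seen=seen)` and then store `latest_timestamp(result, stamp)` together with the `seen` set, so that the next run only asks for commits made after that point.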

abishekvashok commented 7 years ago

Hey @niccokunzmann, for it to make sense you need the last few lines as well. I think it was:

GitHub allows you to take only 5000 commits per API token in an hour, so make use of multiple tokens to do the same. Since these tokens become usable again after an hour, they can be reused as well.
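A minimal sketch of such a token rotation is below. `TokenPool` is a hypothetical class, not something from this repository. The budget of 5000 per hour is taken from the comment above. As far as I know, GitHub counts this limit in API requests rather than commits, and tokens that belong to the same account may share one limit, so treat the numbers as assumptions and check the rate limit documentation.

```python
import time


class TokenPool:
    """Hand out API tokens, switching to the next one when a token's hourly budget is spent."""

    def __init__(self, tokens, limit=5000, window=3600, clock=time.time):
        self._limit = limit      # assumed budget per token and window
        self._window = window    # window length in seconds (one hour)
        self._clock = clock      # injectable clock, handy for testing
        # Per token: [uses in the current window, time the window started].
        self._state = {token: [0, None] for token in tokens}

    def acquire(self):
        """Return a token with budget left, or None if all tokens are exhausted."""
        now = self._clock()
        for token, (used, started) in self._state.items():
            # After the window has passed, the token can be reused from scratch.
            if started is not None and now - started >= self._window:
                used, started = 0, None
            if used < self._limit:
                self._state[token] = [used + 1, now if started is None else started]
                return token
        return None
```

Each API call would first ask the pool for a token with `pool.acquire()`. When it returns `None`, every token is used up and the scraper has to wait until the oldest window ends.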