paulodiovani / hacktoberrank

Hacktoberfest Rank
https://hacktoberrank-challenge.herokuapp.com/
MIT License

Fetch and list github pull requests #4

Closed paulodiovani closed 4 years ago

paulodiovani commented 4 years ago

Use the GitHub Search API (https://developer.github.com/v3/search/#search-issues-and-pull-requests) to fetch pull requests created in the selected period.

The search should filter by date, from October 1st to October 31st of the selected year (it can be fixed to 2019 to start with), and group the results by user. Then sort the results in descending order (most contributions to fewest).
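The grouping and sorting step could look roughly like the sketch below. It assumes each search result carries the author under `user.login`, as issue and pull request objects in the Search API response do; the function name `rank_by_user` and the sample data are hypothetical, and the project itself may implement this differently.

```python
from collections import Counter


def rank_by_user(items):
    """Count pull requests per author and sort from most to fewest."""
    counts = Counter(item["user"]["login"] for item in items)
    # most_common() returns (login, count) pairs sorted by count, descending.
    return counts.most_common()


# Hypothetical sample shaped like the "items" array of a search response.
sample_items = [
    {"user": {"login": "alice"}},
    {"user": {"login": "bob"}},
    {"user": {"login": "alice"}},
]

print(rank_by_user(sample_items))  # [('alice', 2), ('bob', 1)]
```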

Note that the GitHub API limits the number of queries. Try to get the results in a single request.

Details on searching issues and pull requests can be found at https://help.github.com/en/articles/searching-issues-and-pull-requests

arabyalhomsi commented 4 years ago

I claim this one :)

arabyalhomsi commented 4 years ago

I'm not sure you can search GitHub without a search query. That means I can't retrieve only recent pull requests (or those within a specific month) without also supplying search terms. @paulodiovani

paulodiovani commented 4 years ago

You can. We use a similar approach in an internal project at Codeminer to search for collaborators' PRs. Let me get an example...

arabyalhomsi commented 4 years ago

I think I figured it out. This works: is:pr created:2019-10-01..2019-10-31
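For reference, the same qualifier can be built for any year and turned into a request URL. This is a minimal sketch: the helper names are hypothetical, and `per_page` and `page` are the standard pagination parameters of the Search API rather than anything mentioned in this thread.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.github.com/search/issues"


def hacktoberfest_query(year):
    """Build the search qualifier for pull requests created in October."""
    return f"is:pr created:{year}-10-01..{year}-10-31"


def search_url(year, per_page=100, page=1):
    """Return the full, URL-encoded search request for the given year."""
    params = {"q": hacktoberfest_query(year), "per_page": per_page, "page": page}
    return f"{SEARCH_URL}?{urlencode(params)}"


print(hacktoberfest_query(2019))
print(search_url(2019))
```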

paulodiovani commented 4 years ago

Exactly. :) Here is an example:

curl https://api.github.com/search/issues\?q\=type:pr+created:2019-10-01..2019-10-31

arabyalhomsi commented 4 years ago

The GitHub API only lets you receive 100 pull requests per page. About 654527 pull requests have been created since the beginning of this month, and you can make at most 30 requests per minute. This means that I can retrieve only 3000 pull requests per minute, so it is not possible to go through all of them and group them by user.
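The arithmetic behind these figures can be checked with a short calculation. It uses only the numbers quoted in the comment above and ignores any other limits, such as the cap on how many results a single search query returns (1,000, as far as the Search API documentation states).

```python
import math

TOTAL_PRS = 654_527          # pull requests quoted in the comment
PER_PAGE = 100               # maximum results per page
REQUESTS_PER_MINUTE = 30     # search rate limit quoted in the comment

requests_needed = math.ceil(TOTAL_PRS / PER_PAGE)
prs_per_minute = PER_PAGE * REQUESTS_PER_MINUTE
hours_needed = requests_needed / REQUESTS_PER_MINUTE / 60

print(requests_needed)         # 6546
print(prs_per_minute)          # 3000
print(round(hours_needed, 1))  # 3.6
```

Under these assumptions, a full pass would need several hours of continuous requests, which is far too slow to run on every page load.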

I am trying to find another approach. I hope you have suggestions! @paulodiovani

paulodiovani commented 4 years ago

Yes, I am aware of this limit. That's part of the reason we have issue #7.

I wouldn't worry so much for now...

The intention of this first version (no database / no cache) is to make a proof of concept and have some data to allow front-end development, so it is not a problem to have just the first 100 PRs.

Once we have the cache/db in place, we can trigger the update periodically (I suggested once an hour, but we could probably run it only a couple of times a day), and at the end of October we won't need to update at all (until next year).
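One way to picture the periodic update is a small time-based cache that refetches only when the stored data is older than a chosen interval. This is an illustrative sketch only: `TimedCache` and `fake_fetch` are hypothetical names, the one-hour default mirrors the suggestion above, and the cache/db planned for the project may work quite differently.

```python
import time


class TimedCache:
    """Keep the last fetched value and refresh it once it is too old."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.monotonic):
        self._fetch = fetch              # callable that does the real work
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so it can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


def fake_fetch():
    # Stand-in for the real GitHub search call (assumption for this sketch).
    return [("alice", 2), ("bob", 1)]


ranking = TimedCache(fake_fetch)
print(ranking.get())  # fetched once, then served from memory for an hour
```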

Another option would be to split this period into smaller ones (fetching only 2 or 3 days per request, for example).
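Splitting the month into smaller windows could be sketched as follows. The helper names and the 3-day default are assumptions for illustration; the windows do not overlap, since the `created:A..B` range is inclusive on both ends and overlapping windows would count some pull requests twice.

```python
from datetime import date, timedelta


def date_windows(start, end, days=3):
    """Yield (first_day, last_day) pairs covering start..end inclusive."""
    current = start
    while current <= end:
        last = min(current + timedelta(days=days - 1), end)
        yield current, last
        current = last + timedelta(days=1)


def window_queries(year, days=3):
    """Build one search qualifier per window of October in the given year."""
    windows = date_windows(date(year, 10, 1), date(year, 10, 31), days)
    return [
        f"type:pr created:{first.isoformat()}..{last.isoformat()}"
        for first, last in windows
    ]


for query in window_queries(2019):
    print(query)
```

Each query still has to be paginated and is still subject to the rate limit, so this reduces the size of each result set rather than the total number of requests.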

> I am trying to find another approach

I'm curious, let me know your idea. :)

paulodiovani commented 4 years ago

Pull request #26 partially solves this issue, but we still need to list the results in the front-end.

For now a simple HTML list will do; styles will be improved in #8.

arabyalhomsi commented 4 years ago

Trying to make a simple list.

arabyalhomsi commented 4 years ago

https://github.com/paulodiovani/hacktoberrank/pull/29

paulodiovani commented 4 years ago

Done!