dogsheep / github-to-sqlite

Save data from GitHub to a SQLite database
https://github-to-sqlite.dogsheep.net/
Apache License 2.0
403 stars 43 forks source link

Add --merged-by flag to pull-requests sub command #66

Open sarcasticadmin opened 3 years ago

sarcasticadmin commented 3 years ago

Description

Proposing a solution to the API limitation for merged_by in pull_requests. Specifically the following called out in the readme:

Note that the merged_by column on the pull_requests table will only be populated for pull requests that are loaded using the --pull-request option - the GitHub API does not return this field for pull requests that are loaded in bulk.

This approach might cause larger repos to hit rate limits called out in https://github.com/dogsheep/github-to-sqlite/issues/51 but seems to work well in the repos I tested and included below.

Old Behavior

New Behavior

Testing

Picking some repo that has more than one merger (datasette only has 1 😉 )

$ github-to-sqlite pull-requests ./github.db opnsense/tools --merged-by
$ echo "select id, url, merged_by from pull_requests;" | sqlite3 ./github.db 
83533612|https://github.com/opnsense/tools/pull/39|1915288
102632885|https://github.com/opnsense/tools/pull/43|1915288
149114810|https://github.com/opnsense/tools/pull/57|1915288
160394495|https://github.com/opnsense/tools/pull/64|1915288
163308408|https://github.com/opnsense/tools/pull/67|1915288
169723264|https://github.com/opnsense/tools/pull/69|1915288
171381422|https://github.com/opnsense/tools/pull/72|1915288
179938195|https://github.com/opnsense/tools/pull/77|1915288
196233824|https://github.com/opnsense/tools/pull/82|1915288
215289964|https://github.com/opnsense/tools/pull/93|
219696100|https://github.com/opnsense/tools/pull/97|1915288
223664843|https://github.com/opnsense/tools/pull/99|
228446172|https://github.com/opnsense/tools/pull/103|1915288
238930434|https://github.com/opnsense/tools/pull/110|1915288
255507110|https://github.com/opnsense/tools/pull/119|1915288
255980675|https://github.com/opnsense/tools/pull/120|1915288
261906770|https://github.com/opnsense/tools/pull/125|
263800503|https://github.com/opnsense/tools/pull/127|1915288
264038685|https://github.com/opnsense/tools/pull/128|1915288
264696704|https://github.com/opnsense/tools/pull/129|1915288
266660547|https://github.com/opnsense/tools/pull/130|1915288
273120409|https://github.com/opnsense/tools/pull/133|1915288
274370803|https://github.com/opnsense/tools/pull/135|
276600629|https://github.com/opnsense/tools/pull/139|
277303655|https://github.com/opnsense/tools/pull/141|1915288
293033714|https://github.com/opnsense/tools/pull/145|
294827649|https://github.com/opnsense/tools/pull/146|
295140008|https://github.com/opnsense/tools/pull/147|1915288
305690829|https://github.com/opnsense/tools/pull/150|9783985
307077931|https://github.com/opnsense/tools/pull/152|1915288
321782100|https://github.com/opnsense/tools/pull/155|
337265672|https://github.com/opnsense/tools/pull/160|
337267484|https://github.com/opnsense/tools/pull/161|1915288
368251763|https://github.com/opnsense/tools/pull/169|
428262505|https://github.com/opnsense/tools/pull/181|
437557011|https://github.com/opnsense/tools/pull/182|1915288
447079893|https://github.com/opnsense/tools/pull/185|
461822092|https://github.com/opnsense/tools/pull/191|
463290142|https://github.com/opnsense/tools/pull/193|1915288
470112962|https://github.com/opnsense/tools/pull/194|1915288
472644649|https://github.com/opnsense/tools/pull/195|1915288
488696898|https://github.com/opnsense/tools/pull/198|
513289902|https://github.com/opnsense/tools/pull/201|
522530265|https://github.com/opnsense/tools/pull/203|
564443347|https://github.com/opnsense/tools/pull/213|
597579516|https://github.com/opnsense/tools/pull/220|1915288
602860357|https://github.com/opnsense/tools/pull/221|1915288
608744738|https://github.com/opnsense/tools/pull/222|1915288
623279673|https://github.com/opnsense/tools/pull/228|1915288
664656182|https://github.com/opnsense/tools/pull/233|
664781786|https://github.com/opnsense/tools/pull/234|1915288
670683636|https://github.com/opnsense/tools/pull/235|1915288
683150764|https://github.com/opnsense/tools/pull/237|
685016233|https://github.com/opnsense/tools/pull/238|
687099825|https://github.com/opnsense/tools/pull/239|1915288
715705652|https://github.com/opnsense/tools/pull/244|1915288
715721248|https://github.com/opnsense/tools/pull/245|1915288

userid are now present for those PRs that were merged.

Without the flag the merged_by behavior remains missing as expected when get PRs bulk:

$ github-to-sqlite pull-requests ./github.db opnsense/tools
$ echo "select id, url, merged_by from pull_requests;" | sqlite3 ./github.db 
83533612|https://github.com/opnsense/tools/pull/39|
102632885|https://github.com/opnsense/tools/pull/43|
149114810|https://github.com/opnsense/tools/pull/57|
160394495|https://github.com/opnsense/tools/pull/64|
163308408|https://github.com/opnsense/tools/pull/67|
169723264|https://github.com/opnsense/tools/pull/69|
171381422|https://github.com/opnsense/tools/pull/72|
179938195|https://github.com/opnsense/tools/pull/77|
196233824|https://github.com/opnsense/tools/pull/82|
215289964|https://github.com/opnsense/tools/pull/93|
219696100|https://github.com/opnsense/tools/pull/97|
223664843|https://github.com/opnsense/tools/pull/99|
228446172|https://github.com/opnsense/tools/pull/103|
238930434|https://github.com/opnsense/tools/pull/110|
255507110|https://github.com/opnsense/tools/pull/119|
255980675|https://github.com/opnsense/tools/pull/120|
261906770|https://github.com/opnsense/tools/pull/125|
263800503|https://github.com/opnsense/tools/pull/127|
264038685|https://github.com/opnsense/tools/pull/128|
264696704|https://github.com/opnsense/tools/pull/129|
266660547|https://github.com/opnsense/tools/pull/130|
273120409|https://github.com/opnsense/tools/pull/133|
274370803|https://github.com/opnsense/tools/pull/135|
276600629|https://github.com/opnsense/tools/pull/139|
277303655|https://github.com/opnsense/tools/pull/141|
293033714|https://github.com/opnsense/tools/pull/145|
294827649|https://github.com/opnsense/tools/pull/146|
295140008|https://github.com/opnsense/tools/pull/147|
305690829|https://github.com/opnsense/tools/pull/150|
307077931|https://github.com/opnsense/tools/pull/152|
321782100|https://github.com/opnsense/tools/pull/155|
337265672|https://github.com/opnsense/tools/pull/160|
337267484|https://github.com/opnsense/tools/pull/161|
368251763|https://github.com/opnsense/tools/pull/169|
428262505|https://github.com/opnsense/tools/pull/181|
437557011|https://github.com/opnsense/tools/pull/182|
447079893|https://github.com/opnsense/tools/pull/185|
461822092|https://github.com/opnsense/tools/pull/191|
463290142|https://github.com/opnsense/tools/pull/193|
470112962|https://github.com/opnsense/tools/pull/194|
472644649|https://github.com/opnsense/tools/pull/195|
488696898|https://github.com/opnsense/tools/pull/198|
513289902|https://github.com/opnsense/tools/pull/201|
522530265|https://github.com/opnsense/tools/pull/203|
564443347|https://github.com/opnsense/tools/pull/213|
597579516|https://github.com/opnsense/tools/pull/220|
602860357|https://github.com/opnsense/tools/pull/221|
608744738|https://github.com/opnsense/tools/pull/222|
623279673|https://github.com/opnsense/tools/pull/228|
664656182|https://github.com/opnsense/tools/pull/233|
664781786|https://github.com/opnsense/tools/pull/234|
670683636|https://github.com/opnsense/tools/pull/235|
683150764|https://github.com/opnsense/tools/pull/237|
685016233|https://github.com/opnsense/tools/pull/238|
687099825|https://github.com/opnsense/tools/pull/239|
715705652|https://github.com/opnsense/tools/pull/244|
715721248|https://github.com/opnsense/tools/pull/245|

Individual PRs passed via --pull-request flag behaves as expected (unchanged):

$ github-to-sqlite pull-requests ./github.db opnsense/tools --pull-request 39 --pull-request 237
$ echo "select id, url, merged_by from pull_requests;" | sqlite3 ./github.db
83533612|https://github.com/opnsense/tools/pull/39|1915288
683150764|https://github.com/opnsense/tools/pull/237|

Picking 1 PR that has a merged_by (39) and one that does not (237)

sarcasticadmin commented 2 years ago

@simonw any feedback/thoughts?