GitHub API Bot which makes PRs to Ignore Common Files
Well, dang it
The GitHub Search API provides up to 1,000 results for each search.
Meaning, of the 500,000 results, we can see at most 3,000 by combining the different sort orders (roughly 1,000 results per ordering).
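The arithmetic behind that cap, as a quick Ruby sketch (the assumption that exactly three orderings are usable for code search — best match plus indexed ascending/descending — is mine):

```ruby
PER_SEARCH_CAP = 1_000  # hard cap on results returned per search query
SORT_ORDERS    = 3      # best match, indexed asc, indexed desc (assumed set)
TOTAL_RESULTS  = 500_000

visible = PER_SEARCH_CAP * SORT_ORDERS
puts "can see at most #{visible} of #{TOTAL_RESULTS} results"
```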
Converting cURL to Ruby
Example Query for .DS_Store
API Docs for Searching
API Docs for Search Query Syntax
API Docs for Search Query on Code Syntax
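A cURL-to-Ruby conversion for the example `.DS_Store` query can be sketched with `Net::HTTP` from the standard library. The query string, headers, and `GITHUB_TOKEN` environment variable are assumptions based on the search docs linked above, not a confirmed final request:

```ruby
require "net/http"
require "json"
require "uri"

# Assumed cURL original:
#   curl -H "Accept: application/vnd.github+json" \
#        -H "Authorization: Bearer $GITHUB_TOKEN" \
#        "https://api.github.com/search/code?q=filename:.DS_Store&per_page=100"
def search_ds_store(page: 1, token: ENV["GITHUB_TOKEN"])
  uri = URI("https://api.github.com/search/code")
  uri.query = URI.encode_www_form(q: "filename:.DS_Store", per_page: 100, page: page)

  req = Net::HTTP::Get.new(uri)
  req["Accept"] = "application/vnd.github+json"
  req["Authorization"] = "Bearer #{token}" if token

  res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
  JSON.parse(res.body)
end
```

Each page holds up to 100 items, so page 10 is the last one the API will serve under the 1,000-result cap.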
Required Files:
Database: ignore_bot
Table: repos
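A possible shape for the `repos` table, kept as a schema string here; every column name is a guess at what the bot would need to track (the notes only name the database and table):

```ruby
# Hypothetical schema for tracking each repo through the pipeline.
REPOS_SCHEMA = <<~SQL
  CREATE TABLE IF NOT EXISTS repos (
    id         INTEGER PRIMARY KEY,
    full_name  TEXT NOT NULL UNIQUE, -- "owner/name" from the search results
    forked_at  TEXT,                 -- when our fork was created
    pr_url     TEXT,                 -- the PR we opened, if any
    fork_deleted INTEGER DEFAULT 0   -- cleanup done?
  );
SQL
```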
Authentication: rate limit of 5,000 requests per hour per authenticated user
500,000 Repos with .DS_Store files
API Requests
List 100 per Request: 5,000 Requests to list all
Fork Repos: 500,000 Requests
Edit Files: 500,000 Requests
Create PRs: 500,000 Requests
Delete Forks: 500,000 Requests
Total Requests: ~2,000,000
With 100 Users @ 5,000 Requests/Hour Each (500,000 Requests/Hour), it would take 4 Hours
With 1 User @ 5,000 Requests/Hour, it would take 17 Days
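The request totals above check out; here is the same arithmetic in Ruby:

```ruby
repo_count    = 500_000
list_requests = repo_count / 100          # 5,000 pages of 100 repos each
per_repo_ops  = 4                         # fork, edit, create PR, delete fork
total         = list_requests + per_repo_ops * repo_count  # 2,005,000 ~ 2,000,000

rate = 5_000                              # requests/hour per authenticated user
hours_with_100_users = total / (100 * rate).to_f  # ~4 hours
days_with_1_user     = total / rate.to_f / 24     # ~16.7, i.e. about 17 days
```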
Network Traffic
List Repos (per 100): 100 B up, 500 KB down; 500 KB × 5,000 = 2,500 MB
Fork Repos: 100 B × 500,000 = 50 MB
Edit Files: 100 B × 500,000 = 50 MB
Create PRs: 100 B × 500,000 = 50 MB
Delete Forks: 100 B × 500,000 = 50 MB
Total Traffic: 2,700 MB
With a network speed of 1 MB/s, it would take 45 Minutes
100 B is an approximation of the size of each request sent.
The 500 KB per list response comes from counting 1 result as 5 KB times 100 results per page.
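The traffic estimate, sketched with the same approximations (5 KB per result, 100 B per request):

```ruby
KB = 1_000
MB = 1_000 * KB

per_result = 5 * KB
per_page   = 100 * per_result        # 500 KB down per list request
list_down  = 5_000 * per_page        # 2,500 MB of search-result pages
write_ops  = 4 * 500_000 * 100       # fork/edit/PR/delete at ~100 B each = 200 MB

total_bytes = list_down + write_ops  # 2,700 MB
seconds     = total_bytes / MB       # at 1 MB/s
minutes     = seconds / 60.0         # 45 minutes
```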