nodiscc / hecat

Generic automation tool around data stored as plaintext YAML files
GNU General Public License v3.0

processors/github_metadata: allow sleeping for a configurable amount of time before each GitHub API call #97

Closed nodiscc closed 1 year ago

nodiscc commented 1 year ago

https://docs.github.com/en/rest/overview/resources-in-the-rest-api?apiVersion=2022-11-28#rate-limits-for-requests-from-github-actions

When using GITHUB_TOKEN, the rate limit is 1,000 requests per hour per repository.

Each run of get_gh_metadata() causes 2 API calls (one for the repo object, and one for the commits list). Hence the API rate limit will trigger if the input list contains more than ~500 GitHub repositories (in fact a bit less, since API calls for authentication are also counted).

For example, in https://github.com/awesome-selfhosted/awesome-selfhosted-data/actions/runs/4661828746/jobs/8267454945, retrieving data from the API fails after 493 repositories. The input list contains 1029 GitHub repositories (~2058 API requests). At 1,000 requests per hour, that is one request every 3.6 seconds, so we should wait ~7.2 seconds before each get_gh_metadata() run (2 API calls each) to not exceed the rate limit.
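A minimal sketch of the arithmetic and of a throttled loop, assuming the 1,000 requests/hour limit and 2 API calls per repository described above. The names `delay_per_repo`, `fetch_all`, and the `sleep_seconds` parameter are hypothetical and are not taken from hecat's code; `fetch` stands in for whatever function retrieves the metadata of one repository.

```python
import time

RATE_LIMIT_PER_HOUR = 1000  # GITHUB_TOKEN limit quoted in this issue
CALLS_PER_REPO = 2          # one call for the repo object, one for the commits list


def delay_per_repo(rate_limit_per_hour=RATE_LIMIT_PER_HOUR,
                   calls_per_repo=CALLS_PER_REPO):
    """Seconds to wait before each repository to stay under the hourly limit."""
    seconds_per_request = 3600 / rate_limit_per_hour
    return seconds_per_request * calls_per_repo


def fetch_all(repos, fetch, sleep_seconds, sleep=time.sleep):
    """Call fetch(repo) for each repo, sleeping sleep_seconds before each call.

    `sleep` is injectable so the loop can be exercised without really waiting.
    """
    results = []
    for repo in repos:
        sleep(sleep_seconds)
        results.append(fetch(repo))
    return results
```

With these numbers, `delay_per_repo()` gives 7.2 seconds, so processing 1029 repositories would take roughly 1029 × 7.2 ≈ 7409 seconds, a little over two hours. A slightly larger value leaves headroom for the authentication calls that also count against the limit.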

Another option is to use a personal access token, or to create a GitHub App.