mona-actions / gh-repo-stats

GH CLI extension to pull statistics on repository metadata used in GitHub migrations
MIT License

Improving efficiency of GitHub API utilization and handling of rate limiting condition #31

Open andyfeller opened 2 years ago

andyfeller commented 2 years ago

Overview

I'm working with a customer who uses the gh-repo-stats extension; they find its final evaluation of whether there might be issues helpful for migration planning. Thank you for building and maturing this extension! 🎉 🙇

@saharora and I ran into a rate limit error after ~67 repositories were assessed. As a rough estimate, about 100 API calls are being made for each repository, assuming no other activity against the PAT was going on at the time. I need to look at the underlying code again to see how much of the information comes from GraphQL and whether there is room for optimization.

(env) % gh repo-stats --org XXXXXXXX --repo-page-size 10

######################################################
######################################################
############# GitHub repo list and sizer #############
######################################################
######################################################

------------------------------------------------------
Please create a GitHub Personal Access Token used to gather
information from your Organization, with a scope of 'repo',
followed by [ENTER]:
(note: your input will NOT be displayed)
Creating file header...
------------------------------------------------------
Getting repositories for org: XXXXXXXXXXX
[5000] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[4048] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Error getting more Pull Requests for Repo: XXXXXXXXXX
{
   "data": null,
   "errors":[
      {
         "message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `267E:58B0:29D30:361FB:62B205EF` when reporting this issue."
      }
   ]
}
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[3615] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[3390] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[3165] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[2525] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[1468] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
ERROR --- Errors occurred while retrieving pull requests for repo: XXXXXXXXXX
{
  "type": "RATE_LIMITED",
  "message": "API rate limit exceeded for user ID 60168593."
}
jq: error (at <stdin>:0): Cannot iterate over null (null)

######################################################
ERROR! Failed response back from GitHub!
Please validate your PAT, Organization, and access levels!
######################################################
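
A rough way to check the per-repository estimate is to diff the "[N] API attempts remaining" readings in the log above. This is only a sketch: `usage_per_page` is a hypothetical helper, not part of gh-repo-stats, and it assumes each reading is taken once per page of 10 repositories (matching `--repo-page-size 10`) with nothing else consuming the token's budget.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gh-repo-stats).
# usage_per_page PAGE_SIZE READING [READING...]
# Prints the budget consumed between consecutive "remaining" readings
# and the integer average per repository.
usage_per_page() {
  local page_size="$1"; shift
  local prev="$1"; shift
  local page=1 reading used
  for reading in "$@"; do
    used=$(( prev - reading ))
    echo "page ${page}: ${used} used, ~$(( used / page_size )) per repository"
    prev="${reading}"
    page=$(( page + 1 ))
  done
}

# Readings copied from the log above.
usage_per_page 10 5000 4048 3615 3390 3165 2525 1468
```

With these readings the cost per page varies from 225 to 1057, so roughly 22 to 105 per repository. Note that the GraphQL limit is counted in points rather than raw calls, so these figures are only a proxy for the number of requests.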

Ask

  1. Ideas on optimizing GitHub API utilization for customers with hundreds of repositories to assess
  2. Ideas on improving the error handling around rate limiting
  3. Ideas on what permissions would be needed to run this extension with a GitHub App generated PAT via https://github.com/Link-/gh-token
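
For the second ask, one possible direction is to recognize the `RATE_LIMITED` error shown in the log and sleep until the limit resets instead of letting `jq` fail on a null payload. The sketch below is an assumption, not the extension's current behavior: `is_rate_limited`, `seconds_until_reset`, and `wait_for_graphql_reset` are hypothetical names, and it relies on the `resources.graphql.reset` field (epoch seconds) returned by the `/rate_limit` endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of these names exist in gh-repo-stats.

# Succeeds when a response body carries a RATE_LIMITED error type,
# like the one printed in the log above.
is_rate_limited() {
  printf '%s' "$1" | grep -q '"type"[[:space:]]*:[[:space:]]*"RATE_LIMITED"'
}

# Prints how many seconds to sleep until the limit resets.
# $1 = reset time (epoch seconds), $2 = current time (epoch seconds).
seconds_until_reset() {
  local reset="$1" now="$2" delay
  delay=$(( reset - now + 1 ))   # one second of slack past the reset
  if [ "${delay}" -lt 0 ]; then
    delay=0                      # reset time already passed
  fi
  echo "${delay}"
}

# Sleeps until the GraphQL limit resets. Assumes GITHUB_URL and
# GITHUB_TOKEN are set the same way the extension sets them.
wait_for_graphql_reset() {
  local reset
  reset=$(curl -s -H "Authorization: Bearer ${GITHUB_TOKEN}" \
    "${GITHUB_URL}/rate_limit" | jq -r '.resources.graphql.reset')
  sleep "$(seconds_until_reset "${reset}" "$(date +%s)")"
}
```

A caller could check `is_rate_limited "${RESPONSE}"` after each GraphQL request (where `RESPONSE` is whatever variable holds the response body), call `wait_for_graphql_reset`, and then retry the same request rather than aborting the whole run.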
andyfeller commented 2 years ago

Verifying that the "API attempts remaining" notice reflects the GraphQL rate limit only 👍

From the CheckAPILimit() function:

CheckAPILimit() {
  ##############################################################
  # Check what is remaining, and if 0, we need to sleep it off #
  ##############################################################
  API_REMAINING_REQUEST=$(curl -s -X GET \
    --url "${GITHUB_URL}/rate_limit" \
    -H "Authorization: Bearer ${GITHUB_TOKEN}")

  Debug "DEBUG --- API REMAINING DATA BLOCK:"
  DebugJQ "${API_REMAINING_REQUEST}"

  API_REMAINING_MESSAGE=$(echo "${API_REMAINING_REQUEST}" \
    | jq -r '.message' 2>&1)

  if [[ "${API_REMAINING_MESSAGE}" != "Rate limiting is not enabled." ]]; then
    API_REMAINING=$(echo "${API_REMAINING_REQUEST}" \
      | jq -r '.resources.graphql.remaining' 2>&1);
  else
    API_REMAINING=9999999999
  fi