BetaHuhn / repo-file-sync-action

πŸ”„ GitHub Action to keep files like Action workflows or entire directories in sync between multiple repositories.
https://github.com/marketplace/actions/repo-file-sync-action
MIT License
288 stars 83 forks source link

Rate limit exceeded during pull-request #126

Closed arizonaherbaltea closed 2 years ago

arizonaherbaltea commented 3 years ago

🐞 Describe the bug

Appears that when attempting to sync many repos github may throttle the number of pull-requests that can be made and will result in only partially completing the full list.

...
Locally syncing file(s) between source and target repository
Source is directory
Pushing changes to target repository
Creating new PR
Error: You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later.
Error: HttpError: You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later.
Repository Info
Slug        : client-config
Owner       : acme-corp
Https Url   : https://github.com/acme-corp/client-config
Branch      : master

Locally syncing file(s) between source and target repository
Source is directory
Pushing changes to target repository
Creating new PR
Error: You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later.
Error: HttpError: You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later.
Repository Info
Slug        : app-config
Owner       : acme-corp
Https Url   : https://github.com/acme-corp/app-config
Branch      : master

Locally syncing file(s) between source and target repository
Source is directory
Pushing changes to target repository
Creating new PR
Error: You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later.
Error: HttpError: You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later.
Repository Info
Slug        : api-config
Owner       : acme-corp
Https Url   : https://github.com/acme-corp/api-config
Branch      : master
...
...

πŸ“š To Reproduce

Perform many syncs within one of the limits. Could be x per hour or x per minute.

πŸ’‘ Expected behavior

Either fast-fail and exit non-zero or retry after 5-10 seconds as per some (additional) action input. The action shouldn't indicate that it was successful if it didn't actually perform all the updates in the config (sync.yml).

πŸ–ΌοΈ Screenshots

image

notice the matrix shows 12 jobs, but my sync.yml has 20 repos listed.

image

βš™οΈ Environment

πŸ“‹ Additional context

Add any other context about the problem here.

BetaHuhn commented 2 years ago

We had this problem a while ago (#49) and we changed the action to use GitHub's plugin-throttling.js to throttle API requests based on the recommended request throttling best practices.

Maybe we have to tune its parameters a bit more.

Regarding the action not failing, see #95

arizonaherbaltea commented 2 years ago

I'm not any good at javascript but it doesn't seem like automatic retries are occurring because package.json doesn't include the retry plugin and therefore no dependency imports. https://github.com/BetaHuhn/repo-file-sync-action/blob/master/package.json#L32

    "@actions/core": "^1.6.0",
    "@actions/github": "^5.0.0",
    "@octokit/plugin-throttling": "^3.5.2",
    "@putout/git-status-porcelain": "^1.1.0",

https://octokit.github.io/rest.js/v18#throttling

Automatic retries
Many common request errors can be easily remediated by retrying the request. We recommend installing the @octokit/plugin-retry plugin for Automatic retries in these cases

Pertaining to the handling of failure, I'd still contend that the last line of code should be the exit status, which exits the process non-zero if any syncs fail to fulfill. I view the index.js like a simple while loop with a try catch to ensure each repo file sync has a chance to occur. It should be simple enough to add a count or array of all the failed syncs and either exit non-zero at the end or add an action output so I can determine if I want the github workflow to show green (success) or red (failure). Github actions provides an underlying context for continuing on a failed step and then processing the details of the failure.

jobs:
  example-job:
    steps:
      - name: Repo File Sync Action
        uses: BetaHuhn/repo-file-sync-action@v1
        id: sync
        continue-on-error: true

      - name: Check sync results
        env:
          SYNC_OUTCOME: ${{ steps.sync.outcome }}
        run: |
          if [[ "SYNC_OUTCOME" != 'success' ]]; then
            echo "Not all repo files syncs successfully occurred. Check the previous action step for more details."
            exit 1
          fi

My main concern stems from the fact that I'd technically have to parse sync.yml for all repos and figure out what the total count might be and then compare that with the number of outputted pull-request urls to determine if the action completed successfully or not, when it should be the responsibility of the action to handle that properly.


On another note, I really like this action and I'd try to fix it myself but don't know enough about JS to make it happen. πŸ’˜

BetaHuhn commented 2 years ago

plugin-retry.js won't solve this issue as it won't retry requests that fail with the 403 error code (see in the README of the plugin), which is what we get when hitting the rate limit. plugin-throttling.js doesn't need other plugins to retry requests. The two plugins you mentioned are separate and solve different unrelated problems.

Maybe our problem is related to https://github.com/octokit/plugin-throttling.js/issues/439#issuecomment-918186721. Since we added #49 I haven't had any issues with rate limits until now. Maybe the plugin is just not triggered correctly.

Re the failure, I completely agree with you. The action should set a none zero exit code when it failed to sync to at least one of the repositories. As you can read in the mentioned issue (https://github.com/BetaHuhn/repo-file-sync-action/discussions/95#discussioncomment-1277043), I just didn't have the time to implement it yet. It should be quite simple to add, feel free to give it a try :)

On another note, I really like this action and I'd try to fix it myself but don't know enough about JS to make it happen. πŸ’˜

No problem! I will investigate the rate limit issue if I have time this weekend.

samschurter commented 2 years ago

Plus one for this, I just added the action and configured it to sync 12 repos. It succeeded on 10 and failed on the last 2.

image

raeganbarker commented 2 years ago

I've just started encountering the same issue as well. The rate limit error seems to occur after the 10th repo in a sync.

p.s. Love this action. It's saved me so much time and manual effort!

BetaHuhn commented 2 years ago

Just for clarity, did it work before with >10 repos and only now you encounter the rate limit @raeganbarker?

p.s. Love this action. It's saved me so much time and manual effort!

Thanks! Glad you like it :)

cansavvy commented 2 years ago

I'm also encountering this error and I have an interest in getting around it! I think mine worked on 7 repositories before failing. But also I think my trigger may have been too lenient, the sync got triggered 3 times it seemed? https://github.com/jhudsl/DaSL_Course_Template_Bookdown/actions

I don't know JS but I'm happy to help anyway I can.

BetaHuhn commented 2 years ago

I still think the problem is that plugin-throttling.js is not being triggered. I will have to see how I can verify this.

The last change (https://github.com/BetaHuhn/repo-file-sync-action/commit/3007ff6ce457682c67154078f0a0e86892d9423b) to the plugin was a version bump with https://github.com/BetaHuhn/repo-file-sync-action/pull/105.

I will try to debug if that version changed something which could be a cause for the plugin to not trigger.

In the meantime, please share any logs of the rate limit occurring. I don't have 10 repositories to test this with :)

BetaHuhn commented 2 years ago

Looking at the code where the plugin is used, it should log a warning when hitting the limit:

https://github.com/BetaHuhn/repo-file-sync-action/blob/57f21592446fdcf96d38ceca9506b471670cfe5b/src/git.js#L31

I can't see that being logged in any of your logs.

This means the plugin is either not triggered because it fails somewhere or the rate limit is hit outside of the plugins control i.e. the octokit instance.

While looking around for similar issues I discovered https://github.com/peter-evans/create-pull-request/issues/855#issuecomment-852565210, this might be our issue as well. The plugin can't control the rate of git push requests, maybe they are hitting a rate limit at GitHub. Will have to investigate if this actually the problem.

BetaHuhn commented 2 years ago

Okay I can now reproduce this issue: https://github.com/BetaHuhn/gh-repo-automation/runs/4096146204?check_suite_focus=true#step:3:326

cansavvy commented 2 years ago

I can't see that being logged in any of your logs.

If you are looking at the most recent actions, I returned to an earlier version of BetaHuhn/repo-file-sync-action so that I could get this release out -- the earlier version, BetaHuhn/repo-file-sync-action@v1.7.1 which works (it is after #49 is incorporated, so that's why I chose that one).

This one shows the message that others have noted on this thread. https://github.com/jhudsl/DaSL_Course_Template_Bookdown/actions/runs/1417570063

So the fix you implemented here: https://github.com/betahuhn/repo-file-sync-action/commit/cee8520 worked, so if there's a way for you to incorporate it back in to the current version.

BetaHuhn commented 2 years ago

So the fix you implemented here: cee8520 worked, so if there's a way for you to incorporate it back in to the current version.

That's the thing, it is still there and wasn't changed since it was added. The only change was that the plugin used was bumped from 3.5.1 to 3.5.2.

I will revert that change and see if it works after that. The version before that change is v1.13.1.

cansavvy commented 2 years ago

Is it as simple as updating this part or something related to this change?: https://github.com/BetaHuhn/repo-file-sync-action/blob/8c50c0d781d6ef22a5b395616897adda0cb6d0b8/dist/index.js#L17008

Looks like octokit/plugin-throttling.js only changes are just having to do with changing abuse -> secondary rate See: https://github.com/octokit/plugin-throttling.js/commit/6c361d45f07b8adf5fda11340ffaf3aee8a1ecfb

I don't know any JS, just looking at where these things are referenced.

BetaHuhn commented 2 years ago

They changed abuse to secondary rate internally as GitHub changed the response, but the onAbuseLimit method stayed the same.

I tried a few older versions of this action and v1.8.0 also fails:

https://github.com/BetaHuhn/gh-repo-automation/runs/4096456566?check_suite_focus=true#step:3:381

You mentioned v1.7.1 works, so the issue must have been introduced with v1.8.0.

To verify this, could you repeat the same run with v1.7.1 and v1.8.0 (only changing the version) @cansavvy ?

cansavvy commented 2 years ago

Yes, v1.8.0 works too: https://github.com/jhudsl/DaSL_Course_Template_Bookdown/runs/4096640405?check_suite_focus=true

BetaHuhn commented 2 years ago

Then I am completely puzzled as v1.8.0 fails for me 😭

I am now pretty sure that the plugin doesn't trigger for some weird reason. I will continue to see what is causing this.

arizonaherbaltea commented 2 years ago

Yes, v1.8.0 works too: https://github.com/jhudsl/DaSL_Course_Template_Bookdown/runs/4096640405?check_suite_focus=true

Just to clarify, It doesn't fail on "no changes" and "overwriting pr" as this doesn't invoke the "new pr" api. Try removing all existing sync prs, make some minor (non-impactful) change to the source files, and then init the sync.

cansavvy commented 2 years ago

Just to clarify, It doesn't fail on "no changes" and "overwriting pr" as this doesn't invoke the "new pr" api. Try removing all existing sync prs, make some minor (non-impactful) change to the source files, and then init the sync.

OH I see. Well then, unfortunately I won't be able to test this for you just yet then. Once those PRs are taken care of by the respective owners of those repos, I can try again later. Sorry.

BetaHuhn commented 2 years ago

Ah yeah, thanks @herbaltealeaf, I missed that when looking through the logs. That explains why it worked for @cansavvy but not for me.

@cansavvy, I don't think you need to rerun it, we can assume it doesn't work with v1.8.0

BetaHuhn commented 2 years ago

So just to summarize my findings:

That is where I am currently at. If anyone has any ideas, feel free to share them :)

arizonaherbaltea commented 2 years ago

I'll be happy to test it for you, however I can't share the logs. So here's a github action workflow I've cooked up that should provide some additional insight into things.

I don't know how js is really constructed, but this is what I'd do in bash: add another try catch and implement my own back-off and retry pattern for creating a pull-request as increasing intervals like 1: 3s, 2: 5s, 3: 10secs, ect, max: 5 attempts, collect all the errored out sync attempts as a separate github action output, and ideally exit non-zero. This assumes that limits can be detected through api response message/headers.

A terrible, but possibly effective solution would be to implement a wait after creating a new pull-request of 2-5 seconds. That's a super hacky idea, just throwing it out there.

You'll need to create a github personal action token that has access to each sync repo and store as a repo secret ACTIONS_WORKFLOW_PAT in the repo containing the files to distribute. Adjust SYNC_CONFIG_PATH to point to your config file. PR_AUTO_MERGE will enable you to fully clean up ( by merging ) the pull-requests for better REPL. Lastly, of course change uses: BetaHuhn/repo-file-sync-action@master as appropriate.

.github/workflows/push-config.yaml ``` yaml name: Push Githubflow-Helm-Config Workflow on: workflow_dispatch: env: PR_AUTO_MERGE: 'true' SYNC_CONFIG_PATH: .github/githubflow-helm-config-workflow-sync.yml jobs: githubflow-helm-config-workflow-sync: runs-on: ubuntu-latest steps: - name: Checkout Github Workflows Sync Repository uses: actions/checkout@v2 - name: Sync Github Workflows uses: BetaHuhn/repo-file-sync-action@master id: sync with: GH_PAT: ${{ secrets.ACTIONS_WORKFLOW_PAT }} BRANCH_PREFIX: pushed_workflow CONFIG_PATH: ${{ env.SYNC_CONFIG_PATH }} COMMIT_PREFIX: '[skip ci]' GIT_USERNAME: "github-action" GIT_EMAIL: 'action@github.com' PR_LABELS: push-workflow - name: Evaluate PR Conditions id: eval env: PR_URLS: ${{ steps.sync.outputs.pull_request_urls }} run: | echo "::group::pull requests opened" echo "$PR_URLS" echo "::endgroup::" num_prs_opened=$( echo "$PR_URLS" | jq -rc '[.[]|select(. != null)] | length' ) if [[ $num_prs_opened > 0 ]]; then echo "::set-output name=prs_created::true" fi - name: Improve PR Matrix Readability id: read if: steps.eval.outputs.prs_created == 'true' env: PR_URLS: ${{ steps.sync.outputs.pull_request_urls }} run: | PR_ARRAY=$( while read -r url; do repo_owner="$( echo "${url}" | cut -d '/' -f 4 )" repo_name="$( echo "${url}" | cut -d '/' -f 5 )" pr_number="$( echo "${url}" | rev | cut -d '/' -f 1 | rev )" echo '{}' | jq -rc "{repo: \"$repo_name\", owner: \"$repo_owner\", pr: \"$pr_number\"}" done < <(echo "$PR_URLS" | jq -r '.[]') \ | jq -src '.|sort' ) echo "::group::json array" echo "$PR_ARRAY" echo "::set-output name=pr_list::${PR_ARRAY}" echo "::endgroup::" outputs: matrix: ${{ steps.sync.outputs.pr_list }} pr_auto_merge: ${{ env.PR_AUTO_MERGE }} prs_available: ${{ steps.eval.outputs.prs_created }} pr: runs-on: ubuntu-latest needs: [githubflow-helm-config-workflow-sync] if: needs.githubflow-helm-config-workflow-sync.outputs.pr_auto_merge == 'true' && needs.githubflow-helm-config-workflow-sync.outputs.prs_available == 'true' strategy: fail-fast: false max-parallel: 10 matrix: pull_request: ${{ fromJson(needs.githubflow-helm-config-workflow-sync.outputs.matrix) }} steps: - name: Parse Pull-Request Details id: repo_details env: PULL_REQUEST: ${{ matrix.pull_request }} run: | repo_owner="$( echo "${PULL_REQUEST}" | jq '.owner' )" repo_name="$( echo "${PULL_REQUEST}" | jq '.repo' )" pr_number="$( echo "${PULL_REQUEST}" | jq '.pr' )" if [[ -z "$repo_owner" ]]; then echo "error: unable to determine the repo owner from '$PULL_REQUEST'. Format must be: {"repo":"api-provider-config","owner":"acme-corp","pr":"11"}" exit 1 fi if [[ -z "$repo_name" ]]; then echo "error: unable to determine the repo name from '$PULL_REQUEST'. Format must be: https://github.com///pull/" exit 1 fi if [[ -z "$pr_number" ]]; then echo "error: unable to determine the pull-request number from '$PULL_REQUEST'. Format must be: https://github.com///pull/" exit 1 fi echo "name=repo_owner::$repo_owner" echo "name=repo_name::$repo_name" echo "name=pr_number::$pr_number" echo "::set-output name=repo_owner::$repo_owner" echo "::set-output name=repo_name::$repo_name" echo "::set-output name=pr_number::$pr_number" - name: Merge Pull Request id: merge_pr uses: "actions/github-script@v2" with: github-token: ${{ secrets.ACTIONS_WORKFLOW_PAT }} script: | const pullRequest = context.payload.pull_request const repository = context.repo await github.pulls.merge({ merge_method: "merge", owner: "${{ steps.repo_details.outputs.repo_owner }}", repo: "${{ steps.repo_details.outputs.repo_name }}", pull_number: ${{ steps.repo_details.outputs.pr_number }} }) result-encoding: string - name: Get Pull Request Details Post-Merge id: get_pr uses: "actions/github-script@v2" with: github-token: ${{ secrets.ACTIONS_WORKFLOW_PAT }} script: | const { data: pullRequest } = await github.pulls.get({ owner: "${{ steps.repo_details.outputs.repo_owner }}", repo: "${{ steps.repo_details.outputs.repo_name }}", pull_number: ${{ steps.repo_details.outputs.pr_number }} }); console.log(JSON.stringify(pullRequest, undefined, 2)); return JSON.stringify(pullRequest, undefined, 2) result-encoding: string - name: Parse Pull Request Result Post-Merge id: parse_pr env: PR_BODY: '${{ steps.get_pr.outputs.result }}' run: | MERGE_COMMIT_SHA="$( echo "${PR_BODY}" | jq -rc '."merge_commit_sha" | select( . != null )' 2>/dev/null || echo '' )" if [[ -n "$MERGE_COMMIT_SHA" ]]; then echo "name=merge-commit-sha::$MERGE_COMMIT_SHA" echo "::set-output name=merge-commit-sha::$MERGE_COMMIT_SHA" else echo "error: unable to parse pull-request json body to get the merge commit sha. pull-request merge doesn't appear to have been successful." exit 1 fi ```
BetaHuhn commented 2 years ago

Finally had some time today to further debug this issue. Good news: I found a solution which appears to work!

In the end it was quite simple, I just needed to install @octokit/core as dev dependency for plugin-throttling.js to work properly...

The fix is live on the fix/rate-limit branch and can be used like this:

uses: BetaHuhn/repo-file-sync-action@fix/rate-limit

If the action hits the rate limit it will log a warning and try again after a few seconds.

Please let me know if it works and I will merge this into master next week.

raeganbarker commented 2 years ago

This looks good on my side! I re-tested with a sync to 11 repos. The sync for the 10th repo initially ran into the GitHub API rate limit and then was handled successfully with the wait/retry logic.

raeganbarker commented 2 years ago

I noticed one thing. In the Job Summary Annotations section, the annotation doesn't indicate the repo sync succeeded or include the URL to the PR for the retried repo sync. Hit secondary GitHub API rate limit, retrying after 5s image All the other annotations include the repo name + link to the PR created.

BetaHuhn commented 2 years ago

Thanks for testing it @raeganbarker!

I will take a look at the annotations issue you mentioned and release the rate limit fix to master tomorrow πŸ‘

BetaHuhn commented 2 years ago

I just verified your issue and there seems to be a limit to 10 notices per action run being shown in the GitHub UI, that's why some of the syncs didn't show up for you @raeganbarker. Not much we can do about that.

I changed the rate limit warning to a debug log as I don't think it needs to create a notice each time the rate limit is hit and the action retries. This will happen quite often and doesn't need to be shown as a warning as it is the intended behavior.

BetaHuhn commented 2 years ago

This issue has been resolved in version 1.15.2 :tada:

To use the latest version, use the Action with the v1 or latest tag:

uses: BetaHuhn/repo-file-sync-action@v1

Sorry this took so long to fix! If you depend on this action or use it in a commercial project, maybe consider sponsoring me on GitHub or supporting me via platforms like Ko-Fi. This keeps me motivated and allows me to spend more time on projects like this, which ultimately benefits you as a user of this action. Thanks!