dotnet / arcade-services

Arcade Engineering Services
MIT License
60 stars 74 forks source link

Adding build to channel silently fails on GitHub API rate limiting #3424

Closed premun closed 3 weeks ago

premun commented 7 months ago

There was the following report from the MSBuild team:

Last week we faced the problem with publishing symbols for this pipeline: https://tfsprodwus2su6.visualstudio.com/A011b8bdf-6d56-4f87-be0d-0092136884d9/DevDiv/_build/results?buildId=9213592&view=logs&j=226748d0-f812-5437-d3f0-2dd291f5666e&t=bad11196-972e-5d03-45a8-9db526506073

Darc reported a message "Build '216289' is already on all target channel(s).", but it wan't true and we had to do that manually later.

They found out about the build not being in the channel like this:

image

dkurepa commented 7 months ago

Here's the reason why this happened:

dkurepa commented 7 months ago

@premun suggested we start using authenticated API calls here, since they have a higher rate limit of 5000 per hour, instead of 60, which we're getting now with our unauthenticated ones

mmitche commented 7 months ago

Yeah, we could do that. We would need to plumb through another token. The token that is used to build the dependency graph must not be used to check the commit. Another option would be to just have the task handle the rate limiting. The response contains how long to wait for the reset to happen.

I think that handling 429 gracefully may be better. The task is used in more places than just the arcade YAML templates, so plumbing through another token is non-trivial. I also like the safety net that avoiding any authentication has. An improperly scoped token can't accidentally expose a private repo's builds.

dkurepa commented 3 weeks ago

We made the code go red if this happens, and added instructions in this PR https://github.com/dotnet/arcade-services/pull/3428