penumbra-zone / penumbra

Penumbra is a fully private proof-of-stake network and decentralized exchange for the Cosmos ecosystem.
https://penumbra.zone
Apache License 2.0

ci: docs publishing fails due to docker api limit #4924

Open conorsch opened 4 hours ago

conorsch commented 4 hours ago

The CI job for publishing docs changes is failing:

[image: docker-api-failure]

We've encountered this on other jobs before, when interacting with Docker Hub directly, and the solution was to make authenticated requests instead. Will modify the docs job to do the same, which should resolve the issue. Filing this ticket just so it's linkable if and when the problem occurs again.

conorsch commented 4 hours ago

> the solution was to make authenticated requests instead.

Not so simple this time: the failing pull is for the container image that the firebase-action helper uses. GitHub CI pulls all containers in a job before running the first step of that job, and that is where the rate limit is triggered. Therefore even if we add a docker login action, it won't run early enough to affect the preparatory image pulls.

Notably the official GHA runners automatically receive a docker login token with much higher rate limits. Since we use BuildJet runners, however, we don't enjoy the same increased rate limits.
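For anyone who wants to see which limit a given runner is actually subject to, here is a rough sketch that asks Docker Hub for the current pull quota. It assumes the anonymous `ratelimitpreview/test` check described in Docker's rate-limit documentation, plus the `ratelimit-limit`, `ratelimit-remaining`, and `docker-ratelimit-source` response headers. Those URLs and header names are my recollection of that documentation, not something taken from this job's logs, so verify them before relying on the output. The function names are made up for illustration.

```python
import json
import urllib.request

# Assumed endpoints, per Docker's documented anonymous rate-limit check.
TOKEN_URL = (
    "https://auth.docker.io/token"
    "?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull"
)
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"


def parse_ratelimit(value):
    """Parse a header value such as '100;w=21600' into (count, window_seconds).

    The window is None when the header carries no 'w=' parameter.
    """
    count, _, rest = value.partition(";")
    window = None
    for param in rest.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w" and val.isdigit():
            window = int(val)
    return int(count), window


def fetch_ratelimit(timeout=10.0):
    """Query Docker Hub anonymously and return the reported pull limits.

    Uses a HEAD request so that the check itself does not fetch a manifest body.
    """
    # Step 1: obtain an anonymous bearer token for the preview repository.
    with urllib.request.urlopen(TOKEN_URL, timeout=timeout) as resp:
        token = json.load(resp)["token"]

    # Step 2: HEAD the manifest and read the rate-limit headers, if present.
    req = urllib.request.Request(
        MANIFEST_URL,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        result = {}
        for name in ("ratelimit-limit", "ratelimit-remaining"):
            raw = resp.headers.get(name)
            result[name] = parse_ratelimit(raw) if raw else None
        # Assumed header naming the IP or account the limit is attributed to.
        result["source"] = resp.headers.get("docker-ratelimit-source")
        return result
```

Calling `fetch_ratelimit()` from a step on a BuildJet runner and again on a GitHub-hosted runner should show whether the two really get different quotas. It only reports the limit seen by steps, though, so it can't observe the preparatory image pulls directly.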

Intriguingly, there are reports that Docker Hub only recently (i.e. within the past few days) started enforcing IPv6 rate limits, which could explain the sudden change.

I've rerun the job in question and it passed fine. I suspect we'll see these failures periodically, but I'm not taking further action right now, since a lot of folks are already reporting it and debugging accordingly. Will keep an eye on the actions list and report results in here.