Closed BrightRan closed 3 years ago
Hello @BrightRan , thank you for raising it.
The new policy will be applied starting from November 1st, 2020. It is the reason why we can't reproduce it right now.
Docker will gradually introduce these rate limits, with full effects starting from November 1st, 2020.
GitHub doesn't have a default account for logging in to Docker. I am not sure that would even be possible, for security reasons.
I have found the following Docker use cases in GitHub Actions (I could have missed something):
1.
jobs:
  my_first_job:
    steps:
      - run: docker pull alpine:3.11
This use case allows customers to log in themselves:
steps:
  - run: |
      docker login --username ${{ secrets.docker_username }} --password ${{ secrets.docker_password }}
      docker pull alpine:3.11
2.
jobs:
  container:
    runs-on: ubuntu-latest
    container: node:10.16-jessie
    steps:
      - run: |
          echo This job does specify a container.
          echo It runs in the container instead of the VM.
        name: Run in container
It looks like this container is pulled at the very start of the pipeline, so there is no way to log in beforehand.
3.
jobs:
  my_first_job:
    runs-on: ubuntu-latest
    steps:
      - uses: docker://alpine:3.11
This use case will be affected too, because even if we invoke
steps:
  - run: docker login
  - uses: docker://alpine:3.11
the container pull happens at the start of the build, before docker login runs.
4.
jobs:
  my_first_job:
    runs-on: ubuntu-latest
    steps:
      - uses: my_custom_action_that_uses_docker_inside@v1
In the first use case, customers can overcome the limit by logging in to Docker as the first step of the pipeline: docker login --username ${{ secrets.docker_username }} --password ${{ secrets.docker_password }}.
But it looks like use cases 2, 3 and 4 will be broken.
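For use case 1, the login step could be sketched as below. This is only an illustration: the step names are made up, the secret names are the ones used earlier in this comment, and reading the password from stdin (the `--password-stdin` flag of `docker login`) is a substitution for the `--password` form shown above, so that the secret does not appear on the command line.

```yaml
jobs:
  my_first_job:
    runs-on: ubuntu-latest
    steps:
      - name: Log in to Docker Hub
        env:
          # Secret names are assumptions; use whatever your repository defines.
          DOCKER_USERNAME: ${{ secrets.docker_username }}
          DOCKER_PASSWORD: ${{ secrets.docker_password }}
        # Pipe the password in via stdin instead of passing it as an argument.
        run: echo "$DOCKER_PASSWORD" | docker login --username "$DOCKER_USERNAME" --password-stdin
      - name: Pull as an authenticated user
        run: docker pull alpine:3.11
```

Because the login is an ordinary step, it only helps pulls that happen in later steps of the same job.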
Docker Hub will rate-limit by IP address, so our customers could be affected. Docker will gradually impose download rate limits with an eventual limit of 100 pulls per six hours for anonymous users.
As far as I can see, Docker Hub even provides a separate documentation section about rate limits in GitHub Actions. But in practice, they are asking action owners to build docker login into their actions, which will fix use case 4.
I don't have much experience with Docker, so I could be wrong. Maybe the actions/runner team can help us understand whether there is a way to log in for use cases 2, 3 and 4. cc: @hross , @ericsciple , @TingluoHuang
cc: @thejoebourneidentity @alepauly @bryanmacfarlane
@maxim-lobanov ,
Thanks for your reply.
For use cases 3 and 4, can we add a step to log in to Docker before the 'uses' step, similar to the use case 1 you mentioned?
@BrightRan , for 4, yes. For 2 and 3 it is not possible, because it looks like all images are pulled before any step runs. This YAML produces the following steps:
steps:
  - run: docker login
  - uses: docker://alpine:3.11
@maxim-lobanov , Yes, you're right. I have just tried it, and the results confirm your statement. Thanks for your detailed explanation.
The great thing about this thread is...
We are actually actively working on providing more native login capabilities in actions. See this ADR. This should make it easier for customers to set up authentication for scenarios 2 and 3.
@dakale is adjusting the syntax with @chrispat (today, actually). We should have this ready to go by the end of the month at the latest, which should mean better support before the docker limits are imposed.
Hey @dakale can you add each of these examples to your ADR (1-4) and explain how they can be addressed with the new syntax?
@hross , happy to hear that we are working on a feature that will fix the issue. I also think we will need good communication with our customers to migrate them to the new approach with auth. We should avoid the case where all Docker-related builds start to fail one day and customers have to fix them in a rush to unblock their builds.
I agree with the scenarios as described by @maxim-lobanov , I think it accurately reflects all the ways we support containers. This only affects 2 and 3. Those are where we pre-pull containers at the start of the job, thus giving no place to run commands to log in. 1 and 4 will be fine, as users can run docker login in a step if needed.
For 2 and 3, we don't currently support pulling with authentication. That previously only meant you couldn't use "private" images in those scenarios, but now it means those scenarios may be impacted by this rate limiting issue. A single solution that allows the pull to be authenticated solves both.
I should note that (due to my mistake), the link to the ADR @hross posted is out of date. We have iterated on it since, and the change is significant relative to this issue. Check the PR for the most up to date document: https://github.com/github/c2c-actions-runtime/pull/813/files
That ADR, as currently designed, only solves 2, which is currently impossible to work around (except on self-hosted). It does not address 3 at all, nor does it add anything for 1 or 4, but those have viable workarounds. That's because it's designed to solve a different problem that would have accidentally helped the rate limiting problem too.
I have no way to gauge what the impact of the rate limiting will be, since I don't know the range of IPs used for the hosted runner VMs or how much of the available rate they might consume, but 100 per 6 hours is very, very low. It's hard to make a judgement on whether it's worth revising the ADR again to account for rate limiting, but I do think it's worth considering given how low the limit is.
We're also thinking about this in Azure Pipelines. We already have a technical solution for scenarios 2 and 3 (container resources can use a service connection for authentication). Most of our current usage is anonymous, though, so we'll have to help customers understand the impact and adapt.
Judging from the ADRs being private, will GitHub users not get to see the syntax until it's out (and can't have breaking changes)? Since the rate limit would almost definitely affect users, would it be worth adding a deprecation notice once a fix is available, eventually forcing all users to pull while logged in? It could help solve problems before they arise if people new to GitHub Actions start using it and expect it to work without knowing about the limit, or that public CIs don't get some kind of whitelist.
@nihaals the syntax is very simple, as we are simply adding a credentials element to the job.container and job.services keys. We are also trying to understand from Docker if the library images are impacted by these changes. Based on our telemetry we believe that will have the biggest impact on how many users are immediately impacted.
I made what I think is a related feature request that may make it nicer to log in with Actions, and it sounds like it will work with the new syntax.
What do you think about the deprecation notice on unauthorised Docker Hub images? Probably blocked by finding out if official images are exempt and the new syntax
As I understand it, one scary and unresolved problem with (1) is going to be sharing Hub credentials with forked repos. Any solution that involves a docker login execution will not work, because (a) secrets are not available to forked repos, and (b) we have no other safe way to provide credentials to a docker login invocation.
Does the solution that's being worked on address this aspect?
At the moment we do not have a solution to that particular problem but we are investigating options.
After the latest GitHub update, you can specify credentials for container downloading: https://github.blog/changelog/2020-09-24-github-actions-private-registry-support-for-job-and-service-containers/
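As a rough sketch of what that looks like for scenario 2, the credentials element sits under the job container and under each service container. The job name, the service name, and the secret names (DOCKERHUB_USERNAME, DOCKERHUB_TOKEN) are placeholders and not taken from this thread; the changelog post and the workflow syntax docs remain the authoritative reference.

```yaml
jobs:
  container_job:
    runs-on: ubuntu-latest
    container:
      image: node:10.16-jessie
      # Used by the runner when it pre-pulls the image at the start of the job.
      credentials:
        username: ${{ secrets.DOCKERHUB_USERNAME }}  # placeholder secret name
        password: ${{ secrets.DOCKERHUB_TOKEN }}     # placeholder secret name
    services:
      db:
        image: mariadb:10.3
        # Service containers take their own credentials block.
        credentials:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
    steps:
      - run: echo "Job and service images were pulled with authentication."
```

As discussed earlier in the thread, this covers job and service containers only; it does not add anything for steps that use docker://... (scenario 3).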
@chrispat sorry to press on this, but is there any news on solutions to the forked repo problem I mentioned above?
@maxim-lobanov
After the latest GitHub update, you can specify credentials for container downloading
I read the blog post. The new feature, being able to authenticate with registries, looks like it will have the nice side effect of solving the rate limiting problem for job containers and service containers by allowing people to authenticate, which would stop them from being forced to share the rate limit pool with other GitHub Actions users.
However, do you know whether it will work for Docker container actions? For example, from the docs (https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions#example-using-a-docker-hub-action):
jobs:
  my_first_job:
    steps:
      - name: My first step
        uses: docker://alpine:3.8
In this example, the image used for the Docker container action is being pulled by the GitHub Actions runner from Docker Hub. Because credentials aren't specified for the step, I'm assuming this is being done without authenticating with Docker Hub.
The images used for Docker container actions must be public, which means people don't have to authenticate to use them, but if they choose not to authenticate as they use them, they'll be subject to the IP-based rate limiting, affected by other GitHub Actions users. Is this correct?
@chrispat sorry to ask again, but it's getting worryingly close to November and, as far as I can tell, there's still no safe way for open source projects to use authenticated access to Docker Hub.
Has there been any progress in finding a way for forks of repos to be able to inherit Docker Hub authentication from the original repo?
For publicly accessible containers we are working with docker hub to make sure you will not be impacted by the new rate limits. If you need private containers we still do not have a solution for that problem for forks of public repos.
Thank you @chrispat! As a maintainer of an open source project that uses public Docker images, that's really good news.
For publicly accessible containers we are working with docker hub to make sure you will not be impacted by the new rate limits.
@chrispat just to confirm, does this mean we will NOT need to add dockerhub auth to all of our jobs that pull public dockerhub images?
That is correct. It is important to note that this will only apply to the runners hosted by GitHub, if you are using self-hosted runners you will need to configure your own credentials.
Thank you for the confirmation! :)
That is correct. It is important to note that this will only apply to the runners hosted by GitHub, if you are using self-hosted runners you will need to configure your own credentials.
@chrispat Will this be also applicable to Azure Pipelines hosted agents?
@lizan for now this will apply for Azure Pipelines. However, given you have the ability to enter your own credentials for containers in Azure Pipelines I would encourage you to do that. One of the main reasons we feel this is important for GitHub Actions is to better support our open source customers, and a lot of the individual actions in the marketplace are themselves implemented as containers.
Azure Pipelines tasks by contrast have no formal support for being implemented as a container and while we know some open source projects do use it we are encouraging them to move to GitHub Actions.
Thanks for the clarification, that helps.
Azure Pipelines tasks by contrast have no formal support for being implemented as a container and while we know some open source projects do use it we are encouraging them to move to GitHub Actions.
Yeah I'm asking on behalf of the Envoy OSS project :) and we don't have a good story for moving to GHA, for various reasons.
Did this issue reach a conclusion? Where can I find the answer to the original question, "Did Dockerhub rate limit affect Github Action?"
@magthe , https://github.com/actions/virtual-environments/issues/1445#issuecomment-713861495
For publicly accessible containers we are working with docker hub to make sure you will not be impacted by the new rate limits. If you need private containers we still do not have a solution for that problem for forks of public repos. It is important to note that this will only apply to the runners hosted by GitHub, if you are using self-hosted runners you will need to configure your own credentials.
Yes, I read that, but the wording makes it sound like it's in progress.
Should I interpret the closing of this ticket as meaning there's a finalised agreement with Docker Hub that ensures GitHub Actions aren't impacted by the rate limits?
@magthe yes we resolved this through an agreement with Docker.
I saw the docker pull limit error today on a few GitHub Actions workflows that build an image from a Dockerfile.
@chrispat, is there any public announcement/documentation of this Docker agreement? it would be helpful to understand what is included/excluded (does this apply to all github repo actions/workflows except self-hosted runners? does it apply to public and private repos? etc) Thanks!!
Does this just work by nature of dockerhub allowing unlimited pulls from builds running on github actions (by hostname/ip?)
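One way to look at this from inside a runner is to ask Docker Hub which limit it applies to the runner's IP. The step below is a sketch based on my understanding of Docker's documented rate-limit check (an anonymous token for the ratelimitpreview/test repository, then a HEAD request on its manifest); it assumes curl and jq are available on the runner, and the job and step names are made up.

```yaml
jobs:
  check_rate_limit:
    runs-on: ubuntu-latest
    steps:
      - name: Show the Docker Hub pull rate limit seen by this runner
        run: |
          # Request an anonymous token scoped to Docker's rate-limit test repository.
          TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
          # Send a HEAD request for the manifest and print only the ratelimit-* headers.
          curl -s --head -H "Authorization: Bearer $TOKEN" \
            https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest \
            | grep -i '^ratelimit' || echo "No ratelimit headers returned"
```

If no ratelimit headers come back, that would be consistent with the runner's address not being rate limited, though it does not say anything about how the exemption is implemented.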
This seemed to work for me, but I now have some GitHub workflows that use docker-compose to start some services, and they fail with the message:
Pulling db (mariadb:10.3)...
10.3: Pulling from library/mariadb
error pulling image configuration: errors:
unauthorized: authentication required
unauthorized: authentication required
Is this a temporary situation or is the agreement between GitHub and Docker Hub permanently amended?
It looks like the situation has already resolved itself again, my builds work again when I re-run the jobs. Confirmed on multiple repositories.
We are actually actively working on providing more native login capabilities in actions. See this ADR. This should make it easier for customers to set up authentication for scenarios 2 and 3.
@dakale is adjusting the syntax with @chrispat (today, actually). We should have this ready to go by the end of the month at the latest, which should mean better support before the docker limits are imposed.
@hross Is it still on the roadmap? I can't access your link. The agreement announced by @chrispat is great for workflows running on GitHub-hosted runners, but what about self-hosted runners? Is there at least a documented syntax to allow users to log in?
@XavierChapron , see https://github.blog/changelog/2020-09-24-github-actions-private-registry-support-for-job-and-service-containers/ it should help
@maxim-lobanov Thanks a lot! It was exactly what I was needing.
@chrispat, is there any public announcement/documentation of this Docker agreement? it would be helpful to understand what is included/excluded (does this apply to all github repo actions/workflows except self-hosted runners? does it apply to public and private repos? etc) Thanks!!
So we still don't know what the exact agreement terms are but for at least public repos, anonymous pulls from github don't seem to hit any rate limit anymore. I'm guessing some caching is also happening because why would it not.
But rate limits are not the end of the story. I found this FAQ https://forums.docker.com/t/automated-docker-pulls-from-github-hosted-runner-which-organization-must-pay/116095
Q: I want to run an automated agent that makes container requests on behalf of my organization. Which license do I need? A: Automated agents or service accounts that make container image requests of Docker Hub must be licensed under a Docker Team subscription.
This does not say anything about rate limits, it merely says "automated agent". Who is the "automated agent"? Github or the open-source project using github to docker pull anonymously?
I think the rate limit issue has landed again. Has the agreement between GitHub and Docker been terminated, or what has happened?
In my workflow (or actually in my own composite action, but that does not really affect the functionality) I am using a fork of a Docker(file)-based action (https://github.com/simple-elf/allure-report-action) which uses openjdk:8-jre-alpine as the base image. This week I started getting this error when the Docker image of the allure-report-action is built:
Step 1/10 : FROM openjdk:8-jre-alpine
toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
The problem would be easy to solve if I could run docker login before the docker build happens, but since the build happens before any steps in the workflow (or the composite action) are executed, that is not possible. Is there already a solution for this kind of case, or should the agreement that Docker Hub does not limit anonymous pulls from GitHub runners (which I have been using) still be in force?
I could also use our company's self-hosted runners, but that would not remove the issue, because there are so many pulls to Docker Hub from our company's IP that the limits are reached there too (I tried that). I could even use our company's own Docker Hub registry cache (by adding the cache host to the action's Dockerfile, before the image name), but that would require me to log in to the cache registry before the Docker image build, and we are back to the issue of not being able to have any login step before the Docker-based action image build.
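One possible workaround, sketched here under several assumptions, is to stop referencing the Docker-based action with uses: and instead build and run its image in ordinary steps, which do execute after a login step. The sketch assumes the action's Dockerfile sits at the root of its repository, reuses the secret names from earlier in the thread, and uses a made-up local image tag; it has not been tested against allure-report-action.

```yaml
jobs:
  allure_report:
    runs-on: ubuntu-latest
    steps:
      # Fetch the action's source (including its Dockerfile) into a subdirectory.
      - uses: actions/checkout@v4
        with:
          repository: simple-elf/allure-report-action
          path: allure-report-action
      - name: Log in before any base image is pulled
        env:
          DOCKER_USERNAME: ${{ secrets.docker_username }}  # assumed secret name
          DOCKER_PASSWORD: ${{ secrets.docker_password }}  # assumed secret name
        run: echo "$DOCKER_PASSWORD" | docker login --username "$DOCKER_USERNAME" --password-stdin
      - name: Build the action image as an ordinary step
        # "allure-report-action:local" is a hypothetical tag for this sketch.
        run: docker build --tag allure-report-action:local allure-report-action
      - name: Run the image with the workspace mounted
        run: |
          docker run --rm \
            --volume "$GITHUB_WORKSPACE:/github/workspace" \
            --workdir /github/workspace \
            allure-report-action:local
```

The trade-off is that the runner no longer wires up the action for you: any inputs the action expects would have to be passed to docker run by hand, typically as INPUT_<NAME> environment variables. The same login step could point at a company registry cache instead of Docker Hub.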
Some projects like this one publish their images to https://ghcr.io too:
https://github.com/zephyrproject-rtos/docker-image/blob/1a0e3737905/README.md
Given that choice, we switched from docker.io to ghcr.io for that particular image. Interestingly, we didn't observe any speedup but we (naively?) hope this will avoid all rate limiting issues now and in the future.
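For a job container, the switch amounts to changing the image reference. A minimal sketch follows; the image path is hypothetical and stands in for whatever the project actually publishes on ghcr.io, and the commented credentials block is only an assumption about what a private package would need.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    container:
      # Hypothetical image path; replace with the project's real ghcr.io image.
      image: ghcr.io/example-org/example-image:latest
      # Assumed to be needed only for private packages, not for public images:
      # credentials:
      #   username: ${{ github.actor }}
      #   password: ${{ secrets.GITHUB_TOKEN }}
    steps:
      - run: echo "Running inside an image pulled from ghcr.io"
```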
@marc-hb Do you have some Docker(file)-based actions (https://docs.github.com/en/actions/creating-actions/creating-a-docker-container-action) that use base images from ghcr.io or some other private Docker registry? How are you able to log in to the registry before the action is built in the workflow, since none of the steps in the workflow run before the runner tries to pull the base image? I could use a base image from our company's own Docker registry if I could just log in there before the action is built and the base image is needed.
How are you able to login to the registry before the action is built
We don't login, that's why we switched to the ghcr.io mirror of that particular image. I have never set up a ghcr.io mirror myself but I assumed it's not rocket science if you're already publishing images somewhere and an alternative worth mentioning.
Ok, didn't know that it is possible to pull images from a ghcr.io mirror without any login on GitHub runners but now I know. The issue is just that the base image in the Dockerfile is an official openjdk image which I was not able to find being available in ghcr.io. And, even though it is of course always possible to set up an own user for container registry, it is even more challenging than rocket science to make that happen in such a big and stiff corporation I am working for.
@ssallmen-pro Facing the same issue with this package now (allure-report-action). Did you find a solution by any chance?
I'm also seeing this today. Did anything change?
@SvetaSR @chancez The issue persisted for a week or maximum two in October '22, but then it went away and I did not see it coming again even though the same docker image was downloaded multiple times a day. Currently I am working in another assignment and do not use GitHub Actions at the moment at all. Let's hope it will be just a temporary issue again.
I'm seeing this in larger runners of GitHub Actions. https://github.com/lablup/backend.ai/actions/runs/6846124540/job/18612305349?pr=1712
@chrispat are you still working on Actions?
Yesterday we hit some rate limits for public Docker images (node in particular) from our GitHub-hosted runners. We do not have particularly high-volume usage.
I couldn't find a mention of https://github.com/actions/runner-images/issues/1445#issuecomment-713861495 in the documentation, and GitHub Support also didn't seem to know about this agreement. Is this documented somewhere and I just missed it?
Sharing part of the response I got from GitHub Support:
[...] certain groups of runners are currently excluded from the rate limiting exception - most notably Arm64 and GPU runners.
This is obviously disappointing, but at least we can work around it for now.
@smcgivern Did they share any timeline? Is this a known issue they are working to fix?
We started using the ARM runners and ran into this problem.
Associated community ticket: https://github.community/t/did-dockerhub-rate-limit-affect-github-action/128158
The customer noticed that Docker Hub has recently updated its pricing, download rate limit, and retention policy. He is wondering whether the download rate limit affects GitHub Actions. Will steps like the ones below be affected?
According to the docs' description of the "Download rate limit", the limit seems to apply only to anonymous users, with an eventual limit of 100 pulls per six hours. Logged-in users will not be affected at this time.
I set up a workflow with the following step, which pulls an image 101 times in a loop.
It successfully downloaded the image 101 times without any warning or error messages about the download rate limit. Note: since the image may be cached, I used the "docker image rm" command to remove the image each time before downloading it again.
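The exact step is not reproduced here, but a test along those lines might look like the following sketch. The image name, job name, and step name are assumptions chosen for illustration.

```yaml
jobs:
  pull_loop:
    runs-on: ubuntu-latest
    steps:
      - name: Pull the same image 101 times
        run: |
          for i in $(seq 1 101); do
            echo "Pull attempt $i"
            # Anonymous pull; no docker login has been run in this job.
            docker pull alpine:3.11
            # Remove the local copy so the next pull is not served from the local cache.
            docker image rm alpine:3.11
          done
```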
Question: When we pull public images in a workflow without logging in with any of our own accounts, does GitHub have a default account that it uses to log in to Docker? If not, why did my example above not hit the download rate limit?