testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License

Docker Hub pull rate limits #3099

Open rnorth opened 4 years ago

rnorth commented 4 years ago

N.B. This issue description will be updated as new information becomes available.

As of 2020-08-13, Docker have updated their terms of service and pricing page, indicating that:

The page on rate limits clarifies that this applies ~to layers being pulled, not~ images. ~Since most images comprise multiple layers, the effective image pull rate limit will be very low.~

The rate limit page is inconsistent about the number, but also states that these rate limits are being introduced gradually.

Based on a recent build it does not appear that these rate limits are being enforced yet. However, it seems as though they will be in effect by 2020-11-01, so we need to understand the implications for various groups.

Implications

For open source projects using Testcontainers within their builds (including Testcontainers itself)

For companies using Testcontainers within their builds

Actions for users

Actions for Testcontainers team

Open questions

rnorth commented 4 years ago

One idea: for fork builds on GitHub Actions, we could look at using GitHub Packages as a cache for required images. Images pushed to GitHub Packages should be readable on forks, so the main (automatable) work would be:

Both could be accomplished using the pluggable image substitution mechanism that we have in mind.

I'll add an action above for us to investigate.

vlsi commented 4 years ago

AFAIK, the GitHub registry requires an API token even for image pulls from public repositories, so I'm not sure images would be readable on forks.

rnorth commented 4 years ago

@vlsi you might be right, so I'd like to do some investigation before putting a lot of effort in. My understanding of the docs was that forks can access a GITHUB_API_TOKEN secret which is a different, read-only, token. It's quite possible that I'm wrong.

WtfJoke commented 4 years ago

@vlsi, @rnorth is correct: the GITHUB_TOKEN provided by GitHub Actions has read-only access on forked repos.

See https://docs.github.com/en/actions/configuring-and-managing-workflows/authenticating-with-the-github_token#permissions-for-the-github_token

fongie commented 4 years ago

Just FYI, this started happening for us today in our AWS CodeBuild pipeline: tests fail because the Docker Hub limit is exceeded, so some sort of rate limiting is already being enforced (not waiting until November).

rnorth commented 4 years ago

@fongie that's alarming and disappointing - my understanding was that this was not being applied yet.

Based on a test branch (#3098) I get the impression that GitHub Actions and CircleCI might not be getting rate limited yet.

The advice would remain the same - to use an authenticated account for pulling or to use a mirror/copy in another registry. I'm saddened that we've not had time to release our image substitution yet, though.

rnorth commented 4 years ago

Updated above with link to https://www.docker.com/blog/scaling-docker-to-serve-millions-more-developers-network-egress/ which clarifies:

From my perspective, two major topics remain unresolved:

@fongie I've reached out to a contact at Docker to see if we can get clarification on when the rate limiting is supposed to be taking effect.

fongie commented 4 years ago

Great! Sorry for not being more specific last post, I was busy trying to get our release out hehe

This is the error message we got:

Caused by: com.github.dockerjava.api.exception.DockerClientException: Could not pull image: error pulling image configuration: toomanyrequests: Too Many Requests. Please see https://docs.docker.com/docker-hub/download-rate-limit/

in AWS CodeBuild, trying to pull 'postgres:10.7'

It worked on the third attempt simply by retrying. I understand that there is the option to set up an authenticated account, I just don't have the time for that right now. I would guess this is happening to more people and you might get more questions about this issue, so it might be nice to prepare thorough instructions for how to do it.

If I log in to another Docker registry in my CodeBuild environment, will Testcontainers automatically use that to pull from when I do new PostgreSQLContainer("postgres:10.7"), or do I need to instruct Testcontainers on which registry to use?
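For reference, the retry workaround mentioned above could look roughly like this. It is a minimal sketch in plain Java, not how Testcontainers itself pulls images: PullRetrySketch, retryOnRateLimit and isRateLimited are hypothetical names, and the only detail taken from this thread is that the error messages contain the string toomanyrequests. Retrying does not raise the limit, so this is a stopgap at best.

```java
import java.util.function.Supplier;

/** Illustrative only: hypothetical helper, not part of Testcontainers or docker-java. */
final class PullRetrySketch {

    /** Assumption: a rate-limited pull surfaces an exception whose message contains "toomanyrequests". */
    static boolean isRateLimited(Throwable t) {
        String message = t.getMessage();
        return message != null && message.contains("toomanyrequests");
    }

    /**
     * Runs the action, retrying with a doubling delay while it fails with a
     * rate-limit error. Any other failure, or the last rate-limit failure,
     * is rethrown to the caller.
     */
    static <T> T retryOnRateLimit(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!isRateLimited(e) || attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", interrupted);
                }
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated pull: fails twice with a rate-limit error, then succeeds.
        String result = retryOnRateLimit(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("toomanyrequests: Too Many Requests.");
            }
            return "pulled postgres:10.7";
        }, 3, 100L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real build the supplier would wrap whatever starts the container; here it is a simulated pull so the sketch runs on its own.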

rnorth commented 4 years ago

@fongie sorry for the slow response. If you log in to Docker Hub then Testcontainers will be able to use that hub account for pulling.

If you log in to another docker registry (e.g. ECR) then that does not help directly, but it could be made to work. ECR does not function as a pull-through cache (AFAIK), so you'd have to copy the required image into ECR and use new PostgreSQLContainer("your.ecr.registry.amazonaws.com/somepath/postgres:10.7") to create the container. This is a bit of a pain, which is why I'm working on #3102 to at least help with the second aspect.

Logging in to Docker Hub requires fewer changes.

rnorth commented 4 years ago

For GitHub Actions users, this issue looks worth following: https://github.com/actions/virtual-environments/issues/1445

rnorth commented 3 years ago

Really good news for GitHub Actions users: https://github.com/actions/virtual-environments/issues/1445#issuecomment-713861495

aaronjwhiteside commented 3 years ago

Would it not make sense to configure a registry mirror the same way native docker does?

We have Nexus set up to mirror the official Docker Hub registry, and we have configured our Jenkins slaves to point towards this mirror.

# docker info
....
 Registry Mirrors:
  https://<internal_company_mirror_here>:5000/
...

Although it's nice that there will be a piece of callback code that we can hook into to modify the image name before it's fetched, would it not be easier to allow a registry mirror to be configured in ~/.testcontainers.properties that would act as the default host to fetch images from that do not explicitly specify a host in the image name?

This seems like what 99% of people would want, and it would stop everyone implementing the same code to do it.
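A minimal sketch of the rule proposed above (prefix only those images that do not name a registry host), in plain Java. This is not Testcontainers' or Docker's code: DefaultRegistrySketch and its methods are hypothetical, the mirror host is made up, and the check "first path component contains a dot or a colon, or is localhost" is the usual Docker naming convention, assumed here rather than taken from this thread. It also assumes the mirror serves images under the same paths as Docker Hub.

```java
/** Illustrative only: hypothetical helper, not Testcontainers' or Docker's implementation. */
final class DefaultRegistrySketch {

    /**
     * Returns true if the image name appears to start with a registry host,
     * i.e. its first path component contains '.' or ':' or equals "localhost".
     */
    static boolean hasExplicitRegistry(String image) {
        int slash = image.indexOf('/');
        if (slash < 0) {
            // No path separator at all, e.g. "postgres:10.7"; the colon is a tag separator.
            return false;
        }
        String first = image.substring(0, slash);
        return first.contains(".") || first.contains(":") || first.equals("localhost");
    }

    /** Prepends defaultRegistry to images without an explicit registry; leaves all others untouched. */
    static String withDefaultRegistry(String image, String defaultRegistry) {
        if (hasExplicitRegistry(image)) {
            return image;
        }
        String prefix = defaultRegistry.endsWith("/") ? defaultRegistry : defaultRegistry + "/";
        return prefix + image;
    }

    public static void main(String[] args) {
        String mirror = "registry.example.internal:5000"; // hypothetical mirror host
        System.out.println(withDefaultRegistry("postgres:10.7", mirror));
        System.out.println(withDefaultRegistry("testcontainers/ryuk", mirror));
        System.out.println(withDefaultRegistry("quay.io/some/image", mirror)); // already has a host, unchanged
    }
}
```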

bsideup commented 3 years ago

@aaronjwhiteside registry mirror is the easiest way to "fix" the rate limits issue, yes. There is no need to implement any code for it and it can be set in Docker's settings already today.

Also, there is #3413 that implements a default, prefixing substitutor, for those who don't have access to Docker's settings.

aaronjwhiteside commented 3 years ago

@bsideup Interesting, we have a registry mirror set but I still see rate limit errors while fetching images, I'll have to dig into our build system and try and figure out what is going on.

#3413 looks promising!

brianwyka commented 3 years ago

I'm seeing the same behavior as @aaronjwhiteside on our internal network with registry mirrors set up.

@rnorth, @bsideup we are prefixing our containers, such as kafka with our internal registry, which Testcontainers is picking up, however we still get the rate limit errors. Something else we need to do?

bsideup commented 3 years ago

@brianwyka see https://www.testcontainers.org/supported_docker_environment/image_registry_rate_limiting/

brianwyka commented 3 years ago

@bsideup, That circularly brought me back here 😆 . We are on testcontainers 1.14.x. Maybe we need to update to 1.15.0 ??

bsideup commented 3 years ago

@brianwyka oh, yes, definitely. 1.14.x is from pre-ratepocalypse era :D

aaronjwhiteside commented 3 years ago

@brianwyka @bsideup I think what is happening is that when Docker gets an error while pulling from the configured registry mirror, it falls back to going directly to Docker Hub. I checked our Nexus logs and found it was receiving the rate limit error too.

I'm not sure if this is documented behaviour (I haven't checked); though a little unexpected, it kinda makes sense.
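The suspected behaviour can be pictured with a toy model in plain Java. It only illustrates "try the mirror, then fall back to Docker Hub" as described above and is not Docker's actual implementation; MirrorFallbackSketch and the endpoint functions are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Toy model only: hypothetical names, not Docker's real pull logic. */
final class MirrorFallbackSketch {

    /**
     * Tries each endpoint in order and returns the first successful result.
     * If every endpoint fails, the failure from the last one is rethrown.
     */
    static String pull(String image, List<Function<String, String>> endpoints) {
        RuntimeException last = new IllegalStateException("no endpoints configured");
        for (Function<String, String> endpoint : endpoints) {
            try {
                return endpoint.apply(image);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next endpoint
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        // Simulated mirror that is itself rate limited by its upstream.
        Function<String, String> mirror = image -> {
            throw new IllegalStateException("toomanyrequests (from mirror)");
        };
        // Simulated direct pull from Docker Hub that happens to succeed.
        Function<String, String> dockerHub = image -> "pulled " + image + " from Docker Hub";

        System.out.println(pull("postgres:10.7", Arrays.asList(mirror, dockerHub)));
    }
}
```

In this model, if both the mirror and Docker Hub answer with toomanyrequests, the caller still sees the rate limit error, which would be consistent with the Nexus logs mentioned above.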

poznas commented 3 years ago

We solved the issue by forcing the docker-java library (which is used under the hood) to point to the right registry:

src/test/resources/docker-java.properties

registry.url=your.artifactory.domain/some-path/

rnorth commented 3 years ago

That's really interesting. I wonder if we could/should obtain the daemon's registry URL from the info endpoint and use that automatically. That may not be universally applicable though...

At the very least we should document this.

brianwyka commented 3 years ago

@poznas, we're using that docker-java.properties configuration in addition to testcontainers.properties. I ran into the rate limit problem with just the docker-java.properties configuration present...

poznas commented 3 years ago

@brianwyka, for one of our repos this solution also did not work. Tired of trying to track down the cause, I just threw testcontainers away 😁 I replaced it with a slightly more manual approach: gradle-docker-plugin

dargiri commented 3 years ago

@poznas IMHO moving away from Testcontainers to the Gradle Docker plugin or Maven Docker plugin is two steps back. It significantly degrades the development experience.

Fraserhardy commented 3 years ago

Are there any plans to publish the Testcontainers images to the ECR public gallery? See https://aws.amazon.com/about-aws/whats-new/2020/12/announcing-amazon-ecr-public-and-amazon-ecr-public-gallery/

This could be a good solution for many who have their CI on AWS environments as it provides free data transfer if you're on AWS.

bsideup commented 3 years ago

@Fraserhardy testcontainers/* is exempt from rate limits on Docker Hub already.

jvegarag commented 3 years ago

@bsideup now that the Testcontainers images are whitelisted in Docker Hub, can this issue be considered closed? Do you still recommend updating to 1.15.1 and using the "image name prefix" solution to point to an internal registry? Thanks

gubbaraviteja commented 3 years ago

@bsideup We are getting rate limit issues in AWS CodeBuild. If testcontainers/* images are exempt from rate limits, I don't understand why we are getting this error. Below are the logs.

We are using Testcontainers 1.15.1.

2021-01-19T09:57:18,596 546 [main] WARN o.t.u.TestcontainersConfiguration - Attempted to read Testcontainers configuration file at file:/root/.testcontainers.properties but the file was not found. Exception message: FileNotFoundException: /root/.testcontainers.properties (No such file or directory) 
2021-01-19T09:57:18,634 584 [main] INFO o.t.d.DockerMachineClientProviderStrategy - docker-machine executable was not found on PATH ([/root/.goenv/shims, /root/.goenv/bin, /go/bin, /root/.phpenv/shims, /root/.phpenv/bin, /root/.pyenv/shims, /root/.pyenv/bin, /root/.rbenv/shims, /usr/local/rbenv/bin, /usr/local/rbenv/shims, /root/.dotnet/, /root/.dotnet/tools/, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin, /opt/tools, /usr/local/android-sdk-linux/tools, /usr/local/android-sdk-linux/tools/bin, /usr/local/android-sdk-linux/platform-tools, /codebuild/user/bin]) 
2021-01-19T09:57:19,445 1395 [main] INFO o.t.d.DockerClientProviderStrategy - Found Docker environment with local Unix socket (unix:///var/run/docker.sock) 
2021-01-19T09:57:19,492 1442 [main] INFO o.t.utility.ImageNameSubstitutor - Image name substitution will be performed by: DefaultImageNameSubstitutor (composite of 'ConfigurationFileImageNameSubstitutor' and 'PrefixingImageNameSubstitutor') 
2021-01-19T09:57:20,429 2379 [docker-java-stream-1239536715] ERROR c.g.d.a.a.ResultCallbackTemplate - Error during callback 
com.github.dockerjava.api.exception.InternalServerErrorException: Status 500: {"message":"toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit"} 
at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:247) 
at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269) 
at java.base/java.lang.Thread.run(Thread.java:834)

Should we copy the Testcontainers images to our internal registry and use the "image name prefix" solution?

adambriny commented 3 years ago

@Fraserhardy testcontainers/* is exempt from rate limits on Docker Hub already.

@bsideup That's good news, but many internally used images are not from that repo, like alpine or localstack.

I think a single property in testcontainers.properties would be a handy way to set a registry for all the internal images listed in org.testcontainers.utility.TestcontainersConfiguration.

Because now I have to do this one-by-one in testcontainers.properties, like:

ambassador.container.image=my-registry.com/richnorth/ambassador
socat.container.image=my-registry.com/alpine/socat
vncrecorder.container.image=my-registry.com/testcontainers/vnc-recorder
...etc...

which is cumbersome and fragile for changes.

Something similar to the docker-java solution, but I don't want to directly depend on testcontainers' third-parties either.

we solved the issue by forcing docker-java (which is used under the hood) library to point to the right registry

src/test/resources/docker-java.properties

registry.url=your.artifactory.domain/some-path/

bsideup commented 3 years ago

@adambriny see https://www.testcontainers.org/features/image_name_substitution/#automatically-modifying-docker-hub-image-names

adambriny commented 3 years ago

Thank you @bsideup! It's my fault, I was looking for the rate limit keywords only.. 🤦‍♂️ Maybe it would be easier to find if it was mentioned on https://www.testcontainers.org/supported_docker_environment/image_registry_rate_limiting/ as a possible solution...

bsideup commented 3 years ago

@adambriny good idea!

P.S. contributions are more than welcome 😊

bademux commented 3 years ago

How about a KISS registryMirrors solution? https://github.com/GoogleContainerTools/jib/tree/master/jib-gradle-plugin#properties

perrin4869 commented 2 years ago

I just tried to use ECR as the Testcontainers prefix (TESTCONTAINERS_HUB_IMAGE_NAME_PREFIX=public.ecr.aws/), but unfortunately the images aren't publicly available there. I found some random public repositories in ECR, such as TESTCONTAINERS_HUB_IMAGE_NAME_PREFIX=public.ecr.aws/bigeye/, which host a mirror of the Testcontainers images, but since those aren't official, I can't use them at work... Is there any chance that Testcontainers could set up a public repo in ECR as well in the future?

rnorth commented 2 years ago

Hi @perrin4869 The testcontainers org is in Docker's open source program, so is supposed to be exempt from rate limits... Are you seeing rate limiting occurring?

rnorth commented 2 years ago

(Just to clarify, this means that images like testcontainers/ryuk should be exempt. Other images may still have rate limits)

PSanetra commented 2 years ago

@rnorth As far as I can see, the ryuk image is not part of the open source program. The open source program label is missing on the Docker Hub page.

Comparing:

https://hub.docker.com/r/testcontainers/ryuk is missing that label
https://hub.docker.com/r/fluent/fluent-bit has that label