Closed: gesellix closed this issue 3 years ago
@gesellix do you have a mirror configured? Or is this a response from Docker Hub?
I suppose that the 404 comes from the local Docker Engine and no lookup to any registry is made. In this case I don't use any private registry; the engine connects directly to Docker Hub. Manually pulling testcontainers/ryuk:0.3.0 fixes the issue.
As far as I know the docker cli had some logic like this for docker run (pseudocode):
try {
    create_container(image)
} catch (e) {
    if (e.status == 404) {
        pull_image(image)
        create_container(image)
    }
}
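That pull-on-404 pattern can be sketched in plain Java. Note this is only an illustration of the flow described above: DockerApi, createContainer, pullImage, and ApiException are hypothetical stand-ins, not the real docker-java API.

```java
// Hypothetical client interface, NOT the real docker-java API.
interface DockerApi {
    void createContainer(String image);
    void pullImage(String image);
}

// Hypothetical exception carrying the HTTP status from the daemon.
class ApiException extends RuntimeException {
    final int status;
    ApiException(int status) { this.status = status; }
}

class ContainerRunner {
    static void run(DockerApi api, String image) {
        try {
            api.createContainer(image);
        } catch (ApiException e) {
            if (e.status == 404) {          // image not present locally
                api.pullImage(image);       // fetch it from the registry
                api.createContainer(image); // then retry the create once
            } else {
                throw e;                    // anything else is a real error
            }
        }
    }
}
```

The point of the pattern is that the client never pulls up front; it only pulls when the daemon reports the image as missing.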
I don't understand, though, why we now run into such an issue. The only thing I'm aware of is https://github.com/docker/cli/pull/1498, which might be related.
Did/does Testcontainers pull images, in this case ryuk, before trying to create containers?
I'm hitting this today too. In my case, for a Postgres container. I tried setting ryuk.testcontainer.image=testcontainersofficial/ryuk:0.3.0. It couldn't pull that image either. The images definitely exist on Docker Hub, though.
The other thing I suspected was maybe we're hitting the Docker Hub pull limits? I thought since this is communicating with the daemon it should use the auth configured on the host, but possibly I'm misunderstanding.
... so maybe the pull for ryuk is performed unauthenticated?
That's what I was worried about, yea. But I'm not sure if that's what's going on or not yet.
@gesellix we definitely pull the image if it is not available.
The other thing I suspected was maybe we're hitting the Docker Hub pull limits?
In that case, the error would differ (unless the Docker Hub team has decided that 404 is a perfect HTTP status code for a rate-limited response, which should be 429 instead)
When we got the error before, from running docker commands directly in a job, we got a message that specifically said we hit pull limits. But I don't know the HTTP status that the docker binary received in that case, so I wasn't sure if the message was possibly being hidden by TestContainers or not.
While our GitHub Actions still work (same Testcontainers version, but different Docker Engine/operating system), I guess this is mainly related to Docker for Mac. I can give it a try with an older Docker4Mac release tomorrow.
In my case, it's passing locally on Mac with the latest Docker for Mac stable (though I have those images in my local cache) and failing on GitLab.
404 and {"message":"No such image: testcontainers/ryuk:0.3.0"} is what we actually get from the API.
Also, testcontainers/* images are exempt from rate limiting, or at least that's what they told us :)
oh wait, I think I know what it is...
Well, that rules out that possibility then, at least. Maybe Docker Hub is having some problem? I just tried disabling ryuk and then it said 404 with No such image: alpine:3.5
For me Docker Hub seemed to be ok; docker pull *ryuk made it work for me... well... maybe the other images were already in the local cache 🤔
Good point. docker pull didn't break for me either locally.
Ok, the "filter by image name" query parameter in /images/json got removed, and now this condition fails:
https://github.com/testcontainers/testcontainers-java/blob/d135a2605401f6c663aab4e7edc6d6d76716f930/core/src/main/java/org/testcontainers/DockerClientFactory.java#L330
I just submitted #3575 with a fix, will be included in 1.15.1
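With the server-side "filter by image name" query parameter gone, a client has to fetch the unfiltered image list and filter locally. A minimal sketch of that client-side filtering idea in plain Java (this is an illustration only, not the actual patch in #3575 or the docker-java API):

```java
import java.util.List;
import java.util.stream.Collectors;

class ImageFilter {
    // allRepoTags: the repo tags of every local image, as an unfiltered
    // /images/json listing would return them (one tag list per image).
    // Instead of passing a "filter" query parameter to the daemon, keep
    // only exact matches for the wanted image name on the client side.
    static List<String> matching(List<List<String>> allRepoTags, String imageName) {
        return allRepoTags.stream()
                .flatMap(List::stream)
                .filter(imageName::equals)
                .collect(Collectors.toList());
    }
}
```

An empty result then means "image not present locally", which is the condition the linked code checks before pulling.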
I suppose I should mention too that I had tried with 1.15.0-rc2 and 1.15.0.
Ah, so a change in Docker Hub API?
I deleted the ryuk image locally and ran the tests again; oddly, they passed.
@keeganwitt Docker's API. Although the query param was deprecated (I wish we could run Docker in a strict API mode - will explore)
Sorry for this. We will release a hotfix ASAP. Meanwhile, consider pre-pulling testcontainers/ryuk:0.3.0 and alpine:3.5 :(
Yea, I'd thought of that, but I'm not sure it's possible with GitLab's Docker Executor. It should be possible to run it as a script with shell executor instead though I suppose. I'm still confused why it worked locally after deleting the ryuk image though... Maybe different Docker daemon versions?
Thanks @bsideup for the quick fix!
Are you planning to backport this to work also with junit 4?
@DaspawnW this is not junit specific and, once released, will work with any type of integration (junit4, junit jupiter, spock, manual container lifecycle)
@keeganwitt did you ever find a reasonable workaround for builds running in Gitlab? We have been looking at this for a day now without much success. It works if you manually pre-pull the images, but we are using docker-machine to autoscale the runners in EC2, so manual work is not really an option.
@bsideup I am seeing this also in 1.14.0.
@jdelucaa yes, this Docker API change applies to most of Testcontainers versions.
@keeganwitt did you ever find a reasonable workaround for builds running in Gitlab? We have been looking at this for a day now without much success. It works if you manually pre-pull the images, but we are using docker-machine to autoscale the runners in EC2, so manual work is not really an option.
Not really. I have exactly the same setup. For now, we just commented the tests out, since a fix is forthcoming. A few ideas came to mind, but I haven't really thought through them yet.
None of these seemed great. If one of them sounds promising, I can explain in a little more detail what I was thinking, though there may be gotchas I haven't thought of. Offhand, the user data script seems like the most promising to me.
We thought about option 1, but quickly discarded that idea. We also tried option 2 but apparently Docker isn’t installed at that point yet, so didn’t really proceed further with that. Neither 3 or 4 felt like a good idea, so I guess we’ll just skip the tests using testcontainers for now.
Thanks for sharing, though.
@arhohuttunen I'm now thinking this broke because we upgraded GL Runners this week, which upgraded Docker version. So downgrading should fix that. Unless others didn't upgrade and still ran into this? I could have sworn we had tests pass after the upgrade, but I'm not sure what else could have changed.
I upgraded Docker for Mac to 3.0 (which has Docker 20.10 in it) this morning, and the tests now fail locally too.
@keeganwitt I think on the runners we are using the docker stable tag, which should still point to 19.03.14 according to this: https://hub.docker.com/layers/docker/library/docker/stable/images/sha256-8f71deccd0856d8a36db659a8c82894be97546b47c1817de27d5ee7eea860162?context=explore
@arhohuttunen Sorry, I didn't mean the runners image, I meant the machines on which the runner image runs (where the daemon lives). We use https://github.com/npalm/terraform-aws-gitlab-runner, which would upgrade that. This applies to private runners, not the shared ones that GL manages. I dunno what schedule those are upgraded on, we don't use them.
I've been talking with another of our engineers and he said userdata isn't the same as ec2-userdata. I didn't realize there were 2.
We use a hard-coded AMI in our runners and haven't changed that lately, so that should not be the root cause in our case.
Actually, some builds passed after our upgrade too, so it shouldn't have been that. I'm confused why yesterday was the breaking day.
I'll take my previous comment back, because those runners install Docker from the official repo, which does serve 20.10.
I'm encountering this issue on Apache CI builds for the apache/james project (https://builds.apache.org/blue/organizations/jenkins/james%2FApacheJames/detail/PR-268/16/pipeline). I tried pulling the image explicitly before running the tests but it still fails. We are very much looking forward to the 1.15.1 hotfix.
Edit: I was misled by a comment above that referred to testcontainersofficial instead of testcontainers; I assume this was some kind of custom setup. The tests try to get testcontainers/ryuk:0.3.0 by default, not testcontainersofficial/ryuk:0.3.0.
@jeantil That's correct, testcontainers is the default, testcontainersofficial was just a misguided thing I attempted early on (overriding the image in testcontainers properties). I saw testcontainersofficial mentioned in an issue where they were discussing Docker Hub and Quay. Sorry for the confusion.
When can we expect the 1.15.1 release? (In order to know whether we need to find a workaround for this or can just wait for it)
@mderouet the release is expected for later today (tsss ;))
released in 1.15.1 🎉
It works, thanks
I got an update from Docker today, so I now have Docker Desktop 3.0.1 (50773). Now I get an error while running a test container.
...... <<< ERROR! org.testcontainers.containers.ContainerLaunchException: Container startup failed Caused by: org.testcontainers.containers.ContainerFetchException: Can't get Docker image: RemoteDockerImage(imageName=postgres:11.7, imagePullPolicy=DefaultPullPolicy()) Caused by: com.github.dockerjava.api.exception.NotFoundException: {"message":"No such image: testcontainersofficial/ryuk:0.3.0"}
I assumed you fixed it. I also disabled the Use gRPC FUSE for file sharing option, but it didn't help.
How can I fix this?
@alex-sky-cloud this is another issue, unrelated to the file sharing, and it is fixed in 1.15.1, please update.
I'm sorry. What should I update?
@alex-sky-cloud the project you're reporting to - Testcontainers :D
I have the latest version of docker.
Which project should I update?
I executed the command and only after that did the error disappear. So, is this how you will need to do it every time?
@alex-sky-cloud No. Just use the latest (1.15.1) version of Testcontainers.
This issue is still live for me when I tried using this on a Spring Boot/Postgres app. The workaround of pulling the Testcontainers Docker image separately works. I would suggest adding a note to the docs to explain that this is a prerequisite.
@davoutuk with Testcontainers 1.15.1? There was a bug that got fixed in 1.15.1; there is no such prerequisite.
Testcontainers 1.15.0 on Docker Engine 20.10/Docker for Mac 2.5.4 fails with the following stacktrace: