docker / hub-feedback

Feedback and bug reports for the Docker Hub
https://hub.docker.com

docker pull by sha256 leads to "500 Internal Server Error" #2043

Open aakhundov opened 3 years ago

aakhundov commented 3 years ago

Problem description

Pulling certain (not all) images from the Docker Hub by their sha256 leads to the "500 Internal Server Error" HTTP status:

$ docker pull tensorflow/tensorflow@sha256:20a4b7fa03e7c7432d4a0515c89590edf98587ad20c8bbe8216a42a19fca013a
Error response from daemon: received unexpected HTTP status: 500 Internal Server Error

The above sha256 hash is the one shown on the Docker Hub page of the tensorflow/tensorflow:1.13.1-gpu-py3 image. The same happens with the hash shown on the page of the tensorflow/tensorflow:1.13.1-gpu-py3-jupyter image:

$ docker pull tensorflow/tensorflow@sha256:26b85a1c88925a9c4858306b8a001bff52f085201bcb35b15bface23ed4449c3
Error response from daemon: received unexpected HTTP status: 500 Internal Server Error

At the same time, pulling other images by their sha256 hash works just fine, e.g. tensorflow/tensorflow:1.15.3-gpu-py3:

$ docker pull tensorflow/tensorflow@sha256:0a724af02977a2235dcff293241b2ce06ea159a81959d6b4cd3dd206827cf83d
sha256:0a724af02977a2235dcff293241b2ce06ea159a81959d6b4cd3dd206827cf83d: Pulling from tensorflow/tensorflow
f08d8e2a3ba1: Already exists
3baa9cb2483b: Already exists
94e5ff4c0b15: Already exists
1860925334f9: Already exists
05cc64cc481f: Already exists
b11f037be8e8: Already exists
24379c211bf5: Already exists
fafab3cf92ad: Pull complete
a6d8786ac6ef: Pull complete
aea0265b6ec5: Pull complete
0423ef434f4a: Pull complete
4b32d1c1f700: Pull complete
87ae2be39428: Pull complete
2cdded2749af: Pull complete
def5db6f3f6a: Pull complete
97e7a2abc382: Pull complete
Digest: sha256:0a724af02977a2235dcff293241b2ce06ea159a81959d6b4cd3dd206827cf83d
Status: Downloaded newer image for tensorflow/tensorflow@sha256:0a724af02977a2235dcff293241b2ce06ea159a81959d6b4cd3dd206827cf83d
docker.io/tensorflow/tensorflow@sha256:0a724af02977a2235dcff293241b2ce06ea159a81959d6b4cd3dd206827cf83d

As the behaviour described above is consistently demonstrated across different machines, systems, and docker daemon versions, this seems to be a Docker Hub issue. And because pulling images by their sha256 is important for reproducibility of containerised workloads, I'd kindly ask you to take a look at it.

wcedmisten commented 3 years ago

It looks like the issue is Docker Hub displaying the wrong digest.

$ docker pull tensorflow/tensorflow:1.13.1-gpu-py3
1.13.1-gpu-py3: Pulling from tensorflow/tensorflow
34667c7e4631: Pull complete 
d18d76a881a4: Pull complete 
119c7358fbfc: Pull complete 
2aaf13f3eff0: Pull complete 
643564d518c8: Pull complete 
1fea03e629a4: Pull complete 
45402f4cf61d: Pull complete 
45f7c407b07b: Pull complete 
00e5163fe3e0: Pull complete 
7d071071ef98: Pull complete 
5119bdada1e4: Pull complete 
64a9355aa772: Pull complete 
21f5ce47fe21: Pull complete 
5566cd4bac12: Pull complete 
58c608a4c711: Pull complete 
Digest: sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
Status: Downloaded newer image for tensorflow/tensorflow:1.13.1-gpu-py3
docker.io/tensorflow/tensorflow:1.13.1-gpu-py3

docker pull reports the digest sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5, whereas Docker Hub shows sha256:20a4b7fa03e7c7432d4a0515c89590edf98587ad20c8bbe8216a42a19fca013a.

I can successfully pull the image with the digest from docker pull:

$ docker pull tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5: Pulling from tensorflow/tensorflow
34667c7e4631: Pull complete 
d18d76a881a4: Pull complete 
119c7358fbfc: Pull complete 
2aaf13f3eff0: Pull complete 
643564d518c8: Pull complete 
1fea03e629a4: Pull complete 
45402f4cf61d: Pull complete 
45f7c407b07b: Pull complete 
00e5163fe3e0: Pull complete 
7d071071ef98: Pull complete 
5119bdada1e4: Pull complete 
64a9355aa772: Pull complete 
21f5ce47fe21: Pull complete 
5566cd4bac12: Pull complete 
58c608a4c711: Pull complete 
Digest: sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
Status: Downloaded newer image for tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
docker.io/tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5

May also be related to this issue: https://github.com/docker/hub-feedback/issues/1925, although there is only 1 platform for this image, so I'm not sure why it would have a different digest from the one given by docker pull.

wcedmisten commented 3 years ago

Just to provide another example for reproduction, I also get this issue with continuumio/miniconda3:4.6.14

$ docker pull continuumio/miniconda3@sha256:6b5cf97566c3b1d8bfd4ff1464fbdaaa9d9737c26d1b153eb3e88358d3826c48
Error response from daemon: received unexpected HTTP status: 500 Internal Server Error

Compare this with the docker hub page for continuumio/miniconda3:4.7.12, which I can pull using the digest provided on the Docker Hub page (sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159). This digest also matches the digest shown when pulling by tag:

$ docker pull continuumio/miniconda3:4.7.12
4.7.12: Pulling from continuumio/miniconda3
Digest: sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159
Status: Downloaded newer image for continuumio/miniconda3:4.7.12
docker.io/continuumio/miniconda3:4.7.12

Pulling by that digest, which matches both the Docker Hub page and the pull by tag, succeeds:

$ docker pull continuumio/miniconda3@sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159
sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159: Pulling from continuumio/miniconda3
Digest: sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159
Status: Downloaded newer image for continuumio/miniconda3@sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159
docker.io/continuumio/miniconda3@sha256:6c979670684d970f8ba934bf9b7bf42e77c30a22eb96af1f30a039b484719159

rulatir commented 3 years ago

Why is this not critical priority?

Woodz commented 3 years ago

There are two major usability issues here:

  1. The error message "received unexpected HTTP status: 500 Internal Server Error", returned when the digest does not exist, is very confusing
  2. The digests do not match, so it is impossible to pin a dependency on a specific image

TheDeepestSpace commented 3 years ago

Just to add to the discussion here, seems like docker manifest inspect can be used to look up the SHAs too, and it seems to provide the correct data.

Plugging in OP's image shows the same hash as the one mentioned by @wcedmisten:

$ docker manifest inspect tensorflow/tensorflow:1.13.1-gpu-py3 -v
{
        "Ref": "docker.io/tensorflow/tensorflow:1.13.1-gpu-py3",
        "Descriptor": {
                "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                "digest": "sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5", <<<<
                "size": 3466,
                "platform": {
                        "architecture": "amd64",
                        "os": "linux"
                }
        },
        "SchemaV2Manifest": {
                "schemaVersion": 2,
                "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                "config": {
                        "mediaType": "application/vnd.docker.container.image.v1+json",
                        "size": 12261,
                        "digest": "sha256:20a4b7fa03e7c7432d4a0515c89590edf98587ad20c8bbe8216a42a19fca013a" <<<<
                },
                "layers": [
                        ...
                ]
        }
}

I am not sure how to properly interpret the manifest, but I've successfully used the same approach for other images that I could not pull using Docker Hub's digest.
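
One possible reading of the output above, offered as a sketch rather than an authoritative description: the Descriptor.digest field (starting 0f949ccc) matches what docker pull reports and is the value that can be pulled, while SchemaV2Manifest.config.digest (starting 20a4b7fa) matches the value shown on the Docker Hub page. The Python helper below extracts both fields from the verbose JSON so they can be compared side by side. The function name is hypothetical, the sample is a trimmed copy of the output above without the "<<<<" markers, and the list handling for multi-platform images is an assumption about the shape of the -v output.

```python
import json


def manifest_and_config_digests(verbose_json):
    """Return (manifest digest, config digest) pairs from `docker manifest inspect -v` output.

    Hypothetical helper. Assumes each entry carries the "Descriptor" and
    "SchemaV2Manifest" keys shown in the output above.
    """
    doc = json.loads(verbose_json)
    # Assumption: a multi-platform image yields a JSON list,
    # a single-platform image yields a single object.
    entries = doc if isinstance(doc, list) else [doc]
    return [
        (entry["Descriptor"]["digest"], entry["SchemaV2Manifest"]["config"]["digest"])
        for entry in entries
    ]


# Trimmed copy of the output above (layers and sizes omitted).
sample = """
{
  "Ref": "docker.io/tensorflow/tensorflow:1.13.1-gpu-py3",
  "Descriptor": {
    "digest": "sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5"
  },
  "SchemaV2Manifest": {
    "config": {
      "digest": "sha256:20a4b7fa03e7c7432d4a0515c89590edf98587ad20c8bbe8216a42a19fca013a"
    }
  }
}
"""

for manifest_digest, config_digest in manifest_and_config_digests(sample):
    print("manifest digest (usable with docker pull):", manifest_digest)
    print("config digest (shown on Docker Hub here): ", config_digest)
```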

rsun-thoughtworks commented 2 years ago

Why is this bug still here?

nathan-tsien commented 2 years ago

Why is this bug still here? digest: sha256:e4105db9d4690c236b378feec3c07e3dbcc9efbd7e4e51d0a5df9a3b01b9e372

angerson commented 2 years ago

I'm on the TensorFlow team. Something that looks related has happened with our other ecosystem containers. I'm running Docker version 20.10.2, build 2291f61. Here's one specific instance that happened in the past:

Pulling the container displays a digest that differs from what the daemon reports. I can pull by the digest shown during the pull:

$ docker pull tensorflow/build:latest-python3.9
...
3a9169dd7cd4: Pull complete
Digest: sha256:2050b4cd265fcd1f7f91645fe943ccf543eed3b221bed094d9690e4b0c535410
Status: Downloaded newer image for tensorflow/build:latest-python3.9

$ docker pull tensorflow/build:latest-python3.9@sha256:2050b4cd265fcd1f7f91645fe943ccf543eed3b221bed094d9690e4b0c535410
docker.io/tensorflow/build@sha256:2050b4cd265fcd1f7f91645fe943ccf543eed3b221bed094d9690e4b0c535410: Pulling from tensorflow/build

I can't pull the image that the daemon identifies, however.

$ docker images --no-trunc --quiet tensorflow/build:latest-python3.9
sha256:325ebc582eb5331bfa9dd786f4e93d955c59dbc7da8cdbcfffadad69c3a32119

$ docker pull tensorflow/build:latest-python3.9@sha256:325ebc582eb5331bfa9dd786f4e93d955c59dbc7da8cdbcfffadad69c3a32119
Error response from daemon: received unexpected HTTP status: 500 Internal Server Error

I dug into the issue more today after discovering this. Docker gives inconsistent digest definitions. After freshly pulling the latest version of the same container, it gave me one good checksum during the pull:

$ docker pull tensorflow/build:latest-python3.9
latest-python3.9: Pulling from tensorflow/build
Digest: sha256:c7b402951d74492af2d846655e7cf12581c855ca480de4515939ef4dba39eacd

The same digest is provided with a simple docker images query:

$ docker images --digests
REPOSITORY         TAG                DIGEST                                                                    IMAGE ID       CREATED          SIZE
tensorflow/build   latest-python3.9   sha256:c7b402951d74492af2d846655e7cf12581c855ca480de4515939ef4dba39eacd   08097b650214   27 minutes ago   10.4GB

But if I request that tag specifically, the digest is missing:

$ docker images 'tensorflow/build:latest-python3.9' --digests
REPOSITORY         TAG                DIGEST    IMAGE ID       CREATED          SIZE
tensorflow/build   latest-python3.9   <none>    08097b650214   30 minutes ago   10.4GB

$ docker images 'tensorflow/build:latest-python3.9' --format '{{.Digest}}'
<none>

If I request the digest through --no-trunc --quiet then the checksum is totally wrong. Pulling it gives a 500 Server Error.

$ docker images --no-trunc --quiet tensorflow/build:latest-python3.9
sha256:08097b650214c73dce510e1bbedc7def83976a8c2d0a61d4429a1e46f7903eec

thaJeztah commented 2 years ago

I still need to read the whole thread, but here are some quick replies to the last comment:

> I can't pull the image that the daemon identifies, however.

The --quiet option will print the image's ID. This is also a digest, but for the local image's content, and is only available locally (not known by the registry, so cannot be used to pull the image). The digest that's shown when pulling/pushing an image is calculated over the image manifest, and differs from that ID.
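
To make the distinction concrete, here is a small Python sketch that separates the two values as they appear in docker image inspect output: "Id" is the local image ID, while the entries under "RepoDigests" carry the manifest digest that the registry accepts. The helper name is hypothetical, and the sample JSON is a hand-trimmed illustration built from the values in angerson's comment above, not real command output.

```python
import json


def local_id_and_pullable_refs(inspect_json):
    """Split `docker image inspect` output into the local ID and registry references.

    Hypothetical helper; `docker image inspect` prints a JSON array of images,
    and only the first entry is examined here.
    """
    image = json.loads(inspect_json)[0]
    # "Id" identifies the image in the local store; it is not a valid pull target.
    # "RepoDigests" holds name@digest references recorded when pulling or pushing.
    return image["Id"], image.get("RepoDigests") or []


# Illustrative input assembled from the values quoted earlier in the thread.
sample = json.dumps([{
    "Id": "sha256:08097b650214c73dce510e1bbedc7def83976a8c2d0a61d4429a1e46f7903eec",
    "RepoTags": ["tensorflow/build:latest-python3.9"],
    "RepoDigests": [
        "tensorflow/build@sha256:c7b402951d74492af2d846655e7cf12581c855ca480de4515939ef4dba39eacd"
    ],
}])

image_id, pullable = local_id_and_pullable_refs(sample)
print("local image ID (do not pull by this):", image_id)
print("references usable with docker pull:  ", pullable)
```

With a real image, the input would be the output of docker image inspect tensorflow/build:latest-python3.9.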

> But if I request that tag specifically, the digest is missing:

I think that may be a bug in the presentation on the CLI (I need to dig up the related tickets), which can happen if multiple image tags point to the same image. In that case only a single image exists, but multiple tags refer to that image; when running docker images / docker image ls, the list of image tags is "expanded" to show as individual images in the output. However, there's a bug/race condition when filtering the results, which could sometimes lead to the image being shown without the corresponding digest.

thaJeztah commented 2 years ago

Related tickets: https://github.com/moby/moby/issues/40636 (and https://github.com/docker/roadmap/issues/262)

angerson commented 2 years ago

Thanks for clarifying --quiet, @thaJeztah. I added a comment to this Stack Exchange answer where I got confused at first, which I hope helps others.

thaJeztah commented 2 years ago

You're welcome! It's definitely confusing with all those digests (all looking the same 😅). We were discussing this internally, and I think the Docker Hub team will try to put https://github.com/moby/moby/issues/40636 / https://github.com/docker/roadmap/issues/262 on an upcoming sprint, but I'd definitely recommend upvoting the roadmap ticket (as it may help with prioritising).

For completeness:

> This is also a digest, but for the local image's content, and is only available locally (not known by the registry, so cannot be used to pull the image)

Looks like I was partially wrong there (as in: the digest is included in the image's config section), but that digest cannot be used to pull the image 😅.
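
The relationship between the two values can be sketched in a few lines of Python. This is a toy model, not the registry's implementation: the JSON documents are made up and far smaller than real ones, content_digest is a hypothetical name, and the model assumes the classic image store in which the image ID is the hash of the image config. It only illustrates that the manifest embeds the config's hash, and that the digest accepted by docker pull is a hash of the manifest bytes, so the two values differ.

```python
import hashlib
import json


def content_digest(raw):
    """Format the SHA-256 of some bytes the way Docker prints digests."""
    return "sha256:" + hashlib.sha256(raw).hexdigest()


# Toy image config; a real one also lists history, rootfs layers, env, etc.
config_bytes = json.dumps({"architecture": "amd64", "os": "linux"}, sort_keys=True).encode()
image_id = content_digest(config_bytes)

# Toy manifest; it refers to the config by digest, as in the
# "config" section of the `docker manifest inspect -v` output.
manifest = {
    "schemaVersion": 2,
    "config": {"digest": image_id, "size": len(config_bytes)},
    "layers": [],
}
manifest_bytes = json.dumps(manifest, sort_keys=True).encode()
manifest_digest = content_digest(manifest_bytes)

print("image ID (hash of config):          ", image_id)
print("manifest digest (hash of manifest): ", manifest_digest)
print("same value?", image_id == manifest_digest)
```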

kupietools commented 2 years ago

Is there a fix or step-by-step instructions for those of us who are not Docker or CLI experts, but are merely trying to install software that is only available through Docker and are getting blocked by this error?

thaJeztah commented 2 years ago

If you must pull by digest, you can use docker manifest inspect or, if you have buildx installed, docker buildx imagetools inspect to get the digest of the manifest. For example, for the tensorflow/tensorflow:1.13.1-gpu-py3 image mentioned in the original description:

With docker manifest inspect and the -v / --verbose option (I'm using jq to print only the digest here):

$ docker manifest inspect -v tensorflow/tensorflow:1.13.1-gpu-py3 | jq .Descriptor.digest
"sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5"

With docker buildx imagetools inspect:

$ docker buildx imagetools inspect tensorflow/tensorflow:1.13.1-gpu-py3

Name:      docker.io/tensorflow/tensorflow:1.13.1-gpu-py3
MediaType: application/vnd.docker.distribution.manifest.v2+json
Digest:    sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5

$ docker pull tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
docker.io/tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5: Pulling from tensorflow/tensorflow
34667c7e4631: Pull complete
d18d76a881a4: Pull complete
119c7358fbfc: Pull complete
2aaf13f3eff0: Pull complete
....
Digest: sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
Status: Downloaded newer image for tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
docker.io/tensorflow/tensorflow@sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5
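
The steps above can be strung together in a short Python sketch for anyone who would rather not copy digests by hand. It shells out to the same docker manifest inspect -v and docker pull commands used above. The function names are hypothetical, and the error raised for list output is an assumption about how multi-platform images appear in the verbose JSON.

```python
import json
import subprocess


def strip_tag(reference):
    """Drop a trailing :tag while leaving a registry port (host:5000/name) alone."""
    prefix, slash, last = reference.rpartition("/")
    return prefix + slash + last.split(":", 1)[0]


def pinned_reference(reference, verbose_manifest_json):
    """Build name@digest from the output of `docker manifest inspect -v`."""
    doc = json.loads(verbose_manifest_json)
    if isinstance(doc, list):
        # Assumption: list output means several platforms; choosing one is left to the user.
        raise ValueError("multi-platform image: pick one entry's Descriptor.digest manually")
    return strip_tag(reference) + "@" + doc["Descriptor"]["digest"]


def pull_pinned(reference):
    """Resolve a tag to its manifest digest, then pull by that digest."""
    inspected = subprocess.run(
        ["docker", "manifest", "inspect", "-v", reference],
        check=True, capture_output=True, text=True,
    )
    pinned = pinned_reference(reference, inspected.stdout)
    subprocess.run(["docker", "pull", pinned], check=True)
    return pinned


# Offline demonstration with the digest from the output above; no docker call is made here.
sample = json.dumps({
    "Descriptor": {
        "digest": "sha256:0f949ccc690d9c50e9b46b16d9030c2c6845af621c2e7e82c4bf59803edc69b5"
    }
})
print(pinned_reference("tensorflow/tensorflow:1.13.1-gpu-py3", sample))
```

On a machine with Docker available, calling pull_pinned("tensorflow/tensorflow:1.13.1-gpu-py3") would run both commands and return the pinned reference, which can then be recorded in place of the tag.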