kubernetes-retired / rktlet

[EOL] The rkt implementation of the Kubernetes Container Runtime Interface
Apache License 2.0

Dockerhub image name canonicalization #49

Closed euank closed 7 years ago

euank commented 8 years ago

Right now, dockerhub image names (e.g. busybox) are represented by two names ("busybox:latest" and "registry-1.docker.io/library/busybox:latest").

This causes redundant image pulls, and it makes the image service's pull and list results disagree with each other.

This problem doesn't exist with non-dockerhub images (e.g. "quay.io/coreos/alpine-sh:latest" has only one representation).

We should fix that here or in rkt.

cc @lucab @dgonyeo, if you have rkt/docker2aci context you want to link

euank commented 8 years ago

Relates to #39, another case where we mess up on having a canonical name.

lucab commented 8 years ago

I'm quite sure we have this kind of inconsistency internally, but I'm missing a bit of context here: where are you seeing the short version? Is rkt returning/storing it at some point, or is it a mismatch between k8s and rktlet?

s-urbaniak commented 8 years ago

I just verified this in rkt itself, indeed the image is downloaded twice:

$ rkt fetch --insecure-options=image docker://busybox:latest
Downloading sha256:56bec22e355 [===============================] 668 KB / 668 KB
sha512-3a90c39f3ebecf6bf468947c7e024ea2
$ rkt fetch --insecure-options=image docker://registry-1.docker.io/library/busybox:latest
Downloading sha256:56bec22e355 [===============================] 668 KB / 668 KB
sha512-3a90c39f3ebecf6bf468947c7e024ea2

lucab commented 8 years ago

Ah, I was checking with image listing, whose behavior is different:

$ rkt fetch --insecure-options=image docker://busybox:latest
image: remote fetching from URL "docker://busybox:latest"
Downloading sha256:56bec22e355 [===============================] 668 KB / 668 KB
sha512-73e6b9a5c5a4e876f59febc3994de8bd
$ rkt image list | grep busybox
sha512-73e6b9a5c5a4     registry-1.docker.io/library/busybox:latest             1.2MiB  8 seconds ago   8 seconds ago

euank commented 8 years ago

Kubernetes only uses the short version and expects to see it end to end. We don't process the output of rkt list in rktlet, so we don't show k8s the name it expects. That's the most important mismatch.

CRI pull sees the short name, CRI list sees the long one. The simplest fix is converting everything to the short name at the CRI boundary, but maybe there's a better fix for this if we add it a layer down in rkt.
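
For illustration, converting to the short name at the CRI boundary could look roughly like the sketch below. This is not rktlet's actual code: shortImageName and dockerHubPrefixes are hypothetical names, the "docker.io/library/" alias and the ":latest" defaulting are assumptions, and only the "registry-1.docker.io/library/" prefix comes from the output above.

```go
package main

import (
	"fmt"
	"strings"
)

// dockerHubPrefixes lists the long-form prefixes to strip.
// Only the first one appears in this thread; the second is an assumed alias.
var dockerHubPrefixes = []string{
	"registry-1.docker.io/library/",
	"docker.io/library/",
}

// shortImageName is a hypothetical helper that maps both representations
// of a Docker Hub library image to the short form Kubernetes uses.
// Names from other registries are returned without a prefix change.
func shortImageName(name string) string {
	// Drop the rkt fetch scheme if the caller passed it along.
	name = strings.TrimPrefix(name, "docker://")

	for _, prefix := range dockerHubPrefixes {
		if strings.HasPrefix(name, prefix) {
			name = strings.TrimPrefix(name, prefix)
			break
		}
	}

	// Assumption: an untagged, undigested name means ":latest".
	// Only the part after the last "/" is checked, so a registry
	// port such as "localhost:5000" is not mistaken for a tag.
	if !strings.Contains(name, "@") {
		lastSlash := strings.LastIndex(name, "/")
		if !strings.Contains(name[lastSlash+1:], ":") {
			name += ":latest"
		}
	}
	return name
}

func main() {
	names := []string{
		"busybox",
		"busybox:latest",
		"registry-1.docker.io/library/busybox:latest",
		"quay.io/coreos/alpine-sh:latest",
	}
	for _, n := range names {
		fmt.Printf("%s -> %s\n", n, shortImageName(n))
	}
}
```

Applying the same function to both the name passed to CRI pull and the names returned by CRI list would give k8s one consistent spelling, at the cost of rktlet carrying Docker Hub specific knowledge that might belong in rkt or docker2aci instead.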

euank commented 7 years ago

I'm picking this up because it's breaking me quite a bit right now.