Cannot pull from istag referencing image in another repository of internal registry

miminar commented 8 years ago

Reproducer:

$ docker tag docker.io/hello-world 172.30.104.152:5000/tmp/hello-world:latest
$ docker push 172.30.104.152:5000/tmp/hello-world:latest
The push refers to a repository [172.30.104.152:5000/tmp/hello-world] (len: 1)
0a6ba66e537a: Pushed 
b901d36b6f2f: Pushed 
latest: digest: sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1 size: 2748

$ oc tag -n tmp hello-world@sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1 mybase:latest
Tag mybase:latest set to hello-world@sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1.

*$ oc describe -n tmp istag
Name:           sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Created:        2 minutes ago
Labels:         <none>
Annotations:    openshift.io/image.managed=true
Docker Image:   172.30.104.152:5000/tmp/hello-world@sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Image Name:     sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Image Size:     633 B (first layer 601 B, last binary layer 32 B)
Image Created:  4 months ago
Author:         <none>
Arch:           amd64
Command:        /hello
Working Dir:    <none>
User:           <none>
Exposes Ports:  <none>
Docker Labels:  <none>

Name:           sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Created:        2 minutes ago
Labels:         <none>
Annotations:    openshift.io/image.managed=true
Docker Image:   172.30.104.152:5000/tmp/hello-world@sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Image Name:     sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Image Size:     633 B (first layer 601 B, last binary layer 32 B)
Image Created:  4 months ago
Author:         <none>
Arch:           amd64
Command:        /hello
Working Dir:    <none>
User:           <none>
Exposes Ports:  <none>
Docker Labels:  <none>

$ # note the first pull is successful because all the blobs are already downloaded in Docker daemon's storage
$ docker pull 172.30.104.152:5000/tmp/mybase:latest                                                                                                                                                                                           
Trying to pull repository 172.30.104.152:5000/tmp/mybase ... latest: Pulling from tmp/mybase
Digest: sha256:bb5c16ecd58633d4088140acb6ee386c5c3b09a8f03c517a288093415f0c09d1
Status: Downloaded newer image for 172.30.104.152:5000/tmp/mybase:latest

$ docker rmi -f 0a6ba66e537a
Untagged: 172.30.104.152:5000/tmp/hello-world:latest
Untagged: 172.30.104.152:5000/tmp/mybase:latest
Untagged: docker.io/hello-world:latest
Deleted: 0a6ba66e537a53a5ea94f7c6a99c534c6adb12e3ed09326d4bf3b38f7c3ba4e7
Deleted: b901d36b6f2fd759c362819c595301ca981622654e7ea8a1aac893fbd2740b4c
$ # the second pull needs to repull all the blobs and fails
$ docker pull 172.30.104.152:5000/tmp/mybase:latest
Trying to pull repository 172.30.104.152:5000/tmp/mybase ... latest: Pulling from tmp/mybase
b901d36b6f2f: Pulling fs layer 
0a6ba66e537a: Pulling fs layer 
not found
Error: image tmp/mybase:latest not found

Tested with Docker 1.8.2 and 1.9.1. Test to confirm this is in #7790.

0xmichalis commented 8 years ago

Do we want to somehow link the invalid istag location in the registry to the valid one?

miminar commented 8 years ago

Update: outdated! , see https://github.com/openshift/origin/issues/7792#issuecomment-199229554

@legionus can you please verify you can pull with older Docker version (like 1.6)? So we know it's not a regression in the registry but a way, how docker works nowadays. This used to work in past ages.

If this is indeed caused by a change in Docker's behaviour, then it's a hard one. The solution is to create layer links in the repository referenced by IST, ISI or docker image for all the layers referenced in image manifest.

We could either:

implement a controller in origin server that would re-push all the ImageStreamTags, ImageStreamImages and DockerImages referencing internally managed images.
- slow
- requires docker daemon to first pull and then push
do the same in the registry in some kind of a controller
- fast
- would just create links in repositories to existing blobs
do the same lazily in the registry
- on demand
- no controller
- when handling GET request on a manifest whose namespace/name doesn't match requested

In all three cases we'd have to annotate the tag references to avoid subsequent reads of registry storage. My favorite is 3.

Update: I didn't mention the most obvious solution to this - resolve the repository in pullthroughblobstore if not found locally. We'd have to list all the ISTs, ISIs and DockerImages of the is corresponding to the repository to see what where else do we point to and redirect the request to the right blobstore. But this would be an efficiency killer.

legionus commented 8 years ago

@legionus can you please verify you can pull with older Docker version (like 1.6)? So we know it's not a regression in the registry but a way, how docker works nowadays. This used to work in past ages.

@miminar It seems to regression. Take a look at the screencast:

https://gist.github.com/legionus/47b5ff3435a5daa16bb9

miminar commented 8 years ago

@legionus Commented on the gist. It's necessary to remove all the references to the image you're about to pull from the ist.

legionus commented 8 years ago

@miminar I've tested origin (v1.1.4-12-g77757b8) with docker (9d26a07/1.6.0, e949a81/1.10.3). Both are unable to pull from istag referencing image.

0xmichalis commented 8 years ago

Talked with @legionus offline and we realized a possible solution would be that when our registry detects a "not found" response, it could query the apiserver for the missing image (oc get is), look into its tags, find the real link, and serve that. @miminar WDYT?

miminar commented 8 years ago

@kargakis yeah, that's what outlined earlier. It's super easy but not much efficient. It has to be done for every layer of the image being pulled. You can speed it up with some sort of caching though.

pweil- commented 8 years ago

@miminar @kargakis @legionus @soltysh Based on our discussions in the planning meeting we need to confirm that this is, in fact, the same thing as our image linking card. If it is then it sounded like we had consensus to lower the priority on this so it isn't a blocker

cc @smarterclayton

miminar commented 8 years ago

Verification will be hard without first fixing this. But I'll give it a shot.

pweil- commented 8 years ago

If the solution you've discussed here (looking back to the IS) is low risk and approved then I'm fine with fixing it. Just need a concrete plan to close it or lower the priority. Thanks!

smarterclayton commented 8 years ago

So prior to us adding pullthrough, this worked. If adding Pullthrough broke it, then this is a regression. I suspect it's because we don't rewrite the ref on the status tag to point to the local image stream, but need to confirm. If I can sort the registry auth issue I'll look at this next.

miminar commented 8 years ago

So prior to us adding pullthrough, this worked.

@legionus confirmed this is an issue even with origin v1.0.1 so it seems to me like a very old bug rather than a regression

smarterclayton commented 8 years ago

Well, let me say then "this is a regression from the design of the core feature of the product" and "if we had detected it at any point along the way it would have been an immediate P0" :)

On Fri, Mar 18, 2016 at 10:20 AM, Michal Minar notifications@github.com wrote:

So prior to us adding pullthrough, this worked.

@legionus https://github.com/legionus confirmed this is an issue even with origin v1.0.1 so it seems to me like a very old bug rather than a regression

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/7792#issuecomment-198380545

legionus commented 8 years ago

This patch fixes the problem. Please look at it and if everything is correct, I will make a PR.

https://github.com/legionus/origin/commit/df70700d417f1740d8d26ff28d14ad1add985cb5

soltysh commented 8 years ago

@legionus just open a PR with the fix It's easier to comment on the code. Will look into it in the afternoon. On Mar 20, 2016 1:38 AM, "Alexey Gladkov" notifications@github.com wrote:

This patch fixes the problem. Please look at it and if everything is correct, I will make a PR.

legionus@df70700 https://github.com/legionus/origin/commit/df70700d417f1740d8d26ff28d14ad1add985cb5

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/7792#issuecomment-198815882

legionus commented 8 years ago

Now I think it's the wrong solution. I'll try to find the better solution.

miminar commented 8 years ago

Here's what we're pursuing right now:

We're adding a new middleware wrapper (e.g. localMultiplexerBlobStore) that will multiplex between local repositories in order to serve requested blob living in some other repository referenced in corresponding image stream.

The resulting blob store will be a composition of:

pullghroughBlobStore{ localMultiplexerBlobStore{ quotaRestrictedBlobStore{ upstreamBlobStore } } }

It will:

implement Stat and ServerBlob methods
behave like this:
1. pass the request down to the quotaRestictedBlosStore
2. if not found, it will determine, whether the blob is linked in any other local repository referenced by currently requested IS; ~~user must be authorized to access the target repository~~ user must be authorized only to access imagestreams/layers of the IS he's pulling from, he doesn't need the permissions on the target repository
3. it will instantiate a repo object for the repository and pass the request to it

What about caching? Two approaches:

map blob digests :arrow_right: found local repository names
- subsequent requests will hit the cache
link the layers of the found image in current repository
- subsequent requests will be handled by the underlying blob store
- there's a drawback that layers will stay linked even after the istags to other local repositories are deleted; this will be partially addressed by reworked pruning (blobs will stay linked until a periodic garbage collection runs)

liggitt commented 8 years ago

We need to be super-careful we don't escalate permissions by way of cross-namespace imagestreamtag references

smarterclayton commented 8 years ago

Whoa... that's crazy complex. That is not what I had in mind.

smarterclayton commented 8 years ago

We need to be handling this in the ImageStream, not in the registry. We have to carry enough metadata around "relationship" from the import -> tag -> tag across namespaces so that the registry can do an O(1) check to lookup resources.

soltysh commented 8 years ago

To address the general idea of images we have in OpenShift today, which is: "any time you can see an image, you can pull it", we need to fix the ImageStreamTag to allow pulling any image. Currently only tags created within single ImageStream are the subject of successful pull. Whenever ImageStreamTag references other ImageStream (within single project or other one) docker pull of that image will fail. The root cause of the problem is that docker registry tries to download layers for the tagged image, which are not present, since ImageStreamTag is a weak reference. The solution is to transform our tags into strong references. This will allow to have O(1) lookup in registry @smarterclayton wants. The possible solutions are:

Update oc tag command with pulling and re-pushing the image with the new tag. Unfortunately this solution is the heaviest possible solution, has backwards compatibility issues, and is practically unusable from a regular user point of view, since the operation will be quite lengthy. Moreover, it doesn't address ImageStreamTags already present in the system.
Modify image controller in such a way that newly created tags will be picked and image data will be copied into the registry under new tag. @miminar mentioned this will require some changes in the registry code to handle the copying.

Issues/problems that needs to be addressed:

With solution no 2. what happens if original image the tag points to is modified. As long as we're pointing to ImageStreamImage we're fine, if we're pointing to ImageStreamTag then we need to update the reference as well, how? Is using scheduled import for the tag reasonable solution? What if somebody turn this off?
@liggitt mentioned also his concerns around access escalation that are introduced with tagging. Although the overall mechanics will stay as is, we can address partially additional checks, if possible.
If the original image is a pull-through (iow. referencing external registry), with secret and the pull does not happen until tagging. Even more between now and first pull (using the tag) the secret is removed the pull will fail, since the original image wasn't yet downloaded to the registry.
As part of any solution we need to address existing tags in a state that would cause breakage. An upgrade utility or have the controller be able to address these tags on the fly?

After talking to @pweil- @legionus @miminar we're leaning towards modifying controller (solution no. 2), but we all agree the amount of work needed to implement this is pretty big. Additionally the risk of introducing this feature is too high to be addressed for 1.2 thus we propose to turn this bug into trello card which will be targeted for 1.3, especially that our tests has shown this feature never worked before, so it's not a regression.

@smarterclayton @danmcp can we have agreement from your side?

smarterclayton commented 8 years ago

For an image that is tagged into openshift (via the local registry), when it is tagged into another image stream (where it points to REGISTRY/NAMESPACE/ISTAG, we should be mutating the status -> ref to point to REGISTRY/NEW_NAMESPACE/NEW_ISTAG. If we do that, then I believe this fixes the immediate, 1.2 critical feature.

soltysh commented 8 years ago

@smarterclayton that still would require us copying data in the registry and doesn't address the issue when original tag has changed.

smarterclayton commented 8 years ago

Why does it require copying data in registry? All we have to change the is the destination status tag.

miminar commented 8 years ago

@smarterclayton Modification of status doesn't address the issue. It's not the problem of serving the manifest. Docker daemon fetches the manifest and its blobs from the same repository regardless of what name is stored in the manifest itself. The problem is the lack of layer links in the repository being just tagged. Repository created by a push to a registry has layer links like this:

docker/registry/v2/repositories/miminar/
├── busybox
│   ├── _layers
│   │   └── sha256
│   │       ├── a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
│   │       │   └── link
│   │       └── eeee0535bf3cec7a24bff2c6e97481afa3d37e2cdeff277c57cb5cbdb2fa9e92
│   │           └── link

However, they aren't present in te repository created by oc tag. When there's an isimage ns/is:tag refering miminar/busybox@... and the daemon requests one of its blobs, registry's local blob store tries to read it from docker/registry/v2/ns/is/_layers/sha256/eeee0535bf3cec7a24bff2c6e97481afa3d37e2cdeff277c57cb5cbdb2fa9e92/link and obviously fails.

So either we create the missing layer links in the registry for all the isimages and istags or we switch to the referenced local repository by re-creating the blob store during each request.

smarterclayton commented 8 years ago

We shouldn't need layer links. We already know what layers we have access to.

On Mon, Mar 21, 2016 at 4:19 PM, Michal Minar notifications@github.com wrote:

@smarterclayton https://github.com/smarterclayton Modification of status doesn't address the issue. It's not the problem of serving the manifest. Docker daemon fetches the manifest and its blobs from the same repository regardless of what name is stored in the manifest itself. The problem is the lack of layer links in the repository being just tagged. Repository created by a push to a registry has layer links like this:

docker/registry/v2/repositories/miminar/ ├── busybox │ ├── _layers │ │ └── sha256 │ │ ├── a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 │ │ │ └── link │ │ └── eeee0535bf3cec7a24bff2c6e97481afa3d37e2cdeff277c57cb5cbdb2fa9e92 │ │ └── link

However, they aren't present in te repository created by oc tag. When there's an isimage ns/is:tag refering miminar/busybox@... and the daemon requests one of its blobs, registry's local blob store tries to read it from docker/registry/v2/ns/is/_layers/sha256/eeee0535bf3cec7a24bff2c6e97481afa3d37e2cdeff277c57cb5cbdb2fa9e92/link and obviously fails.

So either we create the missing layer links in the registry for all the isimages and istags or we switch to the referenced local repository by re-creating the blob store during each request.

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/7792#issuecomment-199459101

smarterclayton commented 8 years ago

What does the layer link point to in this case?

smarterclayton commented 8 years ago

I think we should be bypassing the link.

miminar commented 8 years ago

What does the layer link point to in this case?

To the registry's blob store in docker/registry/v2/blobs where all the blobs are stored like this:

v2/blobs/<algorithm>/<first-2-chars-of-hex>/<hex>/data

I think we should be bypassing the link.

We could do it for the price of carrying patches. But I'm concerned about performance. If the requested blob is local, its existence check is easy - just a Stat() on the link file. If we bypass the symlinks completely, we still need to verify that the requested blob belongs to requested repository, which implies enumeration of all images in corresponding image stream. Doing that for every single layer of an image being pulled.

smarterclayton commented 8 years ago

We already know whether the blob belongs to the repository - because it's referenced by the manifest. The manifest (OpenShift Image) is the system of authority that the blob is owned by a user. We already have to fetch the image stream tag to prove ACL.

On Tue, Mar 22, 2016 at 3:42 AM, Michal Minar notifications@github.com wrote:

What does the layer link point to in this case?

To the registry's blob store in docker/registry/v2/blobs where all the blobs are stored like this:

v2/blobs////data

I think we should be bypassing the link.

We could do it for the price of carrying patches. But I'm concerned about performance. If the requested blob is local, its existence check is easy - just a Stat() on the link file. If we bypass the symlinks completely, we still need to verify that the requested blob belongs to requested repository, which implies enumeration of all images in corresponding image stream. Doing that for every single layer of an image being pulled.

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/7792#issuecomment-199681436

miminar commented 8 years ago

True, but this way user will be able to download blobs belonging to arbitrary images he doesn't have access to - if he somehow obtains their digests. I fear of security consequences. Until now image manifest was something you could publicly share without worries that someone unathorized could actually pull it and run it. This will change with bypassing the links.

smarterclayton commented 8 years ago

That's the contract of OpenShift - we verify you have access to the manifest up front (via auth to a registry) and then we verify as you tags things around. That's how we design the system. We should never have been checking the links.

On Tue, Mar 22, 2016 at 1:03 PM, Michal Minar notifications@github.com wrote:

True, but this way user will be able to download blobs belonging to arbitrary images he doesn't have access to - if he somehow obtains their digests. I fear of security consequences. Until now image manifest was something you could publicly share without worries that someone unathorized could actually pull it and run it. This will change with bypassing the links.

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/openshift/origin/issues/7792#issuecomment-199908887

miminar commented 8 years ago

I see. That will make things much simpler (also for image pruning). The only problem will be the need to carry custom distribution patch(es).

smarterclayton commented 8 years ago

Effectively we already committed to that by storing image metadata in etcd. This is the logical conclusion.

pweil- commented 8 years ago

Summary of today's conversations:

This carries too much risk to get in to 3.2 with the proposed solution - we shouldn't block
We should still consider this a p0 item and must be in the next patch release
The solution for this should not require carrying patches - we need to propose changes upstream if they must be made

pweil- commented 8 years ago

@miminar @legionus - please make sure this gets a "known issues" mention in the release notes since we are not merging this for 3.2.

legionus commented 8 years ago

Since the #8938 is merged this issue must be solved.

smarterclayton commented 8 years ago

Can we create an extended test to verify this continues to work? Or add it to e2e?

On Jun 26, 2016, at 5:04 PM, Alexey Gladkov notifications@github.com wrote:

Since the #8938 https://github.com/openshift/origin/pull/8938 is merged this issue must be solved.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openshift/origin/issues/7792#issuecomment-228622810, or mute the thread https://github.com/notifications/unsubscribe/ABG_p_-2NU_A20acLg5qtfMQ7UET3pfCks5qPulzgaJpZM4HpVm0 .

legionus commented 8 years ago

@smarterclayton Yes. In fact @soltysh already did it. I will create PR on Monday.

legionus commented 8 years ago

https://github.com/openshift/origin/pull/8326/commits

soltysh commented 8 years ago

Yeah, just copy that one commit and open a PR and we'll be good :)

pweil- commented 8 years ago

bump, is the test for this merged? Can we close?

legionus commented 8 years ago

I think yes

pweil- commented 8 years ago

tests are submitted in #9580, the root issue has been merged. Closing.

openshift / origin

Cannot pull from istag referencing image in another repository of internal registry #7792