microsoft / containerregistry

Microsoft Artifact Registry description and related FAQ

Mutability and retention policy #141

Closed achamayou closed 10 months ago

achamayou commented 1 year ago

This is a few questions rather than an issue:

  1. Are tags immutable in MAR? If I pull the same tag multiple times, can I always expect the same image?
  2. How long are tagged images retained?
  3. How long are untagged images retained?

MichaelSimons commented 1 year ago

> Are tags immutable in MAR? If I pull the same tag multiple times, can I always expect the same image?

Tags are mutable by design. For example, the .NET images use multiple floating tags, each with different characteristics. The following tags all point to the same image today:

6.0.16-alpine3.17-amd64, 6.0-alpine3.17-amd64, 6.0-alpine-amd64, 6.0.16-alpine3.17, 6.0-alpine3.17, 6.0-alpine | [Dockerfile](https://github.com/dotnet/dotnet-docker/blob/nightly/src/runtime/6.0/alpine3.17/amd64/Dockerfile) | Alpine 3.17

They are referred to as floating because the tags are moved to newer images under different conditions. For example, any tag with a 6.0 version (no patch specified) is moved to the latest servicing version as it is released. Tags with alpine (no version) are moved to the latest version of Alpine once it is released.

Moving the tags is beneficial to users. If you always want to be on the latest .NET version, use a 6.0 tag. If you additionally want to stay on Alpine 3.17, use the 6.0-alpine3.17 tag.

Providing a wide variety of tags with differing behavior allows the user to pick the behavior that is appropriate for their scenario.
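Because a floating tag can move, it can help to record which digest a tag points to at a given moment. The following is a minimal sketch, not an official tool: it assumes the registry exposes the standard registry HTTP API at `/v2/<repository>/manifests/<tag>`, allows anonymous reads, and returns the digest in a `Docker-Content-Digest` response header. The helper names `manifest_url` and `resolve_digest` are made up for illustration.

```python
import urllib.request

# Manifest media types to accept. Index / manifest-list types come first so
# that a multi-arch tag resolves to the digest of the index itself.
ACCEPT = ", ".join([
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
])


def manifest_url(registry, repository, reference):
    """Build the manifest endpoint URL for a tag or digest (hypothetical helper)."""
    return f"https://{registry}/v2/{repository}/manifests/{reference}"


def resolve_digest(registry, repository, tag, timeout=30):
    """Return the digest a tag currently points to, or None if not reported.

    Assumes anonymous access and a Docker-Content-Digest response header;
    both are assumptions about the registry, not guarantees.
    """
    request = urllib.request.Request(
        manifest_url(registry, repository, tag),
        method="HEAD",
        headers={"Accept": ACCEPT},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Docker-Content-Digest")
```

Calling `resolve_digest("mcr.microsoft.com", "dotnet/runtime", "6.0-alpine")` on different days and comparing the results would show when that tag has been moved, provided the tag still exists.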

achamayou commented 1 year ago

@MichaelSimons thank you for the clarification. In the example you describe, some tags (such as 6.0) are clearly mutated frequently to point to the latest release. Is this also true of the most detailed tag (6.0.16-alpine3.17-amd64 here)? Is it effectively guaranteed to always resolve to the same image, or could any tag be subject to change at any time?

If so, then it would follow that there may eventually be images that are no longer pointed to by any tag (but are still accessible by digest). What happens to such images? Are they forever addressable by digest? Do they get deleted?

> Moving the tags is beneficial to users.

Absolutely!

However, for the specific purpose of reproducing a build, it is useful to be able to fetch exactly the base image that was used. If all tags are mutable, my question reduces to "are images ever deleted? if so, under what conditions?".
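For the reproducibility case, a reference of the form `name:tag@sha256:...` can be reduced to its digest-pinned form. The sketch below is illustrative only: `parse_reference` and `pin` are hypothetical helpers, the parsing is deliberately simplified, and it only accepts sha256 digests. As far as I understand, when both a tag and a digest are present, the digest determines what is pulled.

```python
import re


def parse_reference(ref):
    """Split 'name[:tag][@digest]' into (name, tag, digest).

    Simplified parsing for illustration; not a full reference grammar.
    """
    name, _, digest = ref.partition("@")
    tag = None
    # A colon after the last '/' separates the tag; an earlier colon would
    # belong to a registry port such as localhost:5000.
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > last_slash:
        name, tag = name[:colon], name[colon + 1:]
    return name, tag, digest or None


def pin(ref):
    """Return the digest-only form 'name@sha256:...' of a reference."""
    name, _tag, digest = parse_reference(ref)
    if digest is None:
        raise ValueError("reference has no digest to pin to")
    if not re.fullmatch(r"sha256:[0-9a-f]{64}", digest):
        raise ValueError(f"unexpected digest format: {digest}")
    return f"{name}@{digest}"
```

For example, `pin("mcr.microsoft.com/dotnet/sdk:6.0.407@sha256:8189c53e1f0323b8beb5fa174853f413364a203f32b0b7c2488b534a3c1e76c6")` drops the tag and keeps the digest. Pinning this way only helps for as long as the registry keeps serving that digest.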

mus65 commented 1 year ago

@MichaelSimons I would also like to know what the official policy is regarding untagged images. We are referencing sha256 for build reproducibility and today some of our pipelines randomly started failing with the following error:

ERROR: Job failed: failed to pull image "mcr.microsoft.com/dotnet/sdk:6.0.407@sha256:8189c53e1f0323b8beb5fa174853f413364a203f32b0b7c2488b534a3c1e76c6" with specified policies [if-not-present]:
 error parsing HTTP 400 response body: invalid character 'ï' looking for beginning of value: 
"\ufeff<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Error><Code>InvalidUri</Code><Message>The request URI is invalid.\nRequestId:8171d089-001e-0017-02c5-00b808000000\nTime:2023-10-17T06:43:38.5579534Z</Message></Error>" (manager.go:237:0s)

I could also (sometimes) reproduce this with docker pull mcr.microsoft.com/dotnet/sdk:6.0.407@sha256:8189c53e1f0323b8beb5fa174853f413364a203f32b0b7c2488b534a3c1e76c6 . Using the latest mcr.microsoft.com/dotnet/sdk:6.0.407 (which resolves to sha256:ef18259e9c0570d28b874b6680b13d83cf0c6c2e7e2a2509a339aab5e88c16b4) always works, so it looks to me like the old image was deleted in the registry.

edit: it seems the image itself is not deleted; rather, some layers fail to download:

$ docker pull mcr.microsoft.com/dotnet/sdk:6.0.407@sha256:8189c53e1f0323b8beb5fa174853f413364a203f32b0b7c2488b534a3c1e76c6
mcr.microsoft.com/dotnet/sdk@sha256:8189c53e1f0323b8beb5fa174853f413364a203f32b0b7c2488b534a3c1e76c6: Pulling from dotnet/sdk
3f9582a2cbe7: Already exists
d866aec6058e: Already exists
11332129480d: Downloading
9f9b514859b0: Download complete
b709e83c5e9e: Download complete
95bf3c712b69: Download complete
bfbd60288f07: Download complete
67b8902cebc8: Download complete
error parsing HTTP 400 response body: invalid character 'ï' looking for beginning of value: "\ufeff<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Error><Code>InvalidUri</Code><Message>The request URI is invalid.\nRequestId:69fede8a-101e-009a-3ed8-003046000000\nTime:2023-10-17T09:01:46.2345213Z</Message></Error>"
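To narrow down which layer fails, one option is to request each layer blob individually and note the HTTP status. This is a rough sketch under several assumptions: the registry serves blobs at the standard `/v2/<repository>/blobs/<digest>` path, anonymous reads are allowed, and the image manifest JSON has already been fetched into a dict. The helper names `layer_blob_urls` and `probe` are hypothetical.

```python
import urllib.error
import urllib.request


def layer_blob_urls(registry, repository, manifest):
    """List the blob URL of every layer named in an image manifest dict."""
    return [
        f"https://{registry}/v2/{repository}/blobs/{layer['digest']}"
        for layer in manifest.get("layers", [])
    ]


def probe(url, timeout=30):
    """Return the HTTP status for a blob URL without reading the body.

    Uses GET and closes the response immediately; redirects are followed
    by urllib, so the status is that of the final location.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Running `probe` over the URLs from `layer_blob_urls("mcr.microsoft.com", "dotnet/sdk", manifest)` should show whether a single blob consistently returns 400 or whether the failure is intermittent, as the "sometimes" above suggests.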
MichaelSimons commented 1 year ago

> @MichaelSimons I would also like to know what the official policy is regarding untagged images. We are referencing sha256 for build reproducibility and today some of our pipelines randomly started failing with the following error:

Referencing images via sha is a supported scenario, as it is often required for isolation purposes.

achamayou commented 1 year ago

@MichaelSimons I think the question is: how long are images that are untagged kept? In other words, if you access something by sha, can you expect it to always be there, or will it get cleaned up after a period of time? If so, how long?

MichaelSimons commented 1 year ago

To date, we do not clean up any of the published .NET images. I don't think this will be the case forever. If there is a change to this policy, we will communicate it as an announcement in the dotnet-docker repo.

achamayou commented 1 year ago

@MichaelSimons thank you!