opencontainers / distribution-spec

OCI Distribution Specification
https://opencontainers.org
Apache License 2.0

How are registries expected to behave when a subject is deleted? #378

Open jonjohnsonjr opened 1 year ago

jonjohnsonjr commented 1 year ago

I see:

When deleting a manifest that has an associated referrers tag schema, clients MAY also delete the referrers tag when it returns a valid image index.

...but I don't see any guidance for registries that support the referrers API.

Should registries garbage collect all manifests that point to the deleted subject or leave them?

Are we leaving this up to registry providers' discretion?

I believe the justification for not allowing an index to have a subject was due to garbage collection implementations, so it feels incomplete not to have a recommendation for registries. Given that this is a new kind of relationship between artifacts, it would be great to standardize on the behavior rather than let it diverge again.

tianon commented 1 year ago

I'm not a maintainer, but as an interested party, my understanding is that the link between a manifest and the blobs it contains is a "hard" link and the link from things that refer to a manifest is "soft" -- the former is something I as the manifest author control, the latter might have been created by another party who will be surprised if it goes away (even if the thing it points to does).
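A minimal sketch of the hard/soft distinction described above, assuming a toy in-memory store. The `Manifest` class, the helper names, and the short digests are all hypothetical, not taken from any registry implementation. Blobs are pinned by the manifests that list them, while `subject` is never followed, so a referrer survives the deletion of the thing it points to:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Manifest:
    digest: str
    blobs: list[str] = field(default_factory=list)  # config and layer digests
    subject: Optional[str] = None  # digest of the manifest this one refers to


def live_blobs(manifests: dict[str, Manifest]) -> set[str]:
    """Blobs kept alive by the manifests still present in the store."""
    live: set[str] = set()
    for m in manifests.values():
        live.update(m.blobs)  # hard link: a manifest pins its own blobs
        # m.subject is deliberately ignored here: it is a soft link
    return live


def dangling_referrers(manifests: dict[str, Manifest]) -> list[str]:
    """Manifests whose subject is no longer in the store."""
    return [
        m.digest
        for m in manifests.values()
        if m.subject is not None and m.subject not in manifests
    ]


# Placeholder digests, shortened for readability.
store = {
    "sha256:img": Manifest("sha256:img", blobs=["sha256:layer"]),
    "sha256:sbom": Manifest(
        "sha256:sbom", blobs=["sha256:sbom-blob"], subject="sha256:img"
    ),
}
del store["sha256:img"]  # the image author deletes the image
```

After the deletion, the SBOM manifest and its blob are still live, and the manifest is reported as dangling instead of being collected.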

As a concrete user example, there are some regulated industries I've worked with where the regulators themselves would be very interested in having the ability to keep a copy of all the SBOMs of the images, but don't want to store the images themselves for various reasons (storage, IP, etc).

vbatts commented 1 year ago

Like what Tianon said. My feeling is that this is a policy for the registry: allow the user to configure it. I can imagine some regulation requiring that signatures/attestations on an object stick around for audit's sake, even if the object is now on a denylist or was deleted for security or other reasons. If users have to pay for the storage, then be up front about it. They may prefer the option to delete-when-dangling. The two are decoupled.

brackendawson commented 1 year ago

How would the advice look if you wanted to suggest to a registry developer that they MAY delete untagged images but SHOULD keep untagged artifacts, when that artifact is an Image Index?

Image Manifests not referred to by a tag or an Image Index MAY be deleted by the administrator of a registry if desired, unless that manifest has a subject field, in which case the manifest SHOULD be retained, even if the subject referred to does not exist in the registry.
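Read as a retention predicate, the wording above might look roughly like this. It is only a sketch: `may_delete` and its parameters are hypothetical names, and the MAY/SHOULD levels are flattened into a boolean:

```python
from typing import Optional


def may_delete(tagged: bool, in_index: bool, subject: Optional[str]) -> bool:
    """Return True if an untagged-manifest sweep is allowed to remove the manifest."""
    if tagged or in_index:
        return False  # still referred to by a tag or an Image Index
    if subject is not None:
        # SHOULD be retained, even if the subject no longer exists in the registry
        return False
    return True  # plain untagged manifest: MAY be deleted
```

Note that the predicate never checks whether the subject still exists, which is the point of the proposed text.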

It's not pretty. I can implement it. ~But it's not backwards compatible with OCI 1.0 registries, they will just ignore the subject field and delete the manifest anyway. This difference in the way otherwise portable manifests get treated could be surprising for users of our software.~ Or would the referrers tag schema save them?

~I think that if a registry administrator chooses to run or enable a process to delete untagged manifests then that should be exactly what they get, and they need to make it clear to users of their registry that that is what will happen. My registry does not do this.~

Edit: Actually the referrers tag schema would totally cause dangling artifacts in OCI 1.0 registries to be retained forever. For the best backwards compatibility we really should recommend OCI 1.1 registries not delete manifests with subjects when deleting untagged manifests.
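To illustrate why the fallback keeps artifacts alive: under the referrers tag schema the client maintains an image index under a tag derived from the subject digest, so that index is tagged and the artifacts it lists are referenced by an index. A rough sketch of the tag derivation follows. `referrers_tag` is a hypothetical helper, and it omits the length limits the spec places on the tag:

```python
def referrers_tag(subject_digest: str) -> str:
    """Map a subject digest such as 'sha256:<hex>' to the fallback tag 'sha256-<hex>'."""
    alg, sep, encoded = subject_digest.partition(":")
    if not sep or not alg or not encoded:
        raise ValueError(f"not a digest: {subject_digest!r}")
    return f"{alg}-{encoded}"


# Placeholder digest, not a real image.
subject = "sha256:" + "a" * 64
tag = referrers_tag(subject)
```

Because the tag depends only on the subject's digest and not on the subject being present, a "delete untagged" sweep on an OCI 1.0 registry has no reason to touch the index or the artifacts it lists unless the client also deletes that tag.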

mikebrow commented 1 year ago

put another way..

Should client users push artifacts before or after pushing the container images said artifacts refer to?

If we support both, there are timing windows associated with the OP's garbage collection question, and again that would be a registry decision point. I can see pros/cons either way (strict/loose or strict/loose with a time window), and I can see registries choosing one or the other by customer type and desired use cases. Maybe someone wants to keep an artifact around for a currently deleted image, where the artifact certifies a prior version of the image still in use on a cloud account in some remote location :-)
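The strict/loose/time-window options could be sketched as a registry-side policy switch. Everything here is an assumption for illustration: the policy names, the `collectable` function, and the 24-hour default are not taken from the spec or from any registry:

```python
from datetime import datetime, timedelta, timezone


def collectable(
    policy: str,
    subject_present: bool,
    pushed_at: datetime,
    now: datetime,
    grace: timedelta = timedelta(hours=24),  # hypothetical default
) -> bool:
    """Decide whether GC may remove a manifest that has a subject."""
    if subject_present:
        return False  # not dangling, nothing to decide
    if policy == "loose":
        return False  # keep dangling referrers indefinitely
    if policy == "strict":
        return True  # collect as soon as the subject is missing
    if policy == "window":
        # tolerate artifact-before-image pushes for a limited time
        return now - pushed_at >= grace
    raise ValueError(f"unknown policy: {policy!r}")


pushed = datetime(2024, 1, 1, tzinfo=timezone.utc)
one_hour_later = pushed + timedelta(hours=1)
two_days_later = pushed + timedelta(days=2)
```

With the `window` policy, an artifact pushed an hour before its image is safe from GC, while one whose subject has been missing for two days is not.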

mikebrow commented 1 year ago

At some point in time.. we are going to need to cover gc policies, IMO.