opencontainers / distribution-spec

OCI Distribution Specification
https://opencontainers.org
Apache License 2.0

Clarification in spec needed: Deletion by tag vs. deleting tags. #551

NiklasBeierl closed 4 days ago

NiklasBeierl commented 2 weeks ago

Hey folks,

I am really sorry to poke at that sore spot again, but I believe the spec text could use a clarification about the semantics of deleting tags. As I see from several other issues (#102, #378, #114), the ins and outs of this are heavily contested in people's heads and in real-world implementations. I have been trying to piece together the current state of affairs from all these discussions for about an hour now: my head hurts and I am still not 100% sure what's going on.

I understand the spec maintainers' desire to avoid overreaching, and I see several good arguments for different semantics. I am not asking to change the current semantics of the spec; I am only asking for a clarification to be added, because I think the current wording does not fully address the following question:

Does deleting by tag delete the referenced manifest, or only the tag (reference) itself, or is that an implementation detail?

Let us picture a manifest with digest 123 and tags A and B.

As far as I understand, there are / were two camps:

1) Deleting tag A deletes the manifest itself. So none of 123, A, or B is available after deletion.

2) Deleting tag A deletes the tag, but not the referenced manifest. So A is gone, but 123 and B are still there.
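
To make the two camps concrete, here is a minimal Python sketch of the example above. The registry is modelled as a set of manifest digests plus a dict from tag to digest. Both function names are made up for illustration and do not come from the spec or from any registry implementation.

```python
def delete_by_tag_camp_1(manifests, tags, tag):
    """Camp 1: the manifest goes away, and with it every tag pointing at it."""
    digest = tags[tag]
    manifests.discard(digest)
    for other in [t for t, d in tags.items() if d == digest]:
        del tags[other]


def delete_by_tag_camp_2(manifests, tags, tag):
    """Camp 2: only the tag (the reference) is removed."""
    del tags[tag]


# Manifest 123 with tags A and B, as in the example above.
manifests, tags = {"123"}, {"A": "123", "B": "123"}
delete_by_tag_camp_1(manifests, tags, "A")
print(manifests, tags)  # set() {}

manifests, tags = {"123"}, {"A": "123", "B": "123"}
delete_by_tag_camp_2(manifests, tags, "A")
print(manifests, tags)  # {'123'} {'B': '123'}
```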

As of today, 2024-08-29, the spec states:

#### Content Management

Content management refers to the deletion of blobs, tags, and manifests.
Registries MAY implement deletion or they MAY disable it.
Similarly, a registry MAY implement tag deletion, while others MAY allow deletion only by manifest.

##### Deleting tags

`<name>` is the namespace of the repository, and `<tag>` is the name of the tag to be deleted.
Upon success, the registry MUST respond with a `202 Accepted` code.
If tag deletion is disabled, the registry MUST respond with either a `400 Bad Request` or a `405 Method Not Allowed`.

To delete a tag, perform a `DELETE` request to a path in the following format: `/v2/<name>/manifests/<tag>` <sup>[end-9](#endpoints)</sup>

##### Deleting Manifests
To delete a manifest, perform a `DELETE` request to a path in the following format: `/v2/<name>/manifests/<digest>` [end-9](https://github.com/opencontainers/distribution-spec/blob/11b8e3fba7d2d7329513d0cff53058243c334858/spec.md#endpoints)

`<name>` is the namespace of the repository, and `<digest>` is the digest of the manifest to be deleted. Upon success, the registry MUST respond with a `202 Accepted` code. If the repository does not exist, the response MUST return `404 Not Found`.
...

Given that Deleting Manifests specifically uses a <digest>, I am inclined to believe that option 2) from the above is true. But I feel like someone reading the spec shouldn't have to guess.
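
For readers who want to see the two requests side by side, the following Python sketch builds them with the standard library without sending anything. The host `registry.example`, the repository `library/app`, and the helper name `build_delete_request` are placeholders chosen for illustration; authentication and error handling are left out.

```python
from urllib.request import Request


def build_delete_request(registry, name, reference):
    """Build (but do not send) DELETE /v2/<name>/manifests/<reference>."""
    url = f"{registry}/v2/{name}/manifests/{reference}"
    return Request(url, method="DELETE")


# Hypothetical registry and repository; the digest is a dummy value.
by_tag = build_delete_request("https://registry.example", "library/app", "A")
by_digest = build_delete_request(
    "https://registry.example", "library/app", "sha256:" + "0" * 64
)

print(by_tag.get_method(), by_tag.full_url)
print(by_digest.get_method(), by_digest.full_url)
```

Both requests hit the same `manifests` path; only the last path segment tells the registry whether a tag or a digest is meant.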

I suggest adding exactly one of these to the Deleting tags section:

a) Deleting by tag deletes the underlying manifest and might thus also delete other tags referencing the same manifest.

b) Deleting a tag deletes the tag itself, not the underlying manifest. Note that implementations might garbage-collect manifests without any tags.

c) Whether deleting by tag also deletes the underlying manifest is up to the implementation.

tianon commented 2 weeks ago

I'm not a distribution spec maintainer, but for what it's worth, "b" is the behavior of the original/reference implementation (which IMO is the behavior most supported by the existing spec wording as well).

sudo-bmitch commented 2 weeks ago

> a) Deleting by tag deletes the underlying manifest and might thus also delete other tags referencing the same manifest.

I hope there aren't any registries out there that delete other tags.

> b) Deleting a tag deletes the tag itself, not the underlying manifest. Note that implementations might garbage-collect manifests without any tags.

This is close to my understanding of most registries. But saying "not the underlying manifest" implies that all registries would preserve the manifest, and I would not be surprised if some exist that immediately delete manifests as soon as the last tag is removed. That wouldn't be a full GC (you still need grace time to push a multi-platform collection of manifests), but a simple ref-count that destroys content as soon as the count goes from 1 to 0.
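
The simple ref-count behaviour described here could look roughly like the Python sketch below. It is an illustration only: the class and method names are invented, and no real registry is claimed to work this way. The count is derived from the tag table, so a manifest is dropped at the moment its last tag disappears.

```python
class RefCountingStore:
    """Hypothetical store that drops a manifest when its last tag is removed."""

    def __init__(self):
        self.manifests = {}  # digest -> manifest content
        self.tags = {}       # tag -> digest

    def push(self, tag, digest, content):
        previous = self.tags.get(tag)
        self.manifests[digest] = content
        self.tags[tag] = digest
        # Re-pointing a tag may leave the old manifest unreferenced.
        if previous is not None and previous != digest:
            self._drop_if_unreferenced(previous)

    def delete_tag(self, tag):
        digest = self.tags.pop(tag)
        self._drop_if_unreferenced(digest)

    def _drop_if_unreferenced(self, digest):
        count = sum(1 for d in self.tags.values() if d == digest)
        if count == 0:
            self.manifests.pop(digest, None)


store = RefCountingStore()
store.push("A", "123", b"manifest")
store.push("B", "123", b"manifest")
store.delete_tag("A")
print("123" in store.manifests)  # True: B still references it
store.delete_tag("B")
print("123" in store.manifests)  # False: count went from 1 to 0
```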

> c) Whether deleting by tag also deletes the underlying manifest is up to the implementation.

It would be unexpected to me if an implementation destroyed the underlying manifest while other tags still reference it.

GC and content retention policies continue to be an implementation detail. Some registries retain content indefinitely, others are designed to be ephemeral, destroying even tagged content after a short time, and organizations impose legal requirements on this to either preserve content or ensure it is destroyed according to their company policies.

Looking over the spec, I think the clarification should be "what is a tag". We currently have the definition:

> Tag: a custom, human-readable manifest identifier

I think it would be helpful to clarify that a tag is a pointer to a manifest and a manifest is stored by digest. Then the tag delete API would indicate that it is explicitly deleting the tag, while the manifest delete API deletes both the manifest and any tags pointing to that manifest.
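
As a rough illustration of that proposed wording (not of any existing implementation), the Python sketch below treats tags as pointers into a digest-keyed manifest store. `Repository`, `delete_tag`, and `delete_manifest` are hypothetical names; whether an untagged manifest is later collected is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    manifests: dict = field(default_factory=dict)  # digest -> manifest content
    tags: dict = field(default_factory=dict)       # tag -> digest (the pointer)

    def delete_tag(self, tag):
        """Remove the pointer only; the manifest's fate is left open."""
        del self.tags[tag]

    def delete_manifest(self, digest):
        """Remove the manifest and every tag that points at it."""
        del self.manifests[digest]
        self.tags = {t: d for t, d in self.tags.items() if d != digest}


repo = Repository({"123": b"manifest"}, {"A": "123", "B": "123"})
repo.delete_tag("A")
print(repo.tags)       # {'B': '123'}
repo.delete_manifest("123")
print(repo.tags)       # {}
```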

This intentionally avoids the question of whether implementations may or should delete the underlying manifest because I think it depends on both the scenario (particularly whether there are other tags pointing to the manifest) and the implementation.

NiklasBeierl commented 2 weeks ago

@sudo-bmitch

> Looking over the spec, I think the clarification should be "what is a tag". We currently have the definition:
>
> > Tag: a custom, human-readable manifest identifier

Hmm, I think that this is good enough for a definition. The clarification is really needed in the Deleting tags section. What contributes to the ambiguity is that the same endpoint is used for deleting manifests and tags:

DELETE /v2/<name>/manifests/<tag>      # The action in question
DELETE /v2/<name>/manifests/<digest>  # Deletes an actual manifest
                      ⬆️
             Both say "manifests"
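
Since both requests share one path, a server has to tell the two apart from the last path segment alone. The Python sketch below shows one way to do that. It assumes the tag grammar `[a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}` from the distribution spec and a simplified `algorithm:encoded` digest pattern; the digest regex and the function name `classify_reference` are assumptions made for this example.

```python
import re

# Tag grammar as given in the distribution spec (no colon allowed).
TAG_RE = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}")
# Simplified digest shape "algorithm:encoded"; an assumption for this sketch.
DIGEST_RE = re.compile(r"[a-z0-9]+(?:[+._-][a-z0-9]+)*:[a-zA-Z0-9=_-]+")


def classify_reference(reference):
    """Return 'digest', 'tag', or 'invalid' for the last path segment."""
    if DIGEST_RE.fullmatch(reference):
        return "digest"
    if TAG_RE.fullmatch(reference):
        return "tag"
    return "invalid"


print(classify_reference("v1.0"))                # tag
print(classify_reference("sha256:" + "a" * 64))  # digest
print(classify_reference("not a tag!"))          # invalid
```

Note that the shorthand digest `123` used earlier in this thread would be classified as a tag here, because a real digest always carries an algorithm prefix and a colon.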

sudo-bmitch commented 2 weeks ago

> The clarification is really needed in the Deleting tags section.

My suggestion included a recommended change to that section.

NiklasBeierl commented 1 week ago

> > The clarification is really needed in the Deleting tags section.
>
> My suggestion included a recommended change to that section.

Apologies.

I guess you are suggesting the change with this sentence:

> [...] Then the tag delete API would indicate that it is explicitly deleting the tag. [...]

I originally read this as: "The tag delete API would indicate, by virtue of being a tag delete API, that it is explicitly deleting the tag."

The irony, getting caught up in subtleties as we are discussing subtleties. :grimacing:

rchincha commented 1 week ago

My expectation is simply visibility.

DELETE /v2/<>/manifests/tag

GET /v2/<>/tags/list <- tag is gone

Everything else is implementation dependent.
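
That visibility-only expectation can be written down as a small check against an in-memory stand-in. The Python sketch below is hypothetical: `FakeRegistry` is not a real client or server, the `405` branch follows the spec text quoted earlier in this thread, and the `404` for an unknown tag is an assumption of this example.

```python
class FakeRegistry:
    """Hypothetical in-memory stand-in for a single repository."""

    def __init__(self, tags, tag_deletion_enabled=True):
        self.tags = dict(tags)  # tag -> digest
        self.tag_deletion_enabled = tag_deletion_enabled

    def delete_tag(self, tag):
        """Mimics DELETE /v2/<name>/manifests/<tag>; returns a status code."""
        if not self.tag_deletion_enabled:
            return 405  # the spec allows 400 or 405 when deletion is disabled
        if tag not in self.tags:
            return 404  # assumption: unknown tags are reported as not found
        del self.tags[tag]
        return 202

    def list_tags(self):
        """Mimics the tag list returned by GET /v2/<name>/tags/list."""
        return sorted(self.tags)


registry = FakeRegistry({"A": "123", "B": "123"})
status = registry.delete_tag("A")
print(status, registry.list_tags())  # 202 ['B']
```

The check says nothing about whether manifest 123 still exists afterwards, which matches the point that everything beyond tag visibility is implementation dependent.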