Repository name limited to 255 characters

ahachete commented 1 month ago

Description

When pulling an image with a long name, I get the following error:

repository name must not be more than 255 characters

To my knowledge, this length limitation is nowhere present in the OCI Distribution Spec, where the regex for the name is defined as [a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*(\/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*, which doesn't impose any limit.
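As a quick illustration of that point, here is a minimal sketch that compiles the regex quoted above and checks it against a very long name. The helper `is_valid_name` and the sample name are hypothetical, chosen only for this example; the pattern itself is copied from the quote.

```python
import re

# Name grammar as quoted above from the OCI Distribution Spec.
NAME_PATTERN = re.compile(
    r"[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*"
    r"(\/[a-z0-9]+((\.|_|__|-+)[a-z0-9]+)*)*"
)


def is_valid_name(name: str) -> bool:
    """Return True if the whole string matches the name grammar (hypothetical helper)."""
    return NAME_PATTERN.fullmatch(name) is not None


# Hypothetical long name: 100 path components of 9 characters joined by 99 slashes.
long_name = "/".join(["component"] * 100)

print(len(long_name))            # far beyond 255 characters
print(is_valid_name(long_name))  # the grammar alone does not reject it
```

The pattern only constrains which characters and separators may appear; any length cap has to come from somewhere other than the regex.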

Certainly, the spec also specifies that:

Implementers note: Many clients impose a limit of 255 characters on the length of the concatenation of the registry hostname (and optional port), /, and <name> value. If the registry name is registry.example.org:5000, those clients would be limited to a <name> of 229 characters (255 minus 25 for the registry hostname and port and minus 1 for a / separator). For compatibility with those clients, registries should avoid values of <name> that would cause this limit to be exceeded.
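The arithmetic in that note can be reproduced with a short sketch. The function name `max_name_length` is an assumption made for this example and does not come from the spec or from any client; it only restates "255 minus the registry host and port, minus one separator".

```python
CLIENT_LIMIT = 255  # limit quoted in the implementers note


def max_name_length(registry: str, limit: int = CLIENT_LIMIT) -> int:
    """Characters left for the name after the registry host[:port] and one '/'."""
    return limit - len(registry) - 1


# The example from the note: a 25-character host and port leaves 229 characters.
print(max_name_length("registry.example.org:5000"))  # 229
```

A shorter registry host leaves a slightly larger budget, so under this accounting the effective cap on the name varies from registry to registry.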

Which seems to be Docker's case.

But what's the rationale for this limitation? I'm hitting this limitation myself, and I'm sure I'm not alone (e.g. anyone using Nixery or similar dynamic image generation tools with long "paths" may be in the same position).

Would there be any reasons not to raise this limit as of today?

Reproduce

docker pull example.org/a/very/long/name/totaling/more/than/255/chars
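The reference above is a placeholder rather than a literal over-limit name. One way to produce a real one is sketched below, assuming any path made of valid components will do; `build_long_reference` and the `segment` component are hypothetical and only serve to generate the command.

```python
def build_long_reference(registry: str, component: str, min_total: int) -> str:
    """Append path components until the full reference exceeds min_total characters."""
    reference = f"{registry}/{component}"
    while len(reference) <= min_total:
        reference += f"/{component}"
    return reference


# Hypothetical inputs: any registry host and any valid lowercase component work.
ref = build_long_reference("example.org", "segment", 255)

print(len(ref))
print(f"docker pull {ref}")
```

The printed command can be pasted into a shell; the pull is expected to fail on the client-side length check before the image is fetched.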

Expected behavior

Image should be pulled without error.

docker version

$ docker version
Client: Docker Engine - Community
 Version:           27.1.1
 API version:       1.46
 Go version:        go1.21.12
 Git commit:        6312585
 Built:             Tue Jul 23 19:56:56 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.1.1
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.12
  Git commit:       cc13f95
  Built:            Tue Jul 23 19:56:56 2024
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    27.1.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.16.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Additional Info

No response

thaJeztah commented 1 month ago

this length limitation is nowhere present in the OCI Distribution Spec

The length is defined in the distribution/reference module, which defines the standard as used by docker, and most other tools https://github.com/distribution/reference/blob/8c942b0459dfdcc5b6685581dd0a5a470f615bff/reference.go#L61-L62

But what's the rationale for this limitation?

Interoperability; these formats have been used since the inception of image registries, and have been adopted by the container ecosystem. While changing these is possible, it means that such references may not work with most existing tools and implementations; registries may have restrictions in place, but also existing client implementations (docker, kubernetes, containerd, and most (if not all) other tools).

ahachete commented 1 month ago

Interoperability; these formats have been used since the inception of image registries, and have been adopted by the container ecosystem. While changing these is possible, it means that such references may not work with most existing tools and implementations; registries may have restrictions in place, but also existing client implementations (docker, kubernetes, containerd, and most (if not all) other tools).

In this case, may I say "backwards compatibility" instead of "interoperability"? Because on the other side of the coin are usages that clearly need this limit to be bumped, where limiting it actually harms interoperability.

At some point, new use cases and advancements require decisions that allow precisely for interoperability with those new use cases. I believe this is such a case, as a balance needs to be struck between possibly breaking old cases and breaking new ones.

I believe that allowing the Docker client to use longer repository names would break little to no "backwards compatibility" use cases, since all existing references already stay within the limit today. Only new use cases would generate such repository names, and those would only work with certain combinations of clients and registries, so users would be clearly aware of them.

For the record, ctr does not impose this limitation. I'm happy to test other implementations if that would give additional valuable insight.

jjmaestro commented 1 month ago

Just to add my two cents here, as far as I understand, the reference @thaJeztah links is the spec for the registry, correct?

If so, what's the problem with lifting the hard limit in the client? The limitations are imposed by what the registries store, so the docker client should be able to support registries that don't have such limitations, right?

Just to clarify, since the 255 limit is a lower boundary, all current existing registries would work with a docker CLI tool that doesn't impose such a limit and lets users request e.g. images of "unlimited length" or "absurd lengths".

If the registry that they are querying supports such lengths (e.g. if they are not using the distribution/reference Go library and/or they are not following that spec) it will work. It won't break the current state of the world, it will only work with e.g. "newer registries" that allow other lengths.

thaJeztah commented 1 month ago

In this case, may I say "backwards compatibility" instead of "interoperability"?

Yes, correct. Although "backward compatibility" in this context may mean; any existing use of docker and other tools following these formats, i.e., docker push using changed semantics meaning; cannot be pulled / used on anything other than the latest version, hence my choice of "interoperability".

For the record, ctr does not impose this limitation. I'm happy to test other implementations if that would give additional valuable insight.

ctr is a debugging tool, and has very few restrictions for that matter (e.g., see https://github.com/containerd/containerd/issues/7986)

jjmaestro commented 1 month ago

@thaJeztah but how is changing the docker CLI to allow longer names breaking backward compatibility? It will work with registries (old or not) and "old" images that conform to the 255 standard. That is, it will be backwards compatible, right?

It would not be if the change would require restricting the names to shorter names. Then sure, it would break backwards compatibility.

ahachete commented 1 month ago

Yes, correct. Although "backward compatibility" in this context may mean; any existing use of docker and other tools following these formats, i.e., docker push using changed semantics meaning; cannot be pulled / used on anything other than the latest version, hence my choice of "interoperability".

Fair enough.

But then back to my OP: how do you see supporting use cases where people need to use longer repository names? Nixery is the best example I can think of now, and I'm sure others exist and will follow. This is only a matter of time.

Without changing anything, this hard prevents such use cases, which IMO seems more harmful than the backwards compatibility / interoperability that you rightfully mention, isn't it? How can these new use cases be supported by Docker?

thaJeztah commented 1 month ago

as far as I understand, the reference @thaJeztah links is the spec for the registry, correct?

It's in the distribution org, as it was part of the distribution "client" code, which was the canonical implementation of a client for registries. There are now different implementations for that (client) part; the definition of the format lives in that repository and is either used directly in other code-bases (e.g. kubernetes), or implemented based on the definitions in that repository (e.g. jkube has an implementation in Java).

The CLI repository (where this ticket is opened) is only the tip of the iceberg there; some parsing may happen in the CLI, but most will happen on the daemon side (or further down, such as in code consumed from containerd)
