Open MayCXC opened 1 month ago
For programs that support socket activation, this feature would also provide the same benefits as https://github.com/moby/moby/issues/7536. Implementing it would also require/satisfy https://github.com/moby/moby/issues/43935.
The runc `--preserve-fds` option is also relevant to this: https://github.com/opencontainers/runc/blob/main/docs/terminals.md#other-file-descriptors
It would also be useful if compose could specify a network other than the host in which to create a listener:

```yaml
networks:
  wwwnet:

services:
  www:
    networks:
      wwwnet:
    sockets:
      - 0.0.0.0:80:4/tcp
```
Compose and the CLI could also use a syntax like `0.0.0.0:80:SOCKET_FD/tcp` to indicate that the listener can be provided on any available fd, with the environment variable `SOCKET_FD` set to its number.
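As a sketch of the service side of that contract, a program could adopt the inherited listener by reading the hypothetical `SOCKET_FD` variable proposed above (this is not an existing Docker API; the host side is simulated in-process so the example runs end to end):

```python
import os
import socket
import threading

def serve_one(greeting: bytes) -> None:
    # SOCKET_FD is the hypothetical variable proposed above; the host would
    # set it to the number of an already-bound, already-listening fd.
    fd = int(os.environ["SOCKET_FD"])
    srv = socket.socket(fileno=fd)   # adopt the inherited listener
    conn, _ = srv.accept()           # no bind()/listen() needed in the service
    conn.sendall(greeting)
    conn.close()
    srv.close()

# Stand-in for the host side: bind and listen first, then publish the
# fd number through SOCKET_FD before the "service" runs.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]
os.environ["SOCKET_FD"] = str(listener.detach())

received = []
def client() -> None:
    conn = socket.create_connection(("127.0.0.1", port))
    received.append(conn.recv(1024))
    conn.close()

t = threading.Thread(target=client)
t.start()
serve_one(b"hello from an activated service")
t.join()
```

Note that the service never calls `bind()` or `listen()` itself, which is what lets the host keep the socket alive across restarts.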
Description
An old-and-now-new-again technique for scaling and updating daemons that listen on sockets of all kinds is to rely on the daemon executor to bind and listen on the sockets and pass along their file descriptors. This allows the daemon to be stopped and restarted while its sockets remain bound and listening. It is the function of tools like inetd, launchd, systemd socket activation, s6-fdholderd, etc. podman supports this functionality for containers via systemd: https://github.com/containers/podman/blob/main/docs/tutorials/socket_activation.md#socket-activation-of-containers
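The mechanism the executor would need is small. A minimal sketch, with a plain parent process standing in for the container daemon: the parent binds and listens, then starts a worker that inherits the listening fd, so the worker can be restarted at will while the socket stays bound in the parent.

```python
import socket
import subprocess
import sys

# Executor side: bind and listen before any worker exists.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]

# Worker side: adopt the inherited listening fd and serve one connection.
worker = (
    "import socket, sys\n"
    "srv = socket.socket(fileno=int(sys.argv[1]))\n"
    "conn, _ = srv.accept()\n"
    "conn.sendall(b'activated')\n"
    "conn.close()\n"
)

fd = listener.fileno()
proc = subprocess.Popen(
    [sys.executable, "-c", worker, str(fd)],
    pass_fds=(fd,),  # keep the listening fd open (same number) across exec
)

# A client can connect even though the listening socket predates the worker;
# the connection simply waits in the backlog until the worker accepts it.
client = socket.create_connection(("127.0.0.1", port))
reply = client.recv(1024)
client.close()
proc.wait()
listener.close()
```

If the worker above were killed and relaunched in a loop, clients would never see the port closed, which is the seamless-restart property this issue asks the container daemon to provide.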
Any container runtime daemon could just as easily support this functionality on its own, and the sockets themselves can comfortably be made part of an image configuration. Here is an example of instructions that could declare such file descriptors in a Dockerfile:
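(The original example did not survive formatting here; as a hedged sketch, such instructions might look like the following, where `SOCKET` is a hypothetical instruction and not part of any existing Dockerfile syntax.)

```dockerfile
# Hypothetical syntax: declares that the image expects to be started with
# fd 3 as a tcp listener and fd 8 as a unix listener, analogous to EXPOSE.
FROM traefik
SOCKET 3/tcp
SOCKET 8/unix
```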
This documents that the container expects to receive file descriptors 3 and 8 from the host, similar to the `EXPOSE` instruction for tcp/udp ports, and that they should be sockets listening on the tcp and unix networks. Here is a corresponding service-level element in a compose.yml, where the daemon is instructed to bind and listen on these sockets on the host and pass them to the www service container as fds 3 and 8.
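(This example was also lost to formatting; a sketch of what it might look like, assuming the `sockets` key proposed in this issue and mirroring the CLI syntax shown further down:)

```yaml
services:
  www:
    image: traefik
    sockets:
      - 0.0.0.0:80:3/tcp
      - /run/www.sock:8/unix
```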
A program that supports socket activation, like traefik, can be executed seamlessly in this manner; even while its container restarts or updates, it appears to be listening on both of these sockets. A savvy daemon could scale it to zero instances, wait for either socket to receive a connection, and then activate it again. This has an added benefit in compose projects: certain depends_on and healthcheck elements become unnecessary, because the host can listen on every socket before the services that use them start. Services can then connect to these listeners as early as they want, using `host.docker.internal` for network sockets or a bind mount for named sockets, and simply wait for their connections to unblock. I believe that declarative socket file descriptors carry the same advantages as bind mounts and bridge networks for containers that listen on sockets. They could be configured via the CLI as well, like so:
```
docker run -s 0.0.0.0:80:3/tcp -s /run/www.sock:8/unix traefik ...
```
In other cases, CLI users may want to pass extra fds to a container without binding them on the host. This case can be documented with a similar instruction:
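(The instruction here is again hypothetical; one possible sketch, with `FD` as an invented name:)

```dockerfile
# Hypothetical syntax: declares fds the container expects to receive from
# the parent process as-is, without the host binding anything for them.
FD 4
FD 5
```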
and configured via the CLI as well:

```
docker run -f 4 -f 5 ...
```

to receive fds 4 and 5 from the parent process without binding them, or

```
docker run -F ...
```

to receive all the declared fds in this way. It could also be convenient to map fds with the CLI:

```
docker run -f 6:4 -f 9:5 ...
```

This passes fd 6 from the host to fd 4 in the container, and does the same for 9 to 5. It enables any docker host to enjoy the seamless restarts and reduced initialization complexity of socket activation, without relying on a particular init system. I think it follows in the spirit of https://github.com/moby/moby/issues/2658, but offers seamless restarts for containers and not just the daemon. I'd love to know what others think of this feature.