containerd / nerdctl

contaiNERD CTL - Docker-compatible CLI for containerd, with support for Compose, Rootless, eStargz, OCIcrypt, IPFS, ...
Apache License 2.0

Add --preserve-fds N to nerdctl run #3534

Open MayCXC opened 1 week ago

MayCXC commented 1 week ago

What is the problem you're trying to solve

container runtimes support passing additional file descriptors from the parent process into containers, which has at least two nice use cases:

which enhances security and allows for seamless upgrades.

Describe the solution you'd like

nerdctl run takes a --preserve-fds N argument that specifies how many "extra" fds to pass to the container after stdin/stdout/stderr. podman also supports systemd's socket activation $LISTEN_FDS environment variable, which I do not recommend adding separate support for in nerdctl. A simple shell script can read the socket activation variables and supply the appropriate --preserve-fds argument to nerdctl if desired.
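The wrapper idea can be sketched in a few lines. The issue suggests a shell script; the version below is a Go equivalent, and it only builds the argument list rather than running anything, because --preserve-fds is the flag proposed in this issue and does not exist in nerdctl yet. The function name preserveFDsArgs is made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// preserveFDsArgs turns the value of $LISTEN_FDS into extra nerdctl arguments.
// ASSUMPTION: --preserve-fds is the flag proposed in this issue, not an
// existing nerdctl option.
func preserveFDsArgs(listenFDs string) ([]string, error) {
	if listenFDs == "" {
		// Not socket-activated: nothing extra to pass.
		return nil, nil
	}
	n, err := strconv.Atoi(listenFDs)
	if err != nil || n < 0 {
		return nil, fmt.Errorf("invalid LISTEN_FDS value %q", listenFDs)
	}
	// systemd hands over its sockets starting at fd 3, which is also where
	// the "extra" fds counted by --preserve-fds begin.
	return []string{"--preserve-fds", strconv.Itoa(n)}, nil
}

func main() {
	extra, err := preserveFDsArgs(os.Getenv("LISTEN_FDS"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print the command line instead of executing it, since the flag is
	// hypothetical at this point.
	argv := append([]string{"nerdctl", "run"}, extra...)
	argv = append(argv, os.Args[1:]...)
	fmt.Println(argv)
}
```

A real wrapper would replace itself with nerdctl (exec in a shell script) so that the inherited descriptors stay open in the process that nerdctl runs in.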

Additional context

support is already in runc, and podman run has this argument: https://github.com/containers/podman/pull/6625

I am happy to contribute this feature if there is interest :)

AkihiroSuda commented 1 week ago

I am happy to contribute this feature if there is interest :)

Thanks, SGTM

The CLI syntax should follow Podman

eriksjolund commented 1 day ago

This would be cool. To implement this you would need to use SCM_RIGHTS, no?

Quote from man 7 unix

       SCM_RIGHTS
              Send or receive a set of open file descriptors from
              another process.  The data portion contains an integer
              array of the file descriptors.

https://man7.org/linux/man-pages/man7/unix.7.html
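For reference, here is a minimal Linux-only sketch of what sending a descriptor with SCM_RIGHTS looks like in Go, using only the standard syscall package. It passes the read end of a pipe across a socketpair inside a single process, purely to show the mechanics. The helper names sendFD, recvFD and roundTrip are made up for illustration, and nothing here reflects how nerdctl or containerd would actually carry the descriptor.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD sends one open descriptor over the Unix socket sock as SCM_RIGHTS
// ancillary data, together with a single byte of ordinary data.
func sendFD(sock, fd int) error {
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFD receives one descriptor from the Unix socket sock.
func recvFD(sock int) (int, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 4-byte fd
	_, oobn, _, _, err := syscall.Recvmsg(sock, buf, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 fd, got %d", len(fds))
	}
	return fds[0], nil
}

// roundTrip passes the read end of a pipe across a socketpair and then reads
// msg back through the received descriptor.
func roundTrip(msg string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	if err := sendFD(pair[0], int(r.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	// The received fd is a new descriptor that refers to the same pipe.
	received := os.NewFile(uintptr(fd), "received-pipe")
	defer received.Close()

	if _, err := w.WriteString(msg); err != nil {
		return "", err
	}
	out := make([]byte, len(msg))
	if _, err := io.ReadFull(received, out); err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	got, err := roundTrip("hello over a passed fd")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(got)
}
```

Between two real processes the sender and receiver would sit on opposite ends of a connected Unix domain socket instead of a socketpair, but the Sendmsg and Recvmsg calls would stay the same.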

So the architecture would look something like this in the case of systemd socket activation?

nerdctl (possible architecture)

stateDiagram-v2
    [*] --> systemd: first client connects
    state "shell script wrapper" as s5
    systemd --> s5: socket inherited via fork/exec
    s5 --> nerdctl: socket inherited via fork/exec
    state "OCI runtime" as s2
    nerdctl --> containerd: socket sent with SCM_RIGHTS
    containerd --> s2: socket inherited via fork/exec
    s2 --> container: socket inherited via exec

podman (current architecture)

stateDiagram-v2
    [*] --> systemd: first client connects
    systemd --> podman: socket inherited via fork/exec
    state "OCI runtime" as s2
    podman --> conmon: socket inherited via double fork/exec
    conmon --> s2: socket inherited via fork/exec
    s2 --> container: socket inherited via exec

Diagram from https://github.com/containers/podman/blob/main/docs/tutorials/socket_activation.md#socket-activation-of-containers

MayCXC commented 1 day ago

This would be cool. To implement this you would need to use SCM_RIGHTS, no?


I don't believe so. SCM_RIGHTS is for transferring fds from one process to another over a Unix domain socket, and I do not think that is what runc does; that would look more like --take-fd 6:3 /path/to/unix/socket. --preserve-fds just passes fds that are already open in the nerdctl process to runc. For example, in a shell script you would use exec nerdctl --preserve-fds x to pass x fds that you had already opened with exec or similar. For nerdctl, I think this can be done with just Cmd.ExtraFiles.
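As a rough illustration of the Cmd.ExtraFiles mechanism mentioned above, the sketch below hands the read end of a pipe to a child process, where it shows up as fd 3. The child is a plain sh -c 'cat <&3' standing in for any program that expects a pre-opened descriptor, and runWithExtraFD is a made-up helper name. Note that ExtraFiles only reaches a process that the caller starts itself.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runWithExtraFD starts a child with the read end of a pipe as its fd 3 and
// returns whatever the child read from that descriptor.
func runWithExtraFD(msg string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	// Stand-in child: copies everything readable on fd 3 to its stdout.
	cmd := exec.Command("sh", "-c", "cat <&3")
	// ExtraFiles[i] becomes fd 3+i in the child, after stdin/stdout/stderr.
	cmd.ExtraFiles = []*os.File{r}

	// Write the message and close the write end before the child runs, so
	// the child sees the data followed by EOF.
	if _, err := w.WriteString(msg); err != nil {
		w.Close()
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}

	out, err := cmd.Output()
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	got, err := runWithExtraFD("inherited via ExtraFiles\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(got)
}
```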

If you want to pass fds between processes over a socket with SCM_RIGHTS, you can use a do-one-thing tool like s6-fdholderd, but for the systemd socket units in your architecture diagram, you would just need nerdctl --preserve-fds $LISTEN_FDS in the service unit.

eriksjolund commented 18 hours ago

I don't quite follow how it would work without passing the socket file descriptor from nerdctl to containerd. runc is executed by containerd (not by nerdctl). Maybe I haven't understood the overall architecture of nerdctl, containerd, runc.

MayCXC commented 18 hours ago

I don't quite follow how it would work without passing the socket file descriptor from nerdctl to containerd. runc is executed by containerd (not by nerdctl). Maybe I haven't understood the overall architecture of nerdctl, containerd, runc.

oh, you're 100% right, I should have taken a closer look at that mermaid. SCM_RIGHTS is definitely the way to pass fds from nerdctl to containerd.