just-containers / s6-overlay

s6 overlay for containers (includes execline, s6-linux-utils & a custom init)

How can I use another PID 1 solution to call s6-svscan? #522

Closed endersonmaia closed 1 year ago

endersonmaia commented 1 year ago

My container already works with s6-overlay, but I'm testing the deployment at fly.io, and their VM solution will get my Docker image and inject their own /init.

I need a way to set the ENTRYPOINT to another command, such as s6-svscan, so that it gets called.

I tried renaming /init to /s6-init and defining ENTRYPOINT=[/s6-init], but it failed, probably because it'll only run if it's pid=1.

Is there a way to disable this pid=1 check?

Or is there a way to call another s6 tool that will be managed by another pid=1 supervisor?

I found this https://skarnet.org/software/s6/s6-svscan-not-1.html but s6-overlay has no s6-scanboot executable, and trying to call s6-svscan /etc/s6-overlay/s6-rc.d didn't work either.

skarnet commented 1 year ago

You can indeed run an s6 supervision tree where s6-svscan is not process 1, but that is not s6-overlay. s6-overlay, by design, runs s6-svscan as pid 1, and is organized around that for container startup and shutdown management; pid!=1 is not and cannot be supported. Sorry.

This should not be an issue, because container providers should let you run the init system of your choice. If fly.io, or any other solution, imposes their own init process on you, then they're not providing you with full container functionality.

At the very least, if they need to own pid 1, they should provide equivalent functionality to s6-overlay. If they aren't, what are they doing with the money you give them?

endersonmaia commented 1 year ago

Thanks for the fast reply.

They use Firecracker to run a container isolated inside a VM with its own kernel, so it's not 100% a container.

They probably leverage the container image to build a rootfs and start the VM.

I'll contact them and ask for a solution.

On Thu, Apr 6, 2023 at 18:33, Laurent Bercot @.***> wrote:

You can indeed run an s6 supervision tree where s6-svscan is not process 1, but that is not s6-overlay. s6-overlay, by design, runs s6-svscan as pid 1, and is organized around that for container startup and shutdown management; pid!=1 is not and cannot be supported. Sorry.

This should not be an issue, because container providers should let you run the init system of your choice. If fly.io, or any other solution, imposes their own init process on you, then they're not providing you with full container functionality.

At the very least, if they need to own pid 1, they should provide equivalent functionality to s6-overlay. If they aren't, what are they doing with the money you give them?


endersonmaia commented 1 year ago

I'm using this workaround,

It's working for now.

skarnet commented 1 year ago

Glad you could make it work, even if it's very much a hack (you're basically creating a container inside a VM made from a container image).

The real issue here is that you're trying to run container software on something that is not a container. Fly.io uses Docker images to create their VMs, but it cannot be perfect, because the way you run a VM is a bit different from the way you run a container - and that's why they need to provide their own pid 1, because they have to do more stuff at boot time than an init made for containers would let them do.

Unfortunately there's no clean solution here, apart from running an s6 supervision tree without the s6-overlay layer.
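For the route without s6-overlay, here is a minimal sketch of what s6-svscan expects as its argument: a scan directory whose subdirectories are service directories, each with an executable `run` script. That is a different layout from the s6-rc source definitions under /etc/s6-overlay/s6-rc.d, which is likely why pointing s6-svscan at that path did not work. The names `make_scandir`, `myservice`, and `/run/service` are hypothetical, chosen only for illustration.

```sh
#!/bin/sh
# Sketch: build a minimal s6 scan directory by hand (no s6-overlay involved).
# "make_scandir" and "myservice" are hypothetical names, not part of s6.
make_scandir() {
    scandir=$1
    mkdir -p "$scandir/myservice" || return 1
    # Each service directory needs an executable ./run script, which the
    # supervisor starts and restarts whenever it exits.
    printf '%s\n' '#!/bin/sh' 'exec sleep 3600' > "$scandir/myservice/run"
    chmod +x "$scandir/myservice/run"
}

# Usage (assumes s6 is installed; s6-svscan does not have to be pid 1 here):
#   make_scandir /run/service && exec s6-svscan /run/service
```

This gives up what s6-overlay adds on top (oneshots, dependencies, container shutdown handling), so it only covers plain longrun supervision.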

endersonmaia commented 1 year ago

(you're basically creating a container inside a VM made from a container image).

When you put it like that, it looks more hacky, but a container is just a process inside a namespace, right? :D

Unfortunately there's no clean solution here, apart from running an s6 supervision tree without the s6-overlay layer.

I went down that path, but preferred to use s6-overlay.

I still have a lot to learn about it, and I have use for a lot of its features, like oneshots, dependencies, logs...

Thanks for all of this!

endersonmaia commented 1 year ago

/label question solved

endersonmaia commented 4 months ago

For the record, I'm using a distroless container image that has no shell, only execlineb.

So I'd like to convert this shell script to an execline script.

#!/bin/sh
# from: https://community.fly.io/t/is-it-possible-to-use-my-own-init/12082/4
# run /init with PID 1, creating a new PID namespace if necessary
if [ "$$" -eq 1 ]; then
    # we already have PID 1
    exec /init "$@"
else
    # create a new PID namespace
    exec unshare --pid sh -c '
        # set up /proc and start the real init in the background
        unshare --mount-proc /init "$@" &
        child="$!"
        # forward signals to the real init
        trap "kill -INT \$child" INT
        trap "kill -TERM \$child" TERM
        # wait until the real init exits
        # ("wait" returns early on signals; "kill -0" checks if the process exists)
        until wait "$child" || ! kill -0 "$child" 2>/dev/null; do :; done
    ' sh "$@"
fi

I know execline has exec, kill, getpid, wait, trap, if*.

I could wrap this until logic inside a loopwhilex.

The missing parts would be unshare, whose binary I'll need to provide, and an equivalent of $! to get the child's pid.

Any help is appreciated here.

PS: I'm using the same issue 'cause I think it's easier to keep the context, instead of opening another issue.

PS1: Also, not using the mailing list, since I think it's easier for others (my future self included) to search/find this here.

skarnet commented 4 months ago

unshare comes from the util-linux package, which you could add to your image.

As for the script, you could try something like this (untested, but it should be a good starting point):

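#!/command/execlineb -S0

# already pid 1: exec into /init directly
ifelse { getpid -E PID eltest ${PID} -eq 1 } { /init $@ }
# as in the shell script, unshare the pid namespace first, so that the
# child spawned by trap becomes pid 1 in it (still untested)
unshare --pid
# with -x, trap forwards the signals it receives to its child
trap -x { }
unshare --mount-proc
/init $@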

Most of the logic in the shell script you quoted is included in execline's trap, so using it should simplify things.
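To make it concrete which part of the shell script trap takes over, here is that logic isolated as a plain shell function: start a child in the background, remember its pid via `$!`, forward INT and TERM, and wait until it is really gone. `run_forwarding` is a hypothetical helper name, and returning the child's exit status is an assumption about what is wanted here (the quoted script discards it).

```sh
#!/bin/sh
# Sketch of the "background child + forward signals + wait" pattern.
# run_forwarding is a hypothetical name, not part of s6 or execline.
run_forwarding() {
    "$@" &
    child=$!
    # forward signals to the child
    trap 'kill -INT "$child" 2>/dev/null' INT
    trap 'kill -TERM "$child" 2>/dev/null' TERM
    while :; do
        # "wait" returns early when a trapped signal arrives, so loop
        wait "$child" && status=0 || status=$?
        # "kill -0" only checks whether the child still exists
        kill -0 "$child" 2>/dev/null || break
    done
    trap - INT TERM
    return "$status"
}
```

For example, `run_forwarding unshare --mount-proc /init "$@"` would play the role of the inner part of the quoted script.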