lxc / lxc-ci

LXC continuous integration and build scripts
https://jenkins.linuxcontainers.org
Apache License 2.0

A better solution to the Void Linux shutdown problem #408

Open amak79 opened 3 years ago

amak79 commented 3 years ago

PR https://github.com/lxc/lxc-ci/pull/183 attempts to fix the Void Linux shutdown problem, but any changes to /etc/runit/1 won't survive an update to the runit-void package.

A better solution is to change the permissions of /etc/runit/stopit in /etc/rc.local, which is sourced during runit stage 2 and can be used to specify configuration to be done prior to login.

I've added the following line to /etc/rc.local to set the permissions for /etc/runit/stopit.

chmod 100 /run/runit/stopit

dontlaugh commented 1 year ago

Perhaps this can be handled by configuring lxc stop to send a different signal.

Please see this solution: https://github.com/lxc/lxd/issues/5592#issuecomment-1339984790

I tested it. It stopped the container right away. I need to confirm that it was truly a graceful shutdown, but given that the halt signal is how graceful shutdowns should be requested, it seems promising.

amak79 commented 1 year ago

From the runit-init man page:

       init 6 tells the Unix process no 1 to shutdown and reboot the system.
              To signal runit(8) the system reboot request, runit-init sets
              the execute by owner permission of the files /etc/runit/reboot
              and /etc/runit/stopit (chmod 100). Then a CONT signal is sent to
              runit(8).

Sending SIGCONT requires changing the permissions for /etc/runit/stopit to 100. I have lxc.signal.halt=SIGCONT plus chmod 100 /run/runit/stopit in /etc/rc.local and it has worked for me.
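For anyone wanting to try the container-side half of this, here is a minimal sketch of what the /etc/rc.local fragment could look like. The enable_stopit function name and the STOPIT override variable are made up for illustration (they are not part of runit or the image); the only real operation is the chmod 100 described above.

```shell
#!/bin/sh
# Sketch of an /etc/rc.local fragment; not the image's actual file.
# STOPIT is a hypothetical override so the snippet can be tried
# outside a real Void container.
STOPIT="${STOPIT:-/run/runit/stopit}"

enable_stopit() {
    # Set execute-by-owner (mode 100) on the given file so that
    # runit acts on a later SIGCONT. Do nothing if the file is absent.
    if [ -e "$1" ]; then
        chmod 100 "$1"
    fi
}

enable_stopit "$STOPIT"
```

The host-side half is the lxc.signal.halt=SIGCONT setting mentioned above, which has to be applied to the container's configuration separately.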

stgraber commented 1 year ago

So a proper fix would be an init script (not rc.local) that, when it detects it is running in a container, sets /run/runit/stopit to mode 100?

Can someone familiar with voidlinux send a PR that does that? (needs an update to voidlinux.yaml)
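A rough sketch of what such a script might do, under stated assumptions: it treats a container= entry in PID 1's environment (which LXC sets for the container's init) as the detection signal, and the in_container helper plus the ENVIRON and STOPIT variables are hypothetical names. How void-runit itself detects containers, and where the script would be installed, is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given NUL-separated environ file
# (same format as /proc/1/environ) contains a "container=" entry.
in_container() {
    [ -r "$1" ] && tr '\0' '\n' < "$1" | grep -q '^container='
}

# Both paths are overridable for experimentation (assumed names).
ENVIRON="${ENVIRON:-/proc/1/environ}"
STOPIT="${STOPIT:-/run/runit/stopit}"

# Only touch stopit when a container is detected and the file exists.
if in_container "$ENVIRON" && [ -e "$STOPIT" ]; then
    chmod 100 "$STOPIT"
fi
```

Checking the environment of PID 1 rather than the script's own environment is a design choice: it avoids depending on what the init stages happen to export.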

sbromberger commented 10 months ago

In newer images, /etc/runit/1 has been changed to include

install -m100 /dev/null /run/runit/stopit

In older images, it is

install -m000 /dev/null /run/runit/stopit

I don't know when that change was made but the current official void repo still has it as 000: https://github.com/void-linux/void-runit/blob/32893ea380f756534c919d2b4d2c47cd9242eaa0/1#L27
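To see the difference between the two variants without touching a real /run/runit, they can be reproduced in a scratch directory. This is only an illustration; make_stopit is a made-up wrapper name, and stat -c assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical wrapper around the install invocation quoted above.
# $1 = octal mode, $2 = target path. Copying /dev/null yields an
# empty file that is created with exactly the requested mode.
make_stopit() {
    install -m"$1" /dev/null "$2"
}

dir=$(mktemp -d)
make_stopit 100 "$dir/stopit.new"   # newer images: execute-by-owner set
make_stopit 000 "$dir/stopit.old"   # older images: no permission bits

# Print the resulting octal modes next to the file names.
stat -c '%a %n' "$dir/stopit.new" "$dir/stopit.old"
rm -rf "$dir"
```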

sbromberger commented 10 months ago

Here's the fundamental problem with the yaml right now: The modification in https://github.com/lxc/lxc-ci/blob/3f99602ad396722461d974f7e32a4e3374d1e005/images/voidlinux.yaml#L234 will be overwritten whenever the runit-void package gets updated. The recommendation from the void folks is to put this in /etc/rc.local.

sbromberger commented 9 months ago

I have proposed https://github.com/void-linux/void-runit/pull/113 as a step towards a solution here. Once that PR is merged, we will need to make the necessary changes to the void YAML here to create a local core service that will change the permissions on /run/runit/stopit.

duskmoss commented 2 months ago

Adding a note for voidlinux/musl.

It looks like adding chmod 100 /run/runit/stopit and running incus config set test raw.lxc="lxc.signal.halt=SIGCONT" works as described for voidlinux images. SIGINT also semi-works, as sbromberger described in void-runit/pull/50.

However, neither works at all on voidlinux/musl images; stop requests just hang. I might investigate further later, but thought I should note it here for anyone else trying this.

dontlaugh commented 2 months ago

New related PR upstream https://github.com/void-linux/void-runit/pull/122

Vaelatern commented 1 month ago

Thank you @sbromberger, the work is merged.