hercules-ci / arion

Run docker-compose with help from Nix/NixOS
Apache License 2.0

Services don't cleanly stop on system shutdown #250

Open lilyhahn opened 2 months ago

lilyhahn commented 2 months ago

Hi,

I have a docker-compose set up as a system service. It's part of a module that is imported into my main configuration.nix.

The relevant configuration looks like this:

virtualisation.arion = {
  backend = "docker";
  projects.myservice = {
    serviceName = "myservice";
    settings.services."myservice".service = {
      image = "myservice:latest";
      restart = "unless-stopped";
    };
  };
};

When I shut down the machine with shutdown -h now, systemd appears to stop the service successfully, but the final stage of shutdown then hangs, logging that it's waiting for node (the main process inside the container) to exit.

If I run systemctl stop myservice first and then shut down, everything stops cleanly.

I've found this issue about Nix containers, which appears similar (https://github.com/NixOS/nixpkgs/issues/109695). The suggested workaround there is to add a script that stops the services, but I already see systemd running the stop jobs when I shut down. For some reason the stop jobs give a different result than when I run them manually before shutdown.

Any suggestions for how to solve this issue would be appreciated!

roberth commented 2 months ago

Hi @lilyhahn, I've done a bit of testing in #251, although I didn't get around to testing with restart = "unless-stopped", or making the container try to stick around. That PR also has an alternate implementation of the NixOS service, so maybe that's something you could try out, to see if arion down behaves better than interrupting the un-detached arion up it used to have?

lilyhahn commented 2 months ago

Hi @roberth! Thank you for taking a look!

I tried it with restart = "unless-stopped" removed, in case that was the issue.

Apologies for what is probably a dumb question; I am new to NixOS.

I have Arion imported in my configuration like this:

environment.systemPackages = [
  pkgs.arion
];

imports = [ ((builtins.fetchTarball "https://github.com/hercules-ci/arion/archive/7f6c58f.tar.gz") + "/nixos-module.nix") ];

If I change the commit in the imports, is that enough to install from the PR? Or do I need to change something in the environment.systemPackages as well?

I updated my configuration to the hash above for the nixos-module import, and I got the same behavior.

roberth commented 2 months ago

The general idea is alright, but it looks like you've picked the commit that only added tests. If you haven't tried it, try with 0437b5f9a455b5b02c433b818289011d7ed3d3ef? environment.systemPackages should be fine as is, because this is all about the NixOS integration and not so much the arion program or compose file generation.
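For reference, a minimal sketch of what that change might look like, reusing the import from the earlier comment with only the commit swapped. It assumes GitHub serves an archive for the full commit hash at the same URL pattern as for the short one. Since no sha256 is given, this fetch is not pinned by content.

```nix
{ pkgs, ... }:
{
  # Unchanged: the arion CLI still comes from nixpkgs.
  environment.systemPackages = [
    pkgs.arion
  ];

  # Only the commit in the archive URL changes (assumed URL pattern,
  # same as the one already used for 7f6c58f).
  imports = [
    ((builtins.fetchTarball "https://github.com/hercules-ci/arion/archive/0437b5f9a455b5b02c433b818289011d7ed3d3ef.tar.gz") + "/nixos-module.nix")
  ];
}
```

After a nixos-rebuild, the systemd unit for the project should come from the module at that commit, while the arion command on the PATH stays the nixpkgs one.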

I don't think I've produced a representative test case, so I wouldn't be surprised if the linked commit doesn't help. Maybe the containers aren't shutting down correctly because they're blocking on something that's not really available anymore during shutdown? I guess you might find that by strace logging to a persistent file.