NixOS / nixops

NixOps is a tool for deploying to NixOS machines in a network or cloud.
https://nixos.org/nixops
GNU Lesser General Public License v3.0

container backend: start on boot #655

Open mguentner opened 7 years ago

mguentner commented 7 years ago

The container backend does not expose an option to start the container at boot.

In NixOS, declarative containers set this with:

containers.foobar.autoStart = true;

Ma27 commented 5 years ago

The problem here is that the container backend uses the imperative nixos-container script rather than declarative containers.

I tried to patch the backend to implement this feature. The current diff in my local nixops checkout looks like this:

diff --git a/nix/container.nix b/nix/container.nix
index b6de2cc..5b907f1 100644
--- a/nix/container.nix
+++ b/nix/container.nix
@@ -27,6 +27,8 @@ in
       '';
     };

+    deployment.container.autoStart = mkEnableOption "automatic container startup";
+
   };

   config = mkIf (config.deployment.targetEnv == "container") {
diff --git a/nixops/backends/container.py b/nixops/backends/container.py
index f54eaa4..ce9e177 100644
--- a/nixops/backends/container.py
+++ b/nixops/backends/container.py
@@ -16,6 +16,7 @@ class ContainerDefinition(MachineDefinition):
         MachineDefinition.__init__(self, xml, config)
         x = xml.find("attrs/attr[@name='container']/attrs")
         assert x is not None
+        self.auto_start = x.find("attr[@name='autoStart']/bool").get("value") == 'true'
         self.host = x.find("attr[@name='host']/string").get("value")

 class ContainerState(MachineState):
@@ -27,6 +28,7 @@ class ContainerState(MachineState):

     state = nixops.util.attr_property("state", MachineState.MISSING, int)  # override
     private_ipv4 = nixops.util.attr_property("privateIpv4", None)
+    auto_start = nixops.util.attr_property("container.autoStart", False, bool)
     host = nixops.util.attr_property("container.host", None)
     client_private_key = nixops.util.attr_property("container.clientPrivateKey", None)
     client_public_key = nixops.util.attr_property("container.clientPublicKey", None)
@@ -144,10 +146,17 @@ class ContainerState(MachineState):
             self.log("creating container...")
             self.host = defn.host
             self.copy_closure_to(path)
+
+            auto_start = "--auto-start %s" % ("1" if defn.auto_start else "0")
             self.vm_id = self.host_ssh.run_command(
-                "nixos-container create {0} --ensure-unique-name --system-path '{1}'"
-                .format(self.name[:7], path), capture_stdout=True).rstrip()
+                "nixos-container create {0} --ensure-unique-name {1} --system-path '{2}'"
+                .format(self.name[:7], auto_start, path), capture_stdout=True).rstrip()
             self.state = self.STOPPED
+        else:
+            self.host_ssh.run_command("nixos-container update {0} --config auto-start {1}".format(
+                self.name[:7],
+                "1" if self.auto_start else "0"
+            ))

         if self.state == self.STOPPED:
             self.host_ssh.run_command("nixos-container start {0}".format(self.vm_id))
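
The two pieces of the diff that handle the flag can be sketched in isolation. This is a minimal Python 3 sketch, not the backend's actual code: `parse_auto_start` and `build_create_command` are hypothetical helper names, and the sketch assumes `--auto-start` is a plain switch (as in the `nixos-container create test --auto-start` invocation quoted in this comment), whereas the diff passes it a `1`/`0` value. Which form the script expects should be checked against nixos-container.pl.

```python
import shlex
import xml.etree.ElementTree as ET


def parse_auto_start(container_attrs):
    """Return the autoStart flag, defaulting to False when the attribute is absent.

    The lookup in the diff calls .get() directly on the result of find(),
    which raises AttributeError if the attribute is missing from the XML.
    """
    node = container_attrs.find("attr[@name='autoStart']/bool")
    return node is not None and node.get("value") == "true"


def build_create_command(name, system_path, auto_start):
    """Assemble a `nixos-container create` command line as a single string."""
    args = ["nixos-container", "create", name[:7], "--ensure-unique-name"]
    if auto_start:
        # Assumption: --auto-start takes no value; verify against nixos-container.pl.
        args.append("--auto-start")
    args += ["--system-path", system_path]
    # Quote each argument so paths with spaces survive the remote shell.
    return " ".join(shlex.quote(arg) for arg in args)
```

Building the command from a list and quoting each element avoids the manual `'{1}'` quoting in the format string, which would break on a path that contains a single quote.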

Unfortunately, I'm afraid the --auto-start feature in nixos-container.pl doesn't work properly: a local container doesn't start on boot when created with the following command (tested with nixpkgs at 7190a0b696b1813a46bf5d2eaac6104700ff248d, which is a recent 19.03):

nixos-container create test --auto-start --config "" 

The easiest workaround right now is to deploy a systemd unit to every host that runs containers created by the nixops container backend. The unit starts all of them automatically:

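{ pkgs, ... }:

{
  systemd.services."start-all-containers" = {
    # Pulled in whenever network.target is started.
    wantedBy = [ "network.target" ];
    description = "Start all NixOS containers on this host";
    path = [ pkgs.nixos-container pkgs.findutils ];

    # List every container on the host and start each one in turn.
    script = ''
      nixos-container list | xargs -I % nixos-container start %
    '';

    serviceConfig = {
      # Run the script once; keep the unit marked active after it exits.
      Type = "oneshot";
      RemainAfterExit = true;
    };
  };
}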
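
The unit's script can also be written out step by step, which makes it easy to report which containers failed to start. The following is a rough Python 3.7+ sketch under a few assumptions: `nixos-container` is on `PATH`, `nixos-container list` prints one container name per line, and `start_all_containers` with its injectable `run` parameter is a hypothetical helper, not part of nixops.

```python
import subprocess


def start_all_containers(run=subprocess.run):
    """Start every container reported by `nixos-container list`.

    Returns the names of the containers whose start command exited non-zero.
    """
    listing = run(
        ["nixos-container", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    failed = []
    for name in listing.stdout.split():
        # Keep going after a failure, as xargs does in the shell version.
        result = run(["nixos-container", "start", name])
        if result.returncode != 0:
            failed.append(name)
    return failed
```

Passing `run` as a parameter keeps the function testable without a NixOS host; in the unit itself, it would simply be called with the default.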

I talked to @fpletz about the problem yesterday. He recommended refactoring the code to use systemd nspawn units, which would also give us proper machinectl support (currently only active containers are listed there). However, I'm not sure yet whether we want to use this in the NixOS containers module as well, since that could cause incompatibilities between a recent nixops and an older nixpkgs.

Another option would be to use the containers module for declarative containers, but I'm not sure how much work that would be either. On the other hand, it would let us use several features that don't seem to work with imperative containers right now, such as IPv6 support.

I'm not sure whether I can find the time to implement one of these options. Until then, I recommend that anyone with similar issues use a systemd unit like the one I posted above.