awakesecurity / nix-deploy

Deploy software or an entire NixOS system configuration to another NixOS system

Erroneous error message when a service fails to start #12

Closed chris-martin closed 6 years ago

chris-martin commented 6 years ago

If I deploy and a service fails to start, nix-deploy gives me an error message that indicates that sudo failed on the target machine.

Jan 25 03:45:26 tc-webserver1 systemd[1]: tc-app-server.service: Unit entered failed state.
Jan 25 03:45:26 tc-webserver1 systemd[1]: tc-app-server.service: Failed with result 'exit-code'.
nix-deploy: user error ([x] Failed to switch typeclasses.com to /nix/store/nx3f93ll1dksdpypqik1rbrp013vclm2-nixos-system-tc-webserver1-18.03pre-git

    You need `sudo` privileges on the target machine

Original error was:

ProcFailed {procCommand = "ssh", procArguments = ["typeclasses.com","sudo","/nix/var/nix/profiles/system/bin/switch-to-configuration","switch"], procExitCode = ExitFailure 4}
)

In fact the deploy did complete, and the errors are unrelated to sudoing.

ixmatus commented 6 years ago

Yes, I understand what's going on here. ~We have fixed this in a few other tools too so when I have a minute I'll get a PR out.~

ixmatus commented 6 years ago

If the status code returned by switch-to-configuration lets us discriminate between success and failure, then we could give the user a more useful and specific message (or just treat it as success even though a service failed to start or restart).

From reading the switch-to-configuration script, I think it returns an exit code of 4 if any services fail: https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/system/activation/switch-to-configuration.pl#L461
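
To make the idea concrete, here is a rough sketch of how the exit code could drive the message. It is written in Python purely for illustration and is not nix-deploy's actual code. The names `SwitchOutcome`, `classify_switch_exit`, and `describe` are hypothetical, and the meaning of exit code 4 is the assumption from the comment above, not something verified here.

```python
from enum import Enum

# Assumption taken from this thread: switch-to-configuration exits with 4
# when the new configuration was activated but some unit failed to start.
UNITS_FAILED_EXIT_CODE = 4


class SwitchOutcome(Enum):
    SUCCESS = "success"
    UNITS_FAILED = "units_failed"
    FAILED = "failed"


def classify_switch_exit(code: int) -> SwitchOutcome:
    """Map the exit code of the remote switch command to an outcome."""
    if code == 0:
        return SwitchOutcome.SUCCESS
    if code == UNITS_FAILED_EXIT_CODE:
        return SwitchOutcome.UNITS_FAILED
    # Any other non-zero code: we cannot tell what went wrong from the code alone.
    return SwitchOutcome.FAILED


def describe(outcome: SwitchOutcome, host: str) -> str:
    """Build a user-facing message that does not blame sudo for unit failures."""
    if outcome is SwitchOutcome.SUCCESS:
        return f"Switched {host} to the new configuration."
    if outcome is SwitchOutcome.UNITS_FAILED:
        return (
            f"Switched {host} to the new configuration, "
            "but one or more services failed to start. Check the journal on the target."
        )
    return (
        f"Failed to switch {host}. "
        "One possible cause is missing `sudo` privileges on the target machine."
    )
```

With this split, the `ExitFailure 4` from the report above would produce the "services failed to start" message instead of the `sudo` hint, and the `sudo` hint is only offered as one possibility for other non-zero codes.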