NixOS / nixops

NixOps is a tool for deploying to NixOS machines in a network or cloud.
https://nixos.org/nixops
GNU Lesser General Public License v3.0

switch-to-configuration fails on none backend #864

Open alanpearce opened 6 years ago

alanpearce commented 6 years ago

A trivial deployment (e.g. https://gist.github.com/alanpearce/ce50cee7d818efca98af9453d273d237) executes against a VirtualBox backend with no problems. With the none backend, however, it fails when updating the system configuration:

production> closures copied successfully
trivial> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 3: use: command not found
trivial> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 4: use: command not found
trivial> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 5: use: command not found
trivial> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 6: use: command not found
trivial> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 7: syntax error near unexpected token `('
trivial> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 7: `use Sys::Syslog qw(:standard :macros);'
trivial> error: Traceback (most recent call last):
  File "/nix/store/fpd9lw3pw0k4v8i68lhj3811lya58l0d-nixops-2017-05-22/lib/python2.7/site-packages/nixops/deployment.py", line 705, in worker
    raise Exception("unable to activate new configuration")
Exception: unable to activate new configuration

error: activation of 1 of 1 machines failed (namely on ‘trivial’)

cleverca22 commented 6 years ago

@alanpearce which nixpkgs? what is the contents of switch-to-configuration?

alanpearce commented 6 years ago

nixpkgs was 17.09. It also occurred when deploying from a controller running 17.09-small. Not sure if it's relevant, but I also have the unstable channel on the controller as both a nixos-unstable channel and pkgs.unstable. NixOps was 1.5.2, but I also tested 2017-05-22 and nixops-1.6pre2276_9203440.

switch-to-configuration from 1.6-pre

rbvermaa commented 6 years ago

@alanpearce Do you have any more information about the machine you are deploying to? What nixos/nixpkgs is it running?

alanpearce commented 6 years ago

Is the output from nix-info helpful enough?

system: "i686-linux", multi-user?: yes, version: nix-env (Nix) 1.11.16, channels(root): "nixos-17.09.3142.e02a9ba3670", nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs

rbvermaa commented 6 years ago

Yes, thanks.

Tomahna commented 6 years ago

I encounter the exact same issue when trying to deploy to an ARM device. It seems that NixOps does not respect nixpkgs.system, as switch-to-configuration tries to use an x86-64 perl.

$ head -1 /nix/var/nix/profiles/system/bin/switch-to-configuration
#! /nix/store/vawc9a89l53mf05yq0k1910q7dakd99w-perl-5.24.3/bin/perl -I/nix/store/cqhfrfkjbp80c2wdpd6m2k1iq99rbjpd-perl-File-Slurp-9999.19/lib/perl5/site_perl -I/nix/store/5wyryds7qkrkbf8gvjlbhj0cjnff0nln-perl-Net-DBus-1.1.0/lib/perl5/site_perl -I/nix/store/2v0liqkqmpj0vdgcr4krhbjyffp3h11b-perl-XML-Parser-2.44/lib/perl5/site_perl -I/nix/store/6myrz05ya606q7q311a9pdyg9pf6jk6c-perl-XML-Twig-3.52/lib/perl5/site_perl

$ file /nix/store/vawc9a89l53mf05yq0k1910q7dakd99w-perl-5.24.3/bin/perl
/nix/store/vawc9a89l53mf05yq0k1910q7dakd99w-perl-5.24.3/bin/perl: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/83lrbvbmxrgv7iz49mgd42yvhi473xp6-glibc-2.27/lib/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, not stripped

$ readlink /run/current-system/sw/bin/perl
/nix/store/dj9lnrflbhllqpzsx9qr869flvnfc5c6-perl-5.24.3/bin/perl

$ file /nix/store/dj9lnrflbhllqpzsx9qr869flvnfc5c6-perl-5.24.3/bin/perl
/nix/store/dj9lnrflbhllqpzsx9qr869flvnfc5c6-perl-5.24.3/bin/perl: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /nix/store/mq8raw5vz908vhl0hz5wk5bmdxnn2skz-glibc-2.27/lib/ld-linux-armhf.so.3, for GNU/Linux 2.6.32, not stripped

Is there any workaround for this?

kwohlfahrt commented 6 years ago

I have the same issue deploying to a Raspberry Pi (aarch64-linux) from x86_64-linux. Using NixOps 1.6 and the unstable channel.

grahamc commented 6 years ago

If you're deploying to a machine of a different arch than the deploy host, you need to add (example where the target host is aarch64-linux):

nixpkgs.system = "aarch64-linux";

in the machine's config. This has worked fine for me.

kwohlfahrt commented 6 years ago

I have on my deploy host two files, deployment.nix:

{
  pi = args@{config, pkgs, ...}:
    (import ./pi.nix args) // {
      deployment.targetHost = "192.168.0.16";
    };
}

and pi.nix (this is almost identical to the Pi's current /etc/nixos/configuration.nix):

{ config, pkgs, ... }:

{
  nixpkgs.system = "aarch64-linux";
  system.stateVersion = "unstable";

  boot.loader = {
    grub.enable = false;
    generic-extlinux-compatible.enable = true;
  };

  boot.kernelPackages = pkgs.linuxPackages_latest;
  # Needed for the virtual console to work on the RPi 3, as the default of 16M doesn't seem to be enough.
  boot.kernelParams = ["cma=32M"];

  # services & account config omitted
}

I am deploying to a Raspberry Pi 3.

On the deploy host:

[kai@nixos:~/Documents/code/ops]$ nixops --version
NixOps 1.6
[kai@nixos:~/Documents/code/ops]$ nixops create -d kainet deployment.nix 
created deployment ‘6c147838-9dab-11e8-aedd-52540088bcd3’
6c147838-9dab-11e8-aedd-52540088bcd3
[kai@nixos:~/Documents/code/ops]$ nixops deploy

It proceeds to copy a whole bunch of closures. Choosing one at random that looks like it might contain some binaries:

pi> copying path '/nix/store/v7hg431d55q30gy7hqlpiji3jnvi8gs3-glibc-2.27' from 'https://cache.nixos.org'...

And then on the target machine:

[root@nixos:~]# file /nix/store/v7hg431d55q30gy7hqlpiji3jnvi8gs3-glibc-2.27/lib/ld-2.27.so 
/nix/store/v7hg431d55q30gy7hqlpiji3jnvi8gs3-glibc-2.27/lib/ld-2.27.so: ELF 64-bit LSB pie executable x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=e41202884336c7bfc0df19dc80fc8dfd248441c8, not stripped

I think the above should not show x86_64. For comparison:

[root@nixos:~]# file $(readlink $(which bash))
/nix/store/9rzy0xh46cj0rlgsc11y0yf604wb83n4-bash-interactive-4.4-p23/bin/bash: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /nix/store/kyar2xljg7fydfp3wirmxm52lyk7awhc-glibc-2.27/lib/ld-linux-aarch64.so.1, for GNU/Linux 2.6.32, not stripped

At the very end, I get the following error:

kainet> closures copied successfully
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 3: use: command not found
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 4: use: command not found
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 5: use: command not found
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 6: use: command not found
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 7: use: command not found
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 8: syntax error near unexpected token `('
pi> /nix/var/nix/profiles/system/bin/switch-to-configuration: line 8: `use Sys::Syslog qw(:standard :macros);'
pi> error: Traceback (most recent call last):
  File "/nix/store/wg4cfwjw8hq0kr4q0i8svy89xm4l1899-nixops-1.6/lib/python2.7/site-packages/nixops/deployment.py", line 731, in worker
    raise Exception("unable to activate new configuration")
Exception: unable to activate new configuration

error: activation of 1 of 1 machines failed (namely on ‘pi’)

So I investigated the file:

[root@nixos:~]# head -n 1 /nix/var/nix/profiles/system/bin/switch-to-configuration
#! /nix/store/9wd0qq24kqkn0jrz1kzh293k5p869im9-perl-5.24.4/bin/perl -I/nix/store/v6kxygk5i6cd8ij5r809nlkjx80rg36r-perl-File-Slurp-9999.19/lib/perl5/site_perl -I/nix/store/nz0d6k9g88my65cvdqsbxcdkfskrfvw3-perl-Net-DBus-1.1.0/lib/perl5/site_perl -I/nix/store/pvcad82jnvmr4a03gkg9mrxmz8g26q2v-perl-XML-Parser-2.44/lib/perl5/site_perl -I/nix/store/4whv943j55cfb9yqxc9irkhmhw45vmf4-perl-XML-Twig-3.52/lib/perl5/site_perl
[root@nixos:~]# file /nix/store/9wd0qq24kqkn0jrz1kzh293k5p869im9-perl-5.24.4/bin/perl
/nix/store/9wd0qq24kqkn0jrz1kzh293k5p869im9-perl-5.24.4/bin/perl: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/v7hg431d55q30gy7hqlpiji3jnvi8gs3-glibc-2.27/lib/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, not stripped

Again it is x86_64.

Is anything wrong with my configuration that I posted, or could you give a working example I can test?

grahamc commented 6 years ago

I'm not sure, but you probably want to avoid // when merging NixOS configuration sets, and instead use the NixOS module system's import feature:

{
  pi = { ... }:
    {
      deployment.targetHost = "192.168.0.16";
      imports = [ ./pi.nix ];
    };
}
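
To illustrate the general concern with //, here is a minimal sketch in plain Nix. The attribute sets base and extra, and the file name in the comment, are made up for illustration and are not taken from the configuration above. It shows that // merges only at the top level, so a nested attribute set on the right replaces the one on the left wholesale, whereas modules listed in imports have their option definitions merged by the module system.

```nix
# Evaluate with: nix-instantiate --eval --strict shallow-merge.nix
# (the file name and all attribute values here are hypothetical)
let
  base = {
    networking = {
      hostName = "pi";
      firewall.enable = true;
    };
  };

  extra = {
    networking = {
      firewall.allowedTCPPorts = [ 22 ];
    };
  };
in
{
  # The right-hand networking set replaces the left-hand one wholesale:
  # hostName and firewall.enable from base are no longer present.
  shallow = base // extra;

  # Evaluates to true: hostName did not survive the merge.
  lostHostName = !((base // extra).networking ? hostName);
}
```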

kwohlfahrt commented 6 years ago

I get exactly the same error with that suggestion. Even pasting the config in instead of using any kind of import has the same result.

When I delete and recreate the deployment, it includes the line pi> setting state version to 18.09, which doesn't seem correct given that my stateVersion is set to 'unstable'.

Adding --debug shows the following. Should this contain my configuration?

{
  pi = { config, lib, pkgs, ... }: {
    config = {
      boot.kernelModules = [];
      networking = {
        extraHosts = "127.0.0.1 pi-encrypted\n";
        firewall.trustedInterfaces = [];
        vpnPublicKey = "ssh-ed25519 [SNIP] NixOps VPN key of pi";
      };
      system.stateVersion = ( lib.mkDefault "18.09" );
    };
    imports = [
      {
        config.users.extraUsers.root.openssh.authorizedKeys.keys = [
          "ssh-ed25519 [SNIP] NixOps client key for pi"
        ];
      }
    ];
  };
}

It looks like the configuration is completely ignored, which suggests I must be doing something very wrong? I think I've followed the steps in the manual closely, and I gave them all above so please let me know if I'm missing something.

kwohlfahrt commented 6 years ago

As advised on IRC, I set nixpkgs.crossSystem.system = "aarch64". This resulted in a whole bunch of packages being built, and the build eventually fails on libksba because it cannot find libgpg-error. I'm not sure if this is progress or not...

k4lipso commented 3 years ago

I got the same error while deploying from x86_64-linux to aarch64-linux (Raspberry Pi 3). Adding nixpkgs.system = "aarch64-linux"; to the configuration of the Raspberry Pi worked for me.

deifactor commented 3 years ago

If I do that, it falls over with the error

error: a 'aarch64-linux' with features {} is required to build '/nix/store/j9dyf8pkwx47acbds22nlpy74796vc1k-append-initrd-secrets.drv', but I am a 'x86_64-linux' with features {benchmark, big-parallel, kvm, nixos-test}

kwohlfahrt commented 3 years ago

Just to follow on from my comment (from two years ago, wow!) I've been using NixOps to manage my Raspberry Pi since then successfully. The setting I used is nixpkgs.localSystem.system = "aarch64-linux" (which the manual recommends over nixpkgs.system).
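
As a rough sketch of where that setting would sit in a NixOps network file: the machine name pi and the address are the ones used earlier in this thread, and everything else about the machine is omitted. The comment about crossSystem reflects the general behaviour of nixpkgs and is consistent with the large rebuild reported above, but it is not something verified against this exact setup.

```nix
{
  pi = { config, pkgs, ... }: {
    deployment.targetHost = "192.168.0.16";

    # Treat aarch64-linux as the native platform of this machine's package
    # set, so pre-built aarch64 binaries can be substituted from the cache.
    nixpkgs.localSystem.system = "aarch64-linux";

    # By contrast, nixpkgs.crossSystem asks for cross-compilation from the
    # deploy host's platform; such builds generally cannot be substituted
    # from cache.nixos.org and have to be built locally.
  };
}
```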

@deifactor - yes, you'll need an aarch64-linux machine to build any packages deployed to an aarch64 machine. However, mostly only configuration files need to be "built", so a Raspberry Pi can easily build its own system (everything else will be fetched from cache.nixos.org). To use this with NixOps, I added my Raspberry Pi as a remote builder for the machine I run NixOps on (though this does pose a bit of a bootstrapping problem...).
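
A minimal sketch of such a remote-builder arrangement, assuming the machine that runs NixOps is itself a NixOS system. The SSH user and the key path are assumptions chosen for illustration; the address is the one used for the Pi earlier in the thread. This would go in the deploy host's own configuration, not in the NixOps network file.

```nix
{ config, pkgs, ... }:

{
  # Allow the local Nix daemon to hand builds to other machines.
  nix.distributedBuilds = true;

  nix.buildMachines = [
    {
      hostName = "192.168.0.16";         # the Pi's address from this thread
      system = "aarch64-linux";          # platform this builder can build for
      sshUser = "root";                  # assumption: builds run as root on the Pi
      sshKey = "/root/.ssh/id_builder";  # hypothetical key readable by the Nix daemon
      maxJobs = 1;                       # keep the load on the Pi low
    }
  ];
}
```

The bootstrapping problem mentioned above remains: the Pi must already be running and reachable over SSH before it can act as a builder.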

So the overall setup is:

k4lipso commented 3 years ago

@kwohlfahrt as far as I know it is enough to enable aarch64-linux emulation on the building machine using: boot.binfmt.emulatedSystems = [ "aarch64-linux" ];

At least it works for me. So I think the remote builder is not strictly necessary.
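
For context, a sketch of where this option would live, assuming the building machine is an x86_64-linux NixOS host. It belongs in that host's own configuration and takes effect after the host is rebuilt. Emulated builds tend to be much slower than native ones.

```nix
{ config, pkgs, ... }:

{
  # Register user-mode emulation so this x86_64-linux host can run,
  # and therefore build, aarch64-linux derivations locally.
  boot.binfmt.emulatedSystems = [ "aarch64-linux" ];
}
```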

deifactor commented 3 years ago

For future reference: I wound up just switching off of NixOps because this is just a single-person deploy, fwiw. So now I just have a deploy.sh script containing

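#!/usr/bin/sh
# Resolve the directory containing this script so it can be run from anywhere.
dir=$(dirname "$0")
# Build on the target itself and switch it to the new configuration.
NIXOS_CONFIG="$dir/configuration.nix" nixos-rebuild --fast switch --target-host root@key-of-chronology.local --build-host root@key-of-chronology.local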

and I run that on my laptop to redeploy.