Mic92 / sops-nix

Atomic secret provisioning for NixOS based on sops
MIT License

Secrets not showing up in /run/secrets after reboot #149

Open DanielFabian opened 2 years ago

DanielFabian commented 2 years ago

I just followed the tutorial and tried setting up Wi-Fi for my NixOS install using sops-nix, and it seems to work just fine when I run nixos-rebuild switch.

However, when I reboot, for some reason the /run/secrets directory isn't created. Doing another nixos-rebuild switch (without any changes at all) fixes it again until I reboot.

https://github.com/DanielFabian/.dotfiles/commit/230b4ecf2b017252926f680b7091a94f8634da6f

Any idea what I might be doing wrong? Or is it perhaps a bug that the secrets are not set up during boot?

Mic92 commented 2 years ago

Can you get boot logs? Is the home directory that holds your key on a different partition that is not yet mounted during boot?

0x4A6F commented 2 years ago

I had my secrets located in a different ZFS dataset (!=rootfs) and thus also experienced this behaviour. The ZFS dataset (mounted in 'Local File Systems' systemd target) was mounted after stage-2-init: setting up secrets....

After moving my sops secrets to my rootfs it does set up secrets during boot.

DanielFabian commented 2 years ago

ah, interesting, I'll look at it. And to answer that, yes, totally. In ZFS I have tons of datasets. In particular my user's home directory is a dataset.

Mic92 commented 2 years ago

Ok. Unfortunately we cannot handle this use case right now, since we need to run before any additional mounts triggered by zfs or by systemd mount units are loaded.

DanielFabian commented 2 years ago

so which file exactly do I have to move then? Is it the encrypted sops file, or what?

Mic92 commented 2 years ago

You need to move your age key: https://github.com/DanielFabian/.dotfiles/blob/230b4ecf2b017252926f680b7091a94f8634da6f/system/configuration.nix#L22

DanielFabian commented 2 years ago

thanks, I'll give it a go.

DanielFabian commented 2 years ago

thanks, this worked. In the end, I just use the host's SSH key and it's fine.
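
For anyone landing here, a minimal sketch of what using the host's SSH key can look like. It assumes the host key lives at the default OpenSSH location on the root filesystem; the secrets file path and the secret name wifi are hypothetical, and the host's public key still has to be listed as a recipient in .sops.yaml (for example after converting it with ssh-to-age):

```nix
{ ... }:
{
  # Hypothetical location of the encrypted secrets file in the repository.
  sops.defaultSopsFile = ./secrets.yaml;

  # Derive the age identity from the host's ed25519 SSH key. /etc/ssh is
  # assumed to be on the root filesystem, so it is readable when the
  # activation script runs early in boot.
  sops.age.sshKeyPaths = [ "/etc/ssh/ssh_host_ed25519_key" ];

  # Hypothetical secret; it ends up under /run/secrets/wifi.
  sops.secrets.wifi = { };
}
```

The point of this layout is that nothing needed for decryption sits on a dataset that is mounted after stage 2 activation.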

Mic92 commented 2 years ago

I think we should mention this in documentation at least.

drestrepom commented 1 year ago

I have the same problem. I am trying to add an SSH key in the shell init, but it does not work after a reboot:

/run/secrets/ssh_key: No such file or directory

  programs.zsh = {
    enable = true;
    autosuggestions.enable = true;
    interactiveShellInit = ''
      export DIRENV_WARN_TIMEOUT=1h
      source <(direnv hook zsh)
      ssh-add ${config.sops.secrets.ssh_key.path}
    '';
  };

This is the commit with all changes to use sops-nix https://github.com/drestrepom/env/commit/17aa1b33307d1ad548e05ef957250509b84160f9

Mic92 commented 1 year ago

There should be an error from the activation phase visible when running nixos-rebuild switch.

drestrepom commented 1 year ago

There should be an error from the activation phase visible when running nixos-rebuild switch.

sudo nixos-rebuild switch --flake .#

warning: Git tree '/home/nixos/env' is dirty
building the system configuration...
warning: Git tree '/home/nixos/env' is dirty
updating GRUB 2 menu...
Warning: os-prober will be executed to detect other bootable partitions.
Its output will be used to detect bootable binaries on them and create new boot entries.
lsblk: /dev/mapper/no*[0-9]: not a block device
lsblk: /dev/mapper/block*[0-9]: not a block device
lsblk: /dev/mapper/devices*[0-9]: not a block device
lsblk: /dev/mapper/found*[0-9]: not a block device
activating the configuration...
setting up /etc...
reloading user units for nixos...
setting up tmpfiles

After executing sudo nixos-rebuild switch --flake .# my key is correctly added:

Identity added: /run/secrets/ssh_key (/run/secrets/ssh_key)

This is the file with the secrets configuration: default.nix

zarelit commented 1 year ago

I think this is related to #24, isn't it? The key is on some filesystem that is not yet available in stage 2 and maybe can't be marked neededForBoot.

dpc commented 1 year ago

Hi. I'm hitting this issue in AWS when using KMS. The sops activation script fails at boot, but seems to work if nixos-rebuild is re-run later on. I'm not sure why. I'm using an EC2 "instance profile" to give the instance permissions to access KMS.

If I understand correctly, these permissions are available (and fetched by the aws cli) from the link-local virtual metadata server, so I guess at the point where sops is running, networking (even link-local?) might not be available? Just guessing.

Apr 21 17:11:29 localhost stage-2-init: /nix/store/fhn9wc7vsb2id3kka1ym38vfs47vzvsh-sops-install-secrets-0.0.1/bin/sops-install-secrets: Failed to decrypt '/nix/store/r2jfa0mndlc7a79xlra34wbidjknp9i8-secrets.yaml': Error getting data key: 0 successful groups required, got 0

If I'm reading this right, this happens barely after / is re-mounted.

Mic92 commented 1 year ago

AWS KMS won't work correctly unless you enable networking in the initrd; see https://github.com/Mic92/sops-nix/issues/24. The proper solution would be to do the decryption in a systemd service that depends on the network, but this would also make things more complex for all non-KMS users. So far I have either used vault (https://github.com/numtide/systemd-vaultd) in those situations or just decrypted the sops files without sops-nix in a service:

{ lib, pkgs, ... }:
{
  systemd.services.decrypt-sops = {
    description = "Decrypt sops secrets";
    wantedBy = [ "multi-user.target" ];
    after = [ "network.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
      # Retry in case the network is not ready yet.
      Restart = "on-failure";
      RestartSec = "2s";
    };
    script = ''
      umask 077
      ${lib.getExe pkgs.sops} -d ${sops/key} > /run/keys/key
    '';
  };
  # Services that consume the secret wait for the decryption to succeed.
  systemd.services.some-service = {
    after = [ "decrypt-sops.service" ];
    requires = [ "decrypt-sops.service" ];
  };
}

dpc commented 1 year ago

So far I have worked around it with:

  systemd.services.npcnix-force-rebuild-sops-hack = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      ExecStart = ''
        /run/current-system/activate
      '';
      Type = "oneshot";
      Restart = "on-failure"; # because oneshot
      RestartSec = "10s";
    };
  };

but calling something smaller than the whole activation script would be better.

BTW, maybe sops-nix could expose this as a custom module, and then people who need it can just add it to imports = [ ... ];.

Also, exposing a command to call (a wrapper around ${lib.getExe pkgs.sops} -d ${sops/key} > /run/keys/key, I guess) would help people who need to put secret decryption in some non-trivial place.

TLATER commented 1 year ago

Ok. Unfortunately we cannot handle this use case right now, since we need to run before any additional mounts triggered by zfs or by systemd mount units are loaded.

Why is this the case? Just some silly ordering specific to how NixOS mounts filesystems, or something more annoying?

I'd like to use this on a system with impermanence, which does some bind-mounting between different partitions/{btr,z}fs volumes to make persistent data not disappear during reboots. This limitation is a bit awkward for that, and I'm sure many other use cases.

Shados commented 1 year ago

I'd like to use this on a system with impermanence, which does some bind-mounting between different partitions/{btr,z}fs volumes to make persistent data not disappear during reboots. This limitation is a bit awkward for that, and I'm sure many other use cases.

You'll need to ensure that the persistent filesystems/datasets/whatevers, including the one(s) containing the sops key(s) used for decryption, have neededForBoot set so that they're mounted early enough. Aside from that, impermanence runs using activation scripts* and doesn't depend on mount units, so all that's possibly-needed there is to ensure that sops' activation script runs after impermanence's. Enjoy:

{ config, lib, ... }:
let
  inherit (lib) filterAttrs mkIf;
  regularSecrets = filterAttrs (n: v: !v.neededForUsers) config.sops.secrets;
in
{
  # Ensure non-users-secrets from sops are only initialised *after*
  # impermanence's persistence module has linked files into place, otherwise we
  # likely do not have the decryption key (which is most-frequently the ssh
  # host key).
  config = mkIf (regularSecrets != {} && config.environment.persistence != {}) {
    system.activationScripts.setupSecrets.deps = [ "persist-files" ];
  };
}

*: It additionally uses per-file service units, but it does initial setup of the links/mounts via an activation script, so should be fine.
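
As a rough sketch of the neededForBoot part, assuming a ZFS dataset mounted at /persist that holds the age key (the mount point, dataset name, and key path are all hypothetical):

```nix
{ ... }:
{
  # Hypothetical persistent dataset. neededForBoot asks NixOS to mount it
  # in the initrd, so it is already present when stage 2 activation runs.
  fileSystems."/persist" = {
    device = "rpool/persist";
    fsType = "zfs";
    neededForBoot = true;
  };

  # Hypothetical key location on that dataset.
  sops.age.keyFile = "/persist/var/lib/sops-nix/key.txt";
}
```

Pointing sops directly at the persistent path like this avoids relying on a bind mount or symlink being in place before the secrets are decrypted.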

mkurkov commented 1 year ago

Just ran into this issue. We are using AWS KMS for keys, with instance roles for access. We had to depend on the network-online target; it seems plain network is still too early for loading the AWS role credentials. We decided to reuse setupSecrets from the activation scripts, as it is simple and allows us to use all sops-nix features. It would be great to have a native option for using a service instead of an activation script. Here is the code:

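{ config, ... }:
{
  systemd.services.decrypt-sops = {
    description = "Decrypt sops secrets";
    wantedBy = [ "multi-user.target" ];
    # Ordering alone does not pull the target in, so also want it.
    wants = [ "network-online.target" ];
    after = [ "network-online.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
      # Retry in case the network is not ready yet.
      Restart = "on-failure";
      RestartSec = "2s";
    };
    # Reuse the activation script that sops-nix generates.
    script = config.system.activationScripts.setupSecrets.text;
  };
}

torgeir commented 1 year ago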

It would be helpful if this was part of the documentation!

Fixed a similar issue to that of @DanielFabian by moving the age key out of /etc/nixos (which I had symlinked from /home/torgeir/nixos for ease of sudo) to /etc/ and setting

sops.age.keyFile = "/etc/nix-sops-smb.key";

birkb commented 9 months ago

@mkurkov config.system.activationScripts.setupSecrets.text does not work for me. nixos-rebuild complains about an unknown setupSecrets attribute. If I use

ExecStart = ''
  /run/current-system/activate
'';

as shown by dpc, it works.

mkurkov commented 9 months ago

@birkb I guess you are using the neededForUsers flag for your secrets. In that case they are activated in the config.system.activationScripts.setupSecretsForUsers script instead of setupSecrets. That flag is a workaround for secrets that must be created before users are, so I'm not sure how it works in your case, since here we are trying to postpone secret activation until later, after the network is available. Maybe you don't need the neededForUsers flag at all.
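
For context, a minimal sketch of how the neededForUsers flag is typically set, following the pattern in the sops-nix README; the secret name my-password and the user alice are hypothetical:

```nix
{ config, ... }:
{
  # Decrypt this secret before users are created. Such secrets are
  # placed under /run/secrets-for-users instead of /run/secrets.
  sops.secrets.my-password.neededForUsers = true;

  # Hypothetical user whose password hash comes from that secret.
  users.users.alice = {
    isNormalUser = true;
    hashedPasswordFile = config.sops.secrets.my-password.path;
  };
}
```

Because these secrets have to exist before user creation, they are a poor fit for a service that delays decryption until the network is up.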