nix-community / impermanence

Modules to help you handle persistent state on systems with ephemeral root storage [maintainer=@talyz]
MIT License

Impermanence issues with SSH when provisioning with nixos-anywhere + disko + flakes #192

Open visualphoenix opened 3 months ago

visualphoenix commented 3 months ago

I encountered an issue when trying to use impermanence to mount SSH host keys in NixOS while provisioning a new host with nixos-anywhere + flakes + disko.

I believe the problem is that the SSH host keys are generated at boot time. The nixos-anywhere provisioning process fails with the following output:

### Installing NixOS ###
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added 'xxx.xxx.xxx.xxx' (ED25519) to the list of known hosts.
installing the boot loader...
...
...
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nix/store/1lksf0kkffcnw5l8ryq5imai8pdlpy13-bza6dmx1w5c0xrvs1m7704ijnzqcrsfi-systemd-boot", line 394, in <module>
    main()
  File "/nix/store/1lksf0kkffcnw5l8ryq5imai8pdlpy13-bza6dmx1w5c0xrvs1m7704ijnzqcrsfi-systemd-boot", line 377, in main
    install_bootloader(args)
  File "/nix/store/1lksf0kkffcnw5l8ryq5imai8pdlpy13-bza6dmx1w5c0xrvs1m7704ijnzqcrsfi-systemd-boot", line 267, in install_bootloader
    machine_id = subprocess.run(
                 ^^^^^^^^^^^^^^^
  File "/nix/store/7hnr99nxrd2aw6lghybqdmkckq60j6l9-python3-3.11.9/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/nix/store/gv0jmdv734pdxg6ilb4kq2np2fxxkr39-systemd-255.6/bin/systemd-machine-id-setup', '--print']' returned non-zero exit status 1.
installation finished!
umount: /mnt/boot unmounted
umount: /mnt/nix unmounted
umount: /mnt unmounted
### Waiting for the machine to become reachable again ###
ssh: connect to host xxx.xxx.xxx.xxx port 22: Connection refused
### Done! ###

As a workaround, I tried making the entire /etc/ssh directory persistent. This allows nixos-anywhere to provision the system successfully. However, on the first boot, the sshd_config file from the Nix store is not present in the persistent /etc/ssh directory.
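
For reference, a minimal sketch of the workaround described above, persisting the whole /etc/ssh directory. The persistent storage path "/persist/system" is an assumption (this comment does not state it), and the fragment assumes the impermanence NixOS module is already imported:

```nix
{
  # Assumption: "/persist/system" is the persistent mount point; adjust to your layout.
  environment.persistence."/persist/system" = {
    directories = [
      # Persist the entire directory rather than individual host key files.
      "/etc/ssh"
    ];
  };
}
```

Impermanence bind-mounts persisted directories from the persistent location, so /etc/ssh starts out with whatever that location contains.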

A reproducible MVP is here: https://github.com/visualphoenix/nixos-anywhere-disko-impermanence-mvp

Expected behavior:

Actual behavior:

Please let me know if you need any further information or clarification regarding this issue.

Doosty commented 2 months ago

I could not install a NixOS system with nixos-anywhere + disko unless I gave it empty lists of files and directories to persist (environment.persistence."/persist/system".files and .directories both set to []). Even afterwards, when running nixos-rebuild on the installed system, I kept getting errors when trying to add /etc/machine-id or /etc/ssh/keyfiles... to the persistence module.

It seems the files must not exist when the persistence module tries to bind them from /persist to /. I solved this for the SSH keys by deleting them before rebuilding, but I could not solve it for /etc/machine-id that way, because it always gets recreated before the impermanence module runs. It would be nice to be able to install or rebuild when the files already exist and get a warning instead of a critical error.

At this point I'm not sure which files are mandatory to persist and which aren't; this is something I think the docs could explain better. I'll see how well the system holds up in the coming days.
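
A rough sketch of the two configuration stages described in this comment, assuming the impermanence NixOS module is imported. The host key file name is an illustrative assumption (the default OpenSSH name); the comment itself elides the exact key paths:

```nix
{
  environment.persistence."/persist/system" = {
    # Stage 1 (install via nixos-anywhere + disko): both lists left empty.
    directories = [ ];
    files = [
      # Stage 2 (later nixos-rebuild): entries like these reportedly caused
      # errors while the target files already existed on the root filesystem.
      # "/etc/machine-id"
      # "/etc/ssh/ssh_host_ed25519_key"  # assumed name, not taken from the comment
    ];
  };
}
```

Uncommenting the stage 2 entries reproduces the kind of configuration that failed to rebuild here; it is not a confirmed fix.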