NixOS / nixpkgs

DynamicUser units race with nscd at system startup #328404

Open bennofs opened 1 month ago

bennofs commented 1 month ago

Describe the bug

When using DynamicUser=true for a NixOS service, the dynamic user is sometimes not found (it does not exist yet) when the unit is executed.

This leads to errors like bin/chown: invalid user: when chown is used to change ownership of files to the service's user.

Steps To Reproduce

The following flake reproduces the behaviour. (I set Before = "nscd.service" to make the race reliable; in practice it can also occur by chance when there is no ordering dependency.) On boot, test-service fails to start with:

/nix/store/php4qidg2bxzmm79vpri025bqi0fa889-coreutils-9.5/bin/id: ‘test-service’: no such user

However, after the system has booted (and nscd has started), starting test-service succeeds.

flake.nix for the VM

```nix
{
  inputs.nixpkgs.url = github:NixOS/nixpkgs/63dacb46bf939521bdc93981b4cbb7ecb58427a0;

  outputs = { self, nixpkgs }: {
    packages.x86_64-linux.default = (nixpkgs.lib.nixosSystem {
      modules = [
        ({ lib, pkgs, config, modulesPath, ... }: {
          users = {
            mutableUsers = false;
            users.root.password = "";
          };

          systemd.services.test-service = {
            enable = true;
            wantedBy = [ "multi-user.target" ];
            before = [ "nscd.service" ];
            serviceConfig = {
              Type = "oneshot";
              DynamicUser = true;
              ExecStart = "${lib.getBin pkgs.coreutils}/bin/id test-service";
            };
          };

          services.getty.autologinUser = "root";

          # default configuration from nixos-generate-config follow
          imports = [ (modulesPath + "/profiles/qemu-guest.nix") ];

          networking.hostName = "base";
          system.stateVersion = "24.05";

          # Use the systemd-boot EFI boot loader.
          boot.loader.systemd-boot.enable = true;
          boot.loader.efi.canTouchEfiVariables = true;

          boot.initrd.availableKernelModules = [ "ahci" "xhci_pci" "virtio_pci" "sr_mod" "virtio_blk" "virtio_console" ];
          boot.initrd.kernelModules = [ ];
          boot.kernelModules = [ "kvm-intel" ];
          boot.extraModulePackages = [ ];
          boot.kernelParams = [ "console=tty0" "console=ttyS0" ];

          fileSystems."/" = {
            device = "/dev/disk/by-label/root";
            fsType = "ext4";
          };

          fileSystems."/boot" = {
            device = "/dev/disk/by-label/esp";
            fsType = "vfat";
          };

          swapDevices = [ ];

          # Enables DHCP on each ethernet and wireless interface. In case of scripted networking
          # (the default) this is the recommended approach. When using systemd-networkd it's
          # still possible to use this option, but it's recommended to use it in conjunction
          # with explicit per-interface declarations with `networking.interfaces..useDHCP`.
          networking.useDHCP = lib.mkDefault true;
          networking.useNetworkd = true;

          nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
        })
      ];
    }).config.system.build.vm;
  };
}
```

Expected behavior

Services with DynamicUser = true should also work robustly at system boot.

Workaround

Add After = nss-user-lookup.target to the unit, i.e. set after = [ "nss-user-lookup.target" ] on the service in the NixOS configuration.
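
For illustration, the workaround might look roughly like this when applied to the test-service unit from the reproducer. This is a minimal sketch, not a confirmed fix: it assumes the artificial before = [ "nscd.service" ] ordering from the reproducer is dropped (keeping both orderings would presumably conflict if nscd itself is ordered before nss-user-lookup.target), and it only adds ordering, without pulling the target in.

```nix
{ lib, pkgs, ... }:
{
  systemd.services.test-service = {
    wantedBy = [ "multi-user.target" ];

    # Workaround: order the unit after the synchronization point for
    # user/group name lookups, so the dynamic user can be resolved.
    after = [ "nss-user-lookup.target" ];

    serviceConfig = {
      Type = "oneshot";
      DynamicUser = true;
      # Same check as in the reproducer: look up the dynamic user by name.
      ExecStart = "${lib.getBin pkgs.coreutils}/bin/id test-service";
    };
  };
}
```

The after list is rendered as After= in the [Unit] section of the generated unit file, which is why it sits next to wantedBy rather than inside serviceConfig.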

Notify maintainers

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.6.32, NixOS, 24.05 (Uakari), 24.05.20240531.63dacb4`
 - multi-user?: `no`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.2`
 - nixpkgs: `not found`

symphorien commented 1 month ago

See the previous discussion at https://github.com/NixOS/nixpkgs/pull/105354