nix-community / NixOS-WSL

NixOS on WSL(2) [maintainer=@nzbr]
Apache License 2.0

systemd-tmpfiles-setup-dev.service: Failed to set up credentials: Protocol error #185

Closed GuillaumeDesforges closed 1 year ago

GuillaumeDesforges commented 1 year ago

Bug description

Rebuilding my configuration does not fail, but systemd is not happy.

$ sudo nixos-rebuild switch --flake path:.
building the system configuration...
activating the configuration...
Copying /usr/share/applications
Copying /usr/share/icons
setting up /etc...
setting up /bin...
reloading user units for gdforj...
setting up tmpfiles
warning: the following units failed: systemd-sysctl.service, systemd-tmpfiles-setup-dev.service

× systemd-sysctl.service - Apply Kernel Variables
     Loaded: loaded (/etc/systemd/system/systemd-sysctl.service; enabled; preset: enabled)
    Drop-In: /nix/store/6kph7z01rwc96sxnl0a5w3n5xzl3syq6-system-units/systemd-sysctl.service.d
             └─overrides.conf
     Active: failed (Result: exit-code) since Mon 2022-12-19 18:27:43 CET; 309ms ago
   Duration: 3min 23.960s
       Docs: man:systemd-sysctl.service(8)
             man:sysctl.d(5)
    Process: 27705 ExecStart=/nix/store/9rjdvhq7hnzwwhib8na2gmllsrh671xg-systemd-252.1/lib/systemd/systemd-sysctl (code=exited, status=243/CREDENTIALS)
   Main PID: 27705 (code=exited, status=243/CREDENTIALS)
         IP: 0B in, 0B out

Dec 19 18:27:43 kaguya systemd[1]: Starting Apply Kernel Variables...
Dec 19 18:27:43 kaguya systemd[27705]: systemd-sysctl.service: Failed to set up credentials: Protocol error
Dec 19 18:27:43 kaguya systemd[27705]: systemd-sysctl.service: Failed at step CREDENTIALS spawning /nix/store/9rjdvhq7hnzwwhib8na2gmllsrh671xg-systemd-252.1/lib/systemd/systemd-sysctl: Protocol error
Dec 19 18:27:43 kaguya systemd[1]: systemd-sysctl.service: Main process exited, code=exited, status=243/CREDENTIALS
Dec 19 18:27:43 kaguya systemd[1]: systemd-sysctl.service: Failed with result 'exit-code'.
Dec 19 18:27:43 kaguya systemd[1]: Failed to start Apply Kernel Variables.

× systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev
     Loaded: loaded (/etc/systemd/system/systemd-tmpfiles-setup-dev.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-12-19 18:27:43 CET; 310ms ago
   Duration: 3min 23.961s
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
    Process: 27707 ExecStart=systemd-tmpfiles --prefix=/dev --create --boot (code=exited, status=243/CREDENTIALS)
   Main PID: 27707 (code=exited, status=243/CREDENTIALS)
         IP: 0B in, 0B out

Dec 19 18:27:43 kaguya systemd[1]: Starting Create Static Device Nodes in /dev...
Dec 19 18:27:43 kaguya systemd[27707]: systemd-tmpfiles-setup-dev.service: Failed to set up credentials: Protocol error
Dec 19 18:27:43 kaguya systemd[27707]: systemd-tmpfiles-setup-dev.service: Failed at step CREDENTIALS spawning systemd-tmpfiles: Protocol error
Dec 19 18:27:43 kaguya systemd[1]: systemd-tmpfiles-setup-dev.service: Main process exited, code=exited, status=243/CREDENTIALS
Dec 19 18:27:43 kaguya systemd[1]: systemd-tmpfiles-setup-dev.service: Failed with result 'exit-code'.
Dec 19 18:27:43 kaguya systemd[1]: Failed to start Create Static Device Nodes in /dev.
warning: error(s) occurred while switching to the new configuration

To Reproduce

I used this configuration:

{ inputs, config, pkgs, ... }:

{
  imports = [
    inputs.nixos-wsl.nixosModules.wsl
    inputs.home-manager.nixosModules.home-manager
    ../../users/gdforj/configuration.nix
  ];

  # hostname
  networking.hostName = "kaguya";

  # WSL
  wsl = {
    enable = true;
    wslConf.automount.root = "/mnt";
    defaultUser = "gdforj";
    startMenuLaunchers = true;

    # Enable integration with Docker Desktop (needs to be installed)
    docker-desktop.enable = true;
  };

  # set Nix config
  nix = {
    registry.nixpkgs.flake = inputs.nixpkgs;
    package = pkgs.nixUnstable;
    extraOptions = ''
      experimental-features = nix-command flakes
    '';
    settings.trusted-users = ["gdforj"];
  };
  nixpkgs.config = {
    allowUnfree = true;
    cudaSupport = true;
  };

  # lang/region/keymap
  console = {
    font = "Lat2-Terminus16";
    keyMap = "fr";
  };
  i18n = {
    defaultLocale = "en_US.UTF-8";
  };
  time.timeZone = "Europe/Paris";

  # TODO refactor
  home-manager.users.gdforj.home.sessionVariables = {
    DISPLAY=":0";
    GDK_DPI_SCALE="1.5";
  };

  system.stateVersion = "22.11";
}

with inputs:

Inputs:
├───agenix: github:ryantm/agenix/a630400067c6d03c9b3e0455347dc8559db14288
│   └───nixpkgs: github:NixOS/nixpkgs/4428e23312933a196724da2df7ab78eb5e67a88e
├───flake-utils: github:numtide/flake-utils/5aed5285a952e0b949eb3ba02c12fa4fcfef535f
├───home-manager: github:rycee/home-manager/e7eba9cc46547ae86642ad3c6a9a4fb22c07bc26
│   ├───nixpkgs follows input 'nixpkgs'
│   └───utils: github:numtide/flake-utils/5aed5285a952e0b949eb3ba02c12fa4fcfef535f
├───nixos-wsl: github:nix-community/NixOS-WSL/fab2833c091e059fd75e0c2cd570279500e76351
│   ├───flake-compat: github:edolstra/flake-compat/009399224d5e398d03b22badca40a37ac85412a1
│   ├───flake-utils: github:numtide/flake-utils/5aed5285a952e0b949eb3ba02c12fa4fcfef535f
│   └───nixpkgs follows input 'nixpkgs'
└───nixpkgs: github:nixos/nixpkgs/3ff39f984faa5f528f7ac5e548110d4e20327aa1

GuillaumeDesforges commented 1 year ago

Seems related to https://github.com/microsoft/WSL/issues/9158

GuillaumeDesforges commented 1 year ago

This seems to be the main issue: https://github.com/microsoft/WSL/issues/9158#issuecomment-1324180686

/tmp/.X11-unix is now read-only

GuillaumeDesforges commented 1 year ago

I'm getting even more errors now

warning: the following units failed: systemd-sysctl.service, systemd-tmpfiles-setup-dev.service, systemd-tmpfiles-setup.service

× systemd-sysctl.service - Apply Kernel Variables
     Loaded: loaded (/etc/systemd/system/systemd-sysctl.service; enabled; preset: enabled)
    Drop-In: /nix/store/xhm370kb6avj18160b3a81ssg78chwha-system-units/systemd-sysctl.service.d
             └─overrides.conf
     Active: failed (Result: exit-code) since Tue 2022-12-20 22:04:15 CET; 546ms ago
       Docs: man:systemd-sysctl.service(8)
             man:sysctl.d(5)
    Process: 19732 ExecStart=/nix/store/9rjdvhq7hnzwwhib8na2gmllsrh671xg-systemd-252.1/lib/systemd/systemd-sysctl (code=exited, status=243/CREDENTIALS)
   Main PID: 19732 (code=exited, status=243/CREDENTIALS)
         IP: 0B in, 0B out

Dec 20 22:04:15 kaguya systemd[1]: Starting Apply Kernel Variables...
Dec 20 22:04:15 kaguya systemd[19732]: systemd-sysctl.service: Failed to set up credentials: Protocol error
Dec 20 22:04:15 kaguya systemd[19732]: systemd-sysctl.service: Failed at step CREDENTIALS spawning /nix/store/9rjdvhq7hnzwwhib8na2gmllsrh671xg-systemd-252.1/lib/systemd/systemd-sysctl: Protocol error
Dec 20 22:04:15 kaguya systemd[1]: systemd-sysctl.service: Main process exited, code=exited, status=243/CREDENTIALS
Dec 20 22:04:15 kaguya systemd[1]: systemd-sysctl.service: Failed with result 'exit-code'.
Dec 20 22:04:15 kaguya systemd[1]: Failed to start Apply Kernel Variables.

× systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev
     Loaded: loaded (/etc/systemd/system/systemd-tmpfiles-setup-dev.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2022-12-20 22:04:15 CET; 548ms ago
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
    Process: 19735 ExecStart=systemd-tmpfiles --prefix=/dev --create --boot (code=exited, status=243/CREDENTIALS)
   Main PID: 19735 (code=exited, status=243/CREDENTIALS)
         IP: 0B in, 0B out

Dec 20 22:04:15 kaguya systemd[1]: Starting Create Static Device Nodes in /dev...
Dec 20 22:04:15 kaguya systemd[19735]: systemd-tmpfiles-setup-dev.service: Failed to set up credentials: Protocol error
Dec 20 22:04:15 kaguya systemd[19735]: systemd-tmpfiles-setup-dev.service: Failed at step CREDENTIALS spawning systemd-tmpfiles: Protocol error
Dec 20 22:04:15 kaguya systemd[1]: systemd-tmpfiles-setup-dev.service: Main process exited, code=exited, status=243/CREDENTIALS
Dec 20 22:04:15 kaguya systemd[1]: systemd-tmpfiles-setup-dev.service: Failed with result 'exit-code'.
Dec 20 22:04:15 kaguya systemd[1]: Failed to start Create Static Device Nodes in /dev.

× systemd-tmpfiles-setup.service - Create Volatile Files and Directories
     Loaded: loaded (/etc/systemd/system/systemd-tmpfiles-setup.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2022-12-20 22:04:15 CET; 549ms ago
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
    Process: 19737 ExecStart=systemd-tmpfiles --create --remove --boot --exclude-prefix=/dev (code=exited, status=243/CREDENTIALS)
   Main PID: 19737 (code=exited, status=243/CREDENTIALS)
         IP: 0B in, 0B out

Dec 20 22:04:15 kaguya systemd[1]: Starting Create Volatile Files and Directories...
Dec 20 22:04:15 kaguya systemd[19737]: systemd-tmpfiles-setup.service: Failed to set up credentials: Protocol error
Dec 20 22:04:15 kaguya systemd[19737]: systemd-tmpfiles-setup.service: Failed at step CREDENTIALS spawning systemd-tmpfiles: Protocol error
Dec 20 22:04:15 kaguya systemd[1]: systemd-tmpfiles-setup.service: Main process exited, code=exited, status=243/CREDENTIALS
Dec 20 22:04:15 kaguya systemd[1]: systemd-tmpfiles-setup.service: Failed with result 'exit-code'.
Dec 20 22:04:15 kaguya systemd[1]: Failed to start Create Volatile Files and Directories.
warning: error(s) occurred while switching to the new configuration

GuillaumeDesforges commented 1 year ago

I fixed it using the fix from https://github.com/microsoft/WSL/issues/8996#issuecomment-1326840801

In my configuration module, I inserted

  systemd.services.nixs-wsl-systemd-fix = {
    description = "Fix the /dev/shm symlink to be a mount";
    unitConfig = {
      DefaultDependencies = "no";
      Before = [ "sysinit.target" "systemd-tmpfiles-setup-dev.service" "systemd-tmpfiles-setup.service" "systemd-sysctl.service" ];
      ConditionPathExists = "/dev/shm";
      ConditionPathIsSymbolicLink = "/dev/shm";
      ConditionPathIsMountPoint = "/run/shm";
    };
    serviceConfig = {
      Type = "oneshot";
      ExecStart = [
        "${pkgs.coreutils-full}/bin/rm /dev/shm"
        "/run/wrappers/bin/mount --bind -o X-mount.mkdir /run/shm /dev/shm"
      ];
    };
    wantedBy = [ "sysinit.target" ];
  };

EDIT: applied this fix https://github.com/nix-community/NixOS-WSL/issues/185#issuecomment-1367666676

GuillaumeDesforges commented 1 year ago

... or we could keep it open and see if we need to make a PR?

SuperSandro2000 commented 1 year ago

#178 fixed this for native systemd.

@K900

K900 commented 1 year ago

We may want to do the same thing in syschdemd, but honestly I'd prefer we just throw it out and tell people to run native at this point. AFAIK the only remaining issue that affects native and not syschdemd is the environment variable thing.

GuillaumeDesforges commented 1 year ago

Sorry I lack context, do I need to use some other options to avoid this issue?

K900 commented 1 year ago

You may want to try wsl.nativeSystemd = true; if you're on a recent enough WSL version.

wrvsrx commented 1 year ago

I fixed it using the fix from microsoft/WSL#8996 (comment)

In my configuration module, I inserted

  systemd.services.nixs-wsl-systemd-fix = {
    description = "Fix the /dev/shm symlink to be a mount";
    unitConfig = {
      DefaultDependencies = "no";
      Before = "sysinit.target";
      ConditionPathExists = "/dev/shm";
      ConditionPathIsSymbolicLink = "/dev/shm";
      ConditionPathIsMountPoint = "/run/shm";
    };
    serviceConfig = {
      Type = "oneshot";
      ExecStart = [
        "${pkgs.coreutils-full}/bin/rm /dev/shm"
        "/run/wrappers/bin/mount --bind -o X-mount.mkdir /run/shm /dev/shm"
      ];
    };
    wantedBy = [ "sysinit.target" ];
  };

Before = "sysinit.target"; should be changed to Before = [ "sysinit.target" "systemd-tmpfiles-setup-dev.service" "systemd-tmpfiles-setup.service" "systemd-sysctl.service" ];. Otherwise this service might run after the services listed in Before, which causes them to fail to start.

KoviRobi commented 1 year ago

You may want to try wsl.nativeSystemd = true; if you're on a recent enough WSL version.

I'm having this problem with native systemd too:

❯ nixos-wsl-version --json
{
  "release": "DEV_BUILD",
  "rev": "9c0955ffd9501688bf9a5ca94e8405e5de59608f",
  "systemd": "native"
}

(Note 9c0955 is https://github.com/KoviRobi/NixOS-WSL/tree/rob)

❯ wsl.exe --version
WSL version: 1.0.3.0
Kernel version: 5.15.79.1
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19044.2486

The problem seems to be that, for me, /dev/shm is a directory when WSL2 starts up. I think that is because I haven't got the update that #199 talks about, so I had to run wsl.exe --update --pre-release, as the 1.1.0 from https://github.com/nix-community/NixOS-WSL/issues/179#issuecomment-1400736911 seemed to be a pre-release: https://github.com/microsoft/WSL/releases/tag/1.1.0

K900 commented 1 year ago

The current nixos-wsl master should do the right thing on both the stable release of WSL and the preview one.

KoviRobi commented 1 year ago

For me this condition was preventing the shim from working:

https://github.com/nix-community/NixOS-WSL/blob/14273c185311868e8525a0af3b5c93503e0f53e8/scripts/native-systemd-shim/src/main.rs#L46-L53

because /dev/shm was a directory, not a symlink. But I guess there is no easy way to tell whether a directory is a mount or not? In any case, WSL 1.0.3 worked when I added || metadata.is_dir()

Also, the code above it is never called when /dev/shm is a directory, because of this check:

https://github.com/nix-community/NixOS-WSL/blob/14273c185311868e8525a0af3b5c93503e0f53e8/scripts/native-systemd-shim/src/main.rs#L18-L22
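
To make the symlink-versus-directory distinction concrete, here is a minimal Rust sketch. It is not the shim's actual code (that lives at the links above); `PathKind` and `classify` are hypothetical names introduced only for illustration. It relies on `std::fs::symlink_metadata`, which does not follow symlinks, so a symlinked /dev/shm and a real directory are reported differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical classification of a path such as /dev/shm (illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum PathKind {
    Missing,
    Symlink,
    Directory,
    Other,
}

/// Inspect `path` without following symlinks.
pub fn classify(path: &Path) -> io::Result<PathKind> {
    match fs::symlink_metadata(path) {
        Ok(meta) => {
            let file_type = meta.file_type();
            Ok(if file_type.is_symlink() {
                PathKind::Symlink
            } else if file_type.is_dir() {
                PathKind::Directory
            } else {
                PathKind::Other
            })
        }
        // A missing path is a normal outcome here, not an error.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(PathKind::Missing),
        Err(e) => Err(e),
    }
}
```

A caller could then handle `PathKind::Symlink` and `PathKind::Directory` as separate cases instead of bailing out as soon as the path is not a symlink.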

K900 commented 1 year ago

I guess the check needs to be flipped somehow, to check if /dev/shm is a mount or not. Would you be interested in trying to fix this? If not, I'll probably get to it tomorrow-ish.

KoviRobi commented 1 year ago

Yeah, I will give it a try, but I'm not sure how to check whether something is a mount or not: the nix::mount module only seems to have functions for mounting, not for checking whether something is a mount. From a quick google, I haven't seen a wrapper around https://docs.rs/libc/latest/libc/fn.getmntent.html so I might have to try making an iterator using the *mntent functions (and contribute it back 💚)

K900 commented 1 year ago

You can use https://doc.rust-lang.org/std/os/unix/fs/trait.MetadataExt.html#tymethod.dev - compare it between the directory itself and a temporary file created in it.
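
A rough sketch of a device-ID-based mount check, for illustration only. Note that it swaps in a slightly different comparison from the one suggested above: it compares the directory with its parent directory, which is a common heuristic, rather than with a temporary file created inside it. `is_mount_point` is a hypothetical helper name, not something taken from the shim, and the heuristic misses bind mounts whose source lives on the same filesystem as the parent.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Heuristic: a directory is treated as a mount point if it lives on a
/// different device than its parent, or if it is its own parent (the root).
/// Hypothetical helper for illustration; not the shim's implementation.
pub fn is_mount_point(path: &Path) -> io::Result<bool> {
    // Do not follow symlinks: a symlinked /dev/shm is not itself a mount.
    let meta = fs::symlink_metadata(path)?;
    if !meta.is_dir() {
        return Ok(false);
    }
    // "<path>/.." resolves to the parent directory (or to "/" for the root).
    let parent_meta = fs::metadata(path.join(".."))?;
    Ok(meta.dev() != parent_meta.dev() || meta.ino() == parent_meta.ino())
}
```

With this kind of check, a plain directory at /dev/shm (same device as /dev) and a tmpfs mounted there (different device) can be told apart without parsing the mount table.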

KoviRobi commented 1 year ago

I've had success with https://github.com/nix-community/NixOS-WSL/compare/main...KoviRobi:native-systemd-shim-check-mounts which uses https://github.com/nix-rust/nix/compare/master...KoviRobi:nix-rust:add-linux-mntent-libc-0.2.133

I'll work on getting that properly merged into nix-rust/nix tomorrow

K900 commented 1 year ago

Should be fixed in recent WSL releases.