NixOS / nixpkgs

Nix Packages collection & NixOS

make-disk-image.nix: `cannot initialize fsdev 'sa'` #359782

Closed yellowhat closed 1 day ago

yellowhat commented 4 days ago

Describe the bug

Hi, consider the following files:

flake.nix
```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  };

  outputs = { nixpkgs, ... }@attrs:
    let
      system = "x86_64-linux";
    in {
      nixosConfigurations = {
        qcow2 = nixpkgs.lib.nixosSystem {
          inherit system;
          modules = [ ./qcow2.nix ];
        };
      };
    };
}
```
qcow2.nix
```nix
{
  config,
  lib,
  pkgs,
  modulesPath,
  ...
}:
{
  imports = [ "${modulesPath}/profiles/qemu-guest.nix" ];

  boot = {
    loader.systemd-boot.enable = true;
    kernelPackages = pkgs.linuxPackages_6_11;
  };

  # These labels are set/expected by `make-disk-image.nix`
  fileSystems = {
    "/boot" = {
      device = "/dev/disk/by-label/ESP";
      fsType = "vfat";
    };
    "/" = {
      device = "/dev/disk/by-label/nixos";
      fsType = "ext4";
      autoResize = true;
    };
  };

  system.build.qcow2 = import "${modulesPath}/../lib/make-disk-image.nix" {
    inherit config lib pkgs;
    diskSize = 20480;
    format = "qcow2-compressed";
    partitionTableType = "efi";
    copyChannel = false;
  };
}
```

After updating to the latest nixos-unstable:

flake.lock
```json
{
  "nodes": {
    "nixpkgs": {
      "locked": {
        "lastModified": 1732521221,
        "narHash": "sha256-2ThgXBUXAE1oFsVATK1ZX9IjPcS4nKFOAjhPNKuiMn0=",
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "4633a7c72337ea8fd23a4f2ba3972865e3ec685d",
        "type": "github"
      },
      "original": {
        "owner": "NixOS",
        "ref": "nixos-unstable",
        "repo": "nixpkgs",
        "type": "github"
      }
    },
    "root": {
      "inputs": {
        "nixpkgs": "nixpkgs"
      }
    }
  },
  "root": "root",
  "version": 7
}
```

If I run:

nix build .#nixosConfigurations.qcow2.config.system.build.qcow2 --print-build-logs

I get the following error:

nixos-disk-image> WARNING: Image format was not specified for 'nixos.raw' and probing guessed raw.
nixos-disk-image>          Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
nixos-disk-image>          Specify the 'raw' format explicitly to remove the restrictions.
nixos-disk-image> qemu-kvm: -virtfs local,path=/build,security_model=none,mount_tag=sa: cannot initialize fsdev 'sa': failed to open '/build': No such file or directory
error: builder for '/nix/store/nyvlsqnawm657ddfz9kswgxdylvkk8lr-nixos-disk-image.drv' failed with exit code 1

If I revert to the previous nixos-unstable revision:

flake.lock
```json
{
  "nodes": {
    "nixpkgs": {
      "locked": {
        "lastModified": 1732014248,
        "narHash": "sha256-y/MEyuJ5oBWrWAic/14LaIr/u5E0wRVzyYsouYY3W6w=",
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "23e89b7da85c3640bbc2173fe04f4bd114342367",
        "type": "github"
      },
      "original": {
        "owner": "NixOS",
        "ref": "nixos-unstable",
        "repo": "nixpkgs",
        "type": "github"
      }
    },
    "root": {
      "inputs": {
        "nixpkgs": "nixpkgs"
      }
    }
  },
  "root": "root",
  "version": 7
}
```

It works as expected:

nixos-disk-image> WARNING: Image format was not specified for 'nixos.raw' and probing guessed raw.
nixos-disk-image>          Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
nixos-disk-image>          Specify the 'raw' format explicitly to remove the restrictions.
nixos-disk-image> cSeaBIOS (version rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org)
nixos-disk-image> iPXE (http://ipxe.org) 00:03.0 CA00 PCI2.10 PnP PMM+3EFD0CF0+3EF30CF0 CA00
nixos-disk-image> Booting from ROM...
nixos-disk-image> Probing EDD (edd=off to disable)... ocloading kernel modules...
nixos-disk-image> mounting Nix store...
nixos-disk-image> mounting host's temporary directory...
nixos-disk-image> starting stage 2 (/nix/store/mslzxrnwm7hmykqkrryyq0nj4rk45845-vm-run-stage2)
nixos-disk-image> tune2fs 1.47.1 (20-May-2024)
nixos-disk-image> Setting maximal mount count to -1
nixos-disk-image> Setting interval between checks to 0 seconds
nixos-disk-image> Setting time filesystem last checked to Thu Nov 28 08:10:35 2024
nixos-disk-image> mkfs.fat 4.2 (2021-01-31)
nixos-disk-image> setting up /etc...
nixos-disk-image> Initializing machine ID from random generator.
nixos-disk-image> Created "/boot/EFI".
nixos-disk-image> Created "/boot/EFI/systemd".
nixos-disk-image> Created "/boot/EFI/BOOT".
nixos-disk-image> Created "/boot/loader".
nixos-disk-image> Created "/boot/loader/entries".
nixos-disk-image> Created "/boot/EFI/Linux".
nixos-disk-image> Copied "/nix/store/ivqjhj99firnjq7gp14qf35821viwi5m-systemd-256.7/lib/systemd/boot/efi/systemd-bootx64.efi" to "/boot/EFI/systemd/systemd-bootx64.efi".
nixos-disk-image> Copied "/nix/store/ivqjhj99firnjq7gp14qf35821viwi5m-systemd-256.7/lib/systemd/boot/efi/systemd-bootx64.efi" to "/boot/EFI/BOOT/BOOTX64.EFI".
nixos-disk-image> ⚠️ Mount point '/boot' which backs the random seed file is world accessible, which is a security hole! ⚠️
nixos-disk-image> ⚠️ Random seed file '/boot/loader/.#bootctlrandom-seed3eccc68721bd85ee' is world accessible, which is a security hole! ⚠️
nixos-disk-image> Random seed file /boot/loader/random-seed successfully written (32 bytes).
nixos-disk-image> tune2fs 1.47.1 (20-May-2024)
nixos-disk-image> Setting maximal mount count to -1
nixos-disk-image> Setting interval between checks to 0 seconds
nixos-disk-image> Setting time filesystem last checked to Thu Nov 28 08:10:36 2024
nixos-disk-image> tune2fs 1.47.1 (20-May-2024)
nixos-disk-image> Setting time filesystem last checked to Thu Jan  1 00:00:00 1970
nixos-disk-image> [    2.623511] reboot: Power down

Metadata

$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.11.9`
 - multi-user?: `yes`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.25.2`
 - channels(root): `""`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixpkgs`

Notify maintainers

@phaer



yellowhat commented 4 days ago

Interestingly, if I run:

mkdir /build
nix build .#nixosConfigurations.qcow2.config.system.build.qcow2 --print-build-logs

It shows a different error:

nixos-disk-image> WARNING: Image format was not specified for 'nixos.raw' and probing guessed raw.
nixos-disk-image>          Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
nixos-disk-image>          Specify the 'raw' format explicitly to remove the restrictions.
nixos-disk-image> cSeaBIOS (version rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org)
nixos-disk-image> iPXE (http://ipxe.org) 00:03.0 CA00 PCI2.10 PnP PMM+3EFD0CA0+3EF30CA0 CA00
nixos-disk-image> Booting from ROM...
nixos-disk-image> Probing EDD (edd=off to disable)... ocloading kernel modules...
nixos-disk-image> mounting Nix store...
nixos-disk-image> mounting host's build directory...
nixos-disk-image> starting stage 2 (/nix/store/wmp357hsbipw1hv71q39l24aj50llvnj-vm-run-stage2)
nixos-disk-image> /nix/store/wmp357hsbipw1hv71q39l24aj[5  0llvnj-v m -0.86r9939] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
nixos-disk-image> [    0.870687] CPU: 14 PID: 1 Comm: wmp357hsbipw1hv Not tainted 6.6.63 #1-NixOS
nixos-disk-image> un[    0-.s8t7a1g2e520:]  lHinaer dware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
nixos-disk-image> [    0.872244] Call Trace:
nixos-disk-image> 3: [/ b u i l0d./8x7c2h4g4/2s]  <TASK>
nixos-disk-image> ave[   d 0.-8e7n2v7:3 5N]o   sudcump_stack_lvl+0x47/0x70
nixos-disk-image> [    0.873161]  panic+0x180/0x340
nixos-disk-image> h fi[l e   o r0 .d8i73r4e0c1t]  do_exit+0x956/0xad0
nixos-disk-image> [    0.873776]  do_group_exit+0x31/0x80
nixos-disk-image>  ry[
nixos-disk-image>    0.874060]  __x64_sys_exit_group+0x18/0x20
nixos-disk-image> [    0.874413]  do_syscall_64+0x39/0x90
nixos-disk-image> [    0.874684]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
nixos-disk-image> [    0.875083] RIP: 0033:0x7fe0598ebd1d
nixos-disk-image> [    0.875358] Code: 45 31 c0 45 31 d2 45 31 db c3 0f 1f 00 f3 0f 1e fa 48 8b 35 e5 e0 10 00 ba e7 00 00 00 eb 07 66 0f 1f 44 00 00 f4 89 d0 0f 05 <48> 3d 00 f0 ff ff 76 f3 f7 d8 64 89 06 eb ec 0f 1f 40 00 f3 0f 1e
nixos-disk-image> [    0.876676] RSP: 002b:00007ffe7e646578 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
nixos-disk-image> [    0.877246] RAX: ffffffffffffffda RBX: 00007fe0599fbfa8 RCX: 00007fe0598ebd1d
nixos-disk-image> [    0.877742] RDX: 00000000000000e7 RSI: ffffffffffffff88 RDI: 0000000000000001
nixos-disk-image> [    0.878253] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
nixos-disk-image> [    0.878741] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
nixos-disk-image> [    0.879250] R13: 0000000000000001 R14: 00007fe0599fa680 R15: 00007fe0599fbfc0
nixos-disk-image> [    0.879766]  </TASK>
nixos-disk-image> [    0.880314] Kernel Offset: 0x1b400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
nixos-disk-image> [    0.881383] Rebooting in 1 seconds..
nixos-disk-image> Virtual machine didn't produce an exit code.
error: builder for '/nix/store/nyvlsqnawm657ddfz9kswgxdylvkk8lr-nixos-disk-image.drv' failed with exit code 1

phaer commented 3 days ago

Not entirely sure why you tagged me here, @yellowhat. I am not the maintainer of that module; I just had the last commit to it at the time you submitted this issue.

Do you have any indication that your bug is related to that commit? It doesn't seem likely to me at first glance.

It looks quite unusual not to have /build inside the sandbox. Just creating it in your shell before running nix build can't help here, so the kernel panic isn't entirely unexpected, I think.

Have you already tried minimizing the example (default kernel, etc.)? If so, I am afraid bisecting might be your best option.

yellowhat commented 3 days ago

@phaer Sorry, I could not figure out who the maintainer of that module is.

Yes, the flake.nix and qcow2.nix files above make it reproducible; they are the smallest example I can think of.

phaer commented 3 days ago

@yellowhat I don't think it has one.

I also can't reproduce this. Builds fine for me with nixos unstable 23e89b7da85c3640bbc2173fe04f4bd114342367 (as in your first flake.lock).

yellowhat commented 3 days ago

Sorry, but the "broken" commit is https://github.com/NixOS/nixpkgs/commit/4633a7c72337ea8fd23a4f2ba3972865e3ec685d (currently the latest), not https://github.com/NixOS/nixpkgs/commit/23e89b7da85c3640bbc2173fe04f4bd114342367.

Could you retry?

phaer commented 3 days ago

I did, it builds here on x86_64-linux.

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/4633a7c72337ea8fd23a4f2ba3972865e3ec685d";
  };

  outputs = { nixpkgs, ... }@attrs:
    let
      system = "x86_64-linux";
    in {
      nixosConfigurations = {
        qcow2 = nixpkgs.lib.nixosSystem {
          inherit system;
          modules = [
            (
              {
                config,
                lib,
                pkgs,
                modulesPath,
                ...
              }:
              {
                imports = [ "${modulesPath}/profiles/qemu-guest.nix" ];

                boot = {
                  loader.systemd-boot.enable = true;
                  kernelPackages = pkgs.linuxPackages_6_11;
                };

                # These labels are set/expected by `make-disk-image.nix`
                fileSystems = {
                  "/boot" = {
                    device = "/dev/disk/by-label/ESP";
                    fsType = "vfat";
                  };
                  "/" = {
                    device = "/dev/disk/by-label/nixos";
                    fsType = "ext4";
                    autoResize = true;
                  };
                };

                system.build.qcow2 = import "${modulesPath}/../lib/make-disk-image.nix" {
                  inherit config lib pkgs;
                  diskSize = 20480;
                  format = "qcow2-compressed";
                  partitionTableType = "efi";
                  copyChannel = false;
                };
              }
            )
          ];
        };
      };
    };
}

yellowhat commented 3 days ago

I do not get it, I am running:

args=(
    --interactive
    --tty
    --rm
    --device /dev/kvm
    --volume "${PWD}:/data:z"
    --workdir /data
)
podman run "${args[@]}" docker.io/nixos/nix:latest

nix --extra-experimental-features "flakes nix-command" build .#nixosConfigurations.qcow2.config.system.build.qcow2 --print-build-logs

and exactly the same flake.nix that you posted.

I have been running it this way almost daily for the last 8 months.

phaer commented 3 days ago

Ah, you are running this in a docker container and passing through /dev/kvm from a, presumably non-NixOS, host? I am running this on a nixos-unstable host with the same kernel as inside the VM.

That should of course not make a difference, but I think it's notable, and you might have to bisect this to find out exactly where it started to fail. You could also check whether the same setup works with, e.g., a non-EFI image with as many default settings as possible.

Anyway, I don't think I can help here. Feel free to ping me again if it turns out that my option changes were the culprit after all, but I am unsubscribing here for now.

yellowhat commented 3 days ago

Thanks

The host itself is also nixos-unstable.

yellowhat commented 3 days ago

After running git bisect, the culprit is https://github.com/NixOS/nixpkgs/commit/97ed6b4565e76286062e6942517a71ae4c9cac72

@Ma27
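
For anyone repeating the bisect, here is a rough sketch of a step script for `git bisect run`, meant to be run from a nixpkgs checkout. It is not the exact procedure used above. `FLAKE_DIR`, `bisect_status`, `bisect_step` and the file name `bisect-step.sh` are made-up names for illustration, and the sketch assumes the flake.nix and qcow2.nix from the report live in a directory outside the checkout (`/data` in the container setup shown earlier).

```shell
#!/bin/sh
# Sketch only: helper names and FLAKE_DIR are hypothetical, not part of nixpkgs.

# Map a build exit status to what `git bisect run` expects:
# 0 marks the revision as good, 1 marks it as bad.
bisect_status() {
  if [ "$1" -eq 0 ]; then
    echo 0
  else
    echo 1
  fi
}

# Build the image against the nixpkgs revision that is currently checked out.
# Assumption: FLAKE_DIR holds flake.nix and qcow2.nix from the report.
bisect_step() {
  nix --extra-experimental-features "flakes nix-command" build \
    "${FLAKE_DIR:-/data}#nixosConfigurations.qcow2.config.system.build.qcow2" \
    --override-input nixpkgs "$PWD" \
    --no-link --print-build-logs
  build_rc=$?
  return "$(bisect_status "$build_rc")"
}

# Only run the build when invoked as `sh bisect-step.sh step`.
if [ "${1:-}" = "step" ]; then
  bisect_step
  exit $?
fi
```

With the script saved next to the checkout, the bisect itself would look roughly like `git bisect start 4633a7c72337ea8fd23a4f2ba3972865e3ec685d 23e89b7da85c3640bbc2173fe04f4bd114342367` followed by `git bisect run sh ../bisect-step.sh step`. Every non-zero build status is treated as "bad" here, which is a simplification: an unrelated build failure on some intermediate commit would be misclassified.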

Ma27 commented 2 days ago

Yeah, I see why this fails: NIX_BUILD_TOP isn't /build, and I guess I shouldn't have used /build in the first place. I will prepare a patch.
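
To illustrate the failure mode described here: when the sandbox is disabled (the Metadata section above reports `sandbox: no`), Nix does not run the builder under `/build`, so a hard-coded `/build` path does not exist on the host. A minimal sketch of the alternative, deriving the directory from `NIX_BUILD_TOP`, could look like the following. `shared_build_dir` is a hypothetical helper for illustration; it is not code from nixpkgs and not the content of the actual fix.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only; not taken from nixpkgs.
# Prints the host directory a build step should share with the VM.
shared_build_dir() {
  # Nix exports NIX_BUILD_TOP to the builder. Assumption: fall back to
  # TMPDIR, then /tmp, when the script runs outside a Nix build.
  printf '%s\n' "${NIX_BUILD_TOP:-${TMPDIR:-/tmp}}"
}

dir=$(shared_build_dir)
if [ -d "$dir" ]; then
  echo "would share: $dir"
else
  # This is the situation from the report: the expected directory is missing.
  echo "missing directory: $dir" >&2
fi
```

The point of the sketch is only that the path is looked up at build time instead of being assumed, so it stays valid whether or not the build runs inside the sandbox.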

Ma27 commented 2 days ago

Actually, https://github.com/NixOS/nixpkgs/pull/360413 should be a potential fix.

yellowhat commented 2 days ago

Using:

{
  inputs = {
    nixpkgs.url = "github:wolfgangwalther/nixpkgs/structured-attrs-run-in-vm";
  };
...

fixes the error.

Thanks