NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

nix-store --repair-path fails on ZFS on EIO #8121

Open remexre opened 1 year ago

remexre commented 1 year ago

Describe the bug

ZFS can detect hardware-failure-related corruption of disk blocks. In situations where it cannot automatically recover the contents of the file, read()s from the file will return EIO.

nix-store --repair-path bails out when it gets these EIOs, and it's not clear whether it's possible to repair the files on a live NixOS system, since /nix/store appears read-only to... users' systemd slices? To sudo under a graphical shell, anyway.

Steps To Reproduce

  1. Break a mirror, incurring 7 years of bad luck.
  2. Run sudo zpool scrub whatever, then run sudo zpool status -v and look for a section that begins with "errors: Permanent errors have been detected in the following files:".
  3. Run sudo nix-store --repair-path on one of the paths that appears.
  4. The command fails with "error: reading from file: Input/output error".

(I expect a "synthetic" repro could use ptrace to inject EIOs?)

Expected behavior

The store path should be replaced with one that's freshly downloaded, without an error.

For massive bonus points, NixOS gets an option that hooks into services.zfs.autoScrub and --repair-paths any corrupted files in the Nix store once the scrub completes.
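
A hook like that would first have to turn the scrub report into store paths. Below is a minimal Python sketch of that one step. It assumes the store is mounted at /nix/store and that zpool status -v lists affected files as absolute paths under the "Permanent errors" heading. corrupted_store_paths is a hypothetical helper, not something Nix or NixOS provides.

```python
import re

# A store path is /nix/store/<32-character hash>-<name>. Requiring the hash
# skips entries such as /nix/store/.links/..., which are not store paths.
STORE_PATH_RE = re.compile(r"^(/nix/store/[0-9a-z]{32}-[^/\s]+)")


def corrupted_store_paths(zpool_status_output: str) -> list[str]:
    """Return the top-level store paths named in `zpool status -v` output.

    Each path appears once, in the order it was first seen.
    """
    paths: list[str] = []
    in_errors = False
    for line in zpool_status_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("errors: Permanent errors"):
            # File entries follow this heading.
            in_errors = True
            continue
        if not in_errors:
            continue
        match = STORE_PATH_RE.match(stripped)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths
```

Each returned path could then be handed to sudo nix-store --repair-path, which is the step that currently fails with the EIO described above.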

nix-env --version output

nix-env (Nix) 2.11.0

SuperSandro2000 commented 1 year ago

> nix-store --repair-path bails out when it gets these EIOs, and it's not clear whether it's possible to repair the files on a live NixOS system, since /nix/store appears read-only to... users' systemd slices? To sudo under a graphical shell, anyway.

Nix can repair a store path by fetching it from a substituter or, if it knows how to build the path, by rebuilding it. I think it just needs to handle EIO correctly.
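
To illustrate what "handling EIO" could mean, here is a rough Python sketch of the idea. It is not Nix's actual code: Nix is written in C++ and hashes a serialisation of the whole store path, while this checks a single file against a SHA-256 digest. The name is_intact and its opener parameter are made up for the example. The point is only that an EIO during verification is treated as "this file is corrupt" and not as a fatal error, so the caller can go on to replace the path.

```python
import errno
import hashlib


def is_intact(path, expected_sha256, opener=open):
    """Return True if the file at `path` hashes to `expected_sha256`.

    An EIO while opening or reading is reported as corruption (False),
    so a caller can go on to repair the path. Other I/O errors propagate.
    `opener` is only there so a failing open() can be injected in tests.
    """
    digest = hashlib.sha256()
    try:
        with opener(path, "rb") as f:
            # Read in 1 MiB chunks so large files are not loaded at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
    except OSError as e:
        if e.errno == errno.EIO:
            # Unreadable blocks: the content is as good as corrupt.
            return False
        raise
    return digest.hexdigest() == expected_sha256
```

With a check like this, a repair routine would fall through to the substituter or rebuild step whenever is_intact returns False, whether the cause is a hash mismatch or an unreadable block.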