NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1
gc: resume GC after a pathinuse error #11922

Closed by picnoir 1 day ago

picnoir commented 2 days ago

Motivation

invalidatePathChecked throws a PathInUse exception. This exception is not caught, so it fails the whole GC run.

Instead, I think we should log the error for the specific store path we're trying to delete, explaining we can't delete this path because it still has referrers. Once we're done with logging that, the GC run should continue to delete the dead store paths it can delete.
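To illustrate the proposed behaviour, here is a minimal, self-contained sketch. It is not the actual Nix code: the PathInUse type below is a stand-in for the real exception, and deletePathChecked, collectGarbage, GCResult and the store paths are hypothetical names made up for this example. It only shows the control flow: catch the error per path, log it, and keep going.

```cpp
#include <iostream>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for Nix's PathInUse error (assumption: the real type lives in libstore).
struct PathInUse : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for invalidatePathChecked: refuses to delete
// a path that still has referrers.
void deletePathChecked(const std::string & path, const std::set<std::string> & inUse)
{
    if (inUse.count(path))
        throw PathInUse("cannot delete path '" + path + "' because it is in use");
    // The real deletion would happen here.
}

// Hypothetical summary of one GC run.
struct GCResult {
    std::vector<std::string> deleted;
    std::vector<std::string> skipped;
};

// Try to delete every dead path. A PathInUse error is logged and the
// path is skipped, instead of aborting the whole run.
GCResult collectGarbage(const std::vector<std::string> & deadPaths,
                        const std::set<std::string> & inUse)
{
    GCResult result;
    for (const auto & path : deadPaths) {
        try {
            deletePathChecked(path, inUse);
            result.deleted.push_back(path);
        } catch (const PathInUse & e) {
            std::cerr << "warning: skipping path: " << e.what() << "\n";
            result.skipped.push_back(path);
        }
    }
    return result;
}
```

With this shape, one path that is unexpectedly still referenced costs a warning and a skipped deletion, while the remaining dead paths are still collected.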

Context

I recently hit a bug, which I assume comes from the topoSortPaths function, where the GC tried to delete a path that still had live referrers. The GC service kept failing with the error message:

error: cannot delete path '/nix/store/r1lp9kxlrc6h7vrba90gm6i94s31xvvx-gnugrep-3.11' because it is in use by '/nix/store/911x30h15lbfg5fkkabzjhars2svbnaa-stdenv-linux'

I resolved this by manually deleting the faulty path's referrers using nix-store --query --referrers and nix store delete. Sadly, I can't reproduce this bug; it seems to happen extremely infrequently.

This bug alone is not a massive deal, given how rarely it occurs. However, the way it cascades is a serious issue for Nix builders. Because we throw an uncaught PathInUse exception, the full GC run fails. This prevents any automatic garbage collection of the Nix store, and the machine's disk slowly but surely fills up until manual intervention is required to fix the situation.

Priorities and Process

Add :+1: to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

edolstra commented 1 day ago

If this is an error that did happen but we haven't fixed yet

In that case there should at least be a GitHub issue so we can try to diagnose this (and a comment in the code linking to that issue).

picnoir commented 1 day ago

Created an issue: https://github.com/NixOS/nix/issues/11923