NixOS / nixpkgs

nixos-generate-config does not correctly determine current UUID #62444

Open nh2 opened 5 years ago

nh2 commented 5 years ago

Issue description

nixos-generate-config does not correctly determine current UUID of the root device, due to its reliance on cached information in /dev/disk/by-uuid (as opposed to e.g. blkid). This leads to machines not booting after nixos-install in many situations.

If you create a new file system on some block device using mkfs, it gets a new UUID.

If that block device already had a UUID before, /dev/disk/by-uuid will continue to show the old UUID until you run udevadm trigger.

nixos-generate-config relies on /dev/disk/by-uuid here:

https://github.com/NixOS/nixpkgs/blob/07fdacf9f9811c78804ce782a25e448fbb0ad946/nixos/modules/installer/tools/nixos-generate-config.pl#L296-L305

This means nixos-generate-config will put a wrong (no-longer-existent) UUID into hardware-configuration.nix, and the machine will fail to boot.

Workaround

Run udevadm trigger between your mkfs and nixos-generate-config.
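
A minimal sketch of that ordering as a shell function, assuming an ext4 root file system, with the device and mount point passed in as arguments. The helper names `run` and `prepare_root` are hypothetical. The `udevadm settle` step is an addition that is not in the report above: it waits until udev has processed the triggered events, so the symlinks are up to date before the config is generated.

```shell
#!/bin/sh
# Command runner; override it (for example with echo) to get a dry run.
run() { "$@"; }

# prepare_root DEVICE MOUNTPOINT
# Hypothetical helper: formats DEVICE, refreshes udev's view of it,
# mounts it, and only then generates the NixOS config.
# The chain stops at the first failing step.
prepare_root() {
    dev="$1"
    mnt="$2"
    # mkfs gives the device a new file system UUID;
    # trigger + settle refresh /dev/disk/by-uuid before it is read.
    run mkfs.ext4 -F -L root "$dev" &&
    run udevadm trigger &&
    run udevadm settle &&
    run mount "$dev" "$mnt" &&
    run nixos-generate-config --root "$mnt"
}

# Example (destructive, requires root; not executed here):
#   prepare_root /dev/md127 /mnt
```

Redefining `run` as `run() { echo "$*"; }` before calling `prepare_root` prints the commands instead of executing them, which is a cheap way to check the order.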

Real fix

nixos-generate-config should run udevadm trigger or use an uncached source (like blkid or lsblk) to obtain the file system UUID.
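
As a sketch of how the stale state could be detected (not the actual fix in nixos-generate-config, which is a Perl script): compare a UUID read directly from the device with the symlink under /dev/disk/by-uuid. The function name `by_uuid_matches` is hypothetical, and the commented usage assumes `blkid -p` (low-level probing that bypasses the blkid cache) from util-linux is available.

```shell
#!/bin/sh
# by_uuid_matches DIR DEVICE UUID
# Succeeds (exit 0) only if DIR/UUID is a symlink that resolves to DEVICE.
# DIR is a parameter so the check can be tried on a scratch directory.
by_uuid_matches() {
    dir="$1"
    dev="$2"
    uuid="$3"
    link="$dir/$uuid"
    # No symlink named after the current UUID: the directory is stale.
    [ -L "$link" ] || return 1
    # The symlink exists; make sure it points at the expected device.
    [ "$(readlink -f "$link")" = "$(readlink -f "$dev")" ]
}

# Example use on a live system (requires root; not executed here):
#   uuid=$(blkid -p -s UUID -o value /dev/md127)
#   by_uuid_matches /dev/disk/by-uuid /dev/md127 "$uuid" \
#       || { udevadm trigger; udevadm settle; }
```

In the shell session below, this check would fail before `udevadm trigger` (the directory still lists d5b301c3-…) and succeed afterwards.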

Technical details

NixOS 19.03.

Found with the help of @cleverca22.

From IRC for further info:

clever> i think the cache is only to allow non-root users to query things

Here is a shell session that shows how /dev/disk/by-uuid shows outdated info:

root@rescue:~# mkfs.ext4 -F -L root /dev/md127
mke2fs 1.42.12 (29-Aug-2014)
/dev/md127 contains a ext4 file system labelled 'root'
    created on Sat Jun  1 19:05:34 2019
Creating filesystem with 121903104 4k blocks and 30482432 inodes
Filesystem UUID: 08609a04-21b7-4cf9-878e-1bcc5fdf7ce1
Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
    102400000

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done     

root@rescue:~# lsblk --fs | grep md127
│ └─md127                  ext4              root             08609a04-21b7-4cf9-878e-1bcc5fdf7ce1   
│ └─md127                  ext4              root             08609a04-21b7-4cf9-878e-1bcc5fdf7ce1   
root@rescue:~# blkid | grep md127
/dev/md127: LABEL="root" UUID="08609a04-21b7-4cf9-878e-1bcc5fdf7ce1" TYPE="ext4"
root@rescue:~# ls -l /dev/disk/by-uuid | grep md127
lrwxrwxrwx 1 root root 11 Jun  1 19:05 d5b301c3-124e-48c9-9978-71ae88f2a2a9 -> ../../md127
root@rescue:~# 
root@rescue:~# udevadm trigger
root@rescue:~# 
root@rescue:~# lsblk --fs | grep md127
│ └─md127                  ext4              root             08609a04-21b7-4cf9-878e-1bcc5fdf7ce1   
│ └─md127                  ext4              root             08609a04-21b7-4cf9-878e-1bcc5fdf7ce1   
root@rescue:~# blkid | grep md127
/dev/md127: LABEL="root" UUID="08609a04-21b7-4cf9-878e-1bcc5fdf7ce1" TYPE="ext4"
root@rescue:~# ls -l /dev/disk/by-uuid | grep md127
lrwxrwxrwx 1 root root 11 Jun  1 19:10 08609a04-21b7-4cf9-878e-1bcc5fdf7ce1 -> ../../md127

stale[bot] commented 4 years ago

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

AkechiShiro commented 2 years ago

What would be a way to fix this issue? What should be done to avoid relying on blkid's cache, or to force it to refresh? Is there a flag we could use, or a kernel module we could reload, to refresh blkid's cache? Maybe the -g flag?

I can make a PR for that if you think it would fix this issue.