openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Dracut hangs on Debian #4518

Closed denizzzka closed 8 years ago

denizzzka commented 8 years ago

(on commit 84e15cc21acb145bf460989c3e1e37186e575a5e)

This is my disk config:

$ lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda             8:0    0   1,8T  0 disk  <- new disk with ZFS that I want to use as the new / filesystem
├─sda1          8:1    0   100M  0 part  
├─sda2          8:2    0   300M  0 part  
│ └─md126       9:126  0 299,7M  0 raid1 /mnt/root/boot
├─sda3          8:3    0 465,3G  0 part  
│ └─mirrored1 253:1    0 465,3G  0 crypt <- zfs is here
└─sda4          8:4    0   1,4T  0 part  
sdb             8:16   0 465,8G  0 disk  <- current system disk
├─sdb1          8:17   0     1M  0 part  
├─sdb2          8:18   0   7,8G  0 part  
│ └─md124       9:124  0   7,8G  0 raid1 
├─sdb3          8:19   0  19,5G  0 part  
│ └─md127       9:127  0  19,5G  0 raid1 /
└─sdb4          8:20   0 438,4G  0 part  
  └─md125       9:125  0 438,3G  0 raid1 
    └─homefs  253:0    0 438,3G  0 crypt /mnt/cryptfs

# LANG=C dracut --regenerate-all --force
dracut: Executing: /usr/bin/dracut --kver=4.4.0-1-amd64 --force
dracut: dracut module 'bootchart' will not be installed, because command '/sbin/bootchartd' could not be found!
dracut: zfsexpandknowledge: pool rpool has device /dev/mapper/mirrored1
dracut: zfsexpandknowledge: pool rpool has device /dev/disk/by-id/
dracut: zfsexpandknowledge: block devices backing ZFS dataset /: /dev/dm-1
/dev/disk/by-id
dracut: zfsexpandknowledge: slave block device backing ZFS dataset /: /dev/sda3
device node not found
dracut: zfsexpandknowledge: host device /dev/md124
dracut: zfsexpandknowledge: host device /dev/dm-1
dracut: zfsexpandknowledge: host device /dev/sda3
dracut: zfsexpandknowledge: host device /dev/disk/by-id
dracut: zfsexpandknowledge: device /dev/dm-1 of type zfs_member
dracut: zfsexpandknowledge: device /dev/disk/by-id of type 
dracut: zfsexpandknowledge: device /dev/md124 of type ext4
dracut: zfsexpandknowledge: device /dev/sda3 of type ext4
dracut: zfsexpandknowledge: device /dev/sdb2 of type linux_raid_member
dracut: dracut module 'modsign' will not be installed, because command 'keyctl' could not be found!
dracut: dracut module 'plymouth' will not be installed, because command 'plymouthd' could not be found!
dracut: dracut module 'plymouth' will not be installed, because command 'plymouth' could not be found!
dracut: dracut module 'plymouth' will not be installed, because command 'plymouth-set-default-theme' could not be found!
dracut: dracut module 'multipath' will not be installed, because command 'multipath' could not be found!
dracut: dracut module 'biosdevname' will not be installed, because command 'biosdevname' could not be found!
dracut: dracut module 'masterkey' will not be installed, because command 'keyctl' could not be found!
dracut: zfsexpandknowledge: pool rpool has device /dev/mapper/mirrored1
dracut: zfsexpandknowledge: pool rpool has device /dev/disk/by-id/
dracut: zfsexpandknowledge: block devices backing ZFS dataset /: /dev/dm-1
/dev/disk/by-id
dracut: zfsexpandknowledge: slave block device backing ZFS dataset /: /dev/sda3
device node not found
dracut: zfsexpandknowledge: host device /dev/md124
dracut: zfsexpandknowledge: host device /dev/dm-1
dracut: zfsexpandknowledge: host device /dev/sda3
dracut: zfsexpandknowledge: host device /dev/disk/by-id
dracut: zfsexpandknowledge: device /dev/dm-1 of type zfs_member
dracut: zfsexpandknowledge: device /dev/disk/by-id of type 
dracut: zfsexpandknowledge: device /dev/md124 of type ext4
dracut: zfsexpandknowledge: device /dev/sda3 of type ext4
dracut: zfsexpandknowledge: device /dev/sdb2 of type linux_raid_member
dracut: dracut module 'modsign' will not be installed, because command 'keyctl' could not be found!
dracut: dracut module 'multipath' will not be installed, because command 'multipath' could not be found!
dracut: dracut module 'masterkey' will not be installed, because command 'keyctl' could not be found!
dracut: *** Including module: bash ***
dracut: *** Including module: dash ***
dracut: *** Including module: systemd ***
dracut: *** Including module: systemd-initrd ***
dracut: *** Including module: caps ***
dracut: caps: does not work with systemd in the initramfs
dracut: *** Including module: console-setup ***
dracut: *** Including module: aufs ***
dracut: *** Including module: kernel-modules ***

and it hangs here.

If I kill the dracut process (/bin/bash --norc /usr/bin/dracut --kver=4.4.0-1-amd64 --force) with the highest PID, dracut resumes its work:

/usr/lib/dracut/dracut-init.sh: line 1156: 12984 Terminated              ( instmods_1 "$@" ) 9>&1
dracut: *** Including module: mdraid ***
dracut: Skipping udev rule: 64-md-raid.rules
dracut: *** Including module: zfs ***
dracut-install: ERROR: installing '/usr/lib/systemd/system/zfs-import-scan.service'
dracut: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.lbTJDK/initramfs -a /usr/lib/systemd/system/zfs-import-scan.service
dracut-install: ERROR: installing '/usr/lib/systemd/system/zfs-import-cache.service'
dracut: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.lbTJDK/initramfs -a /usr/lib/systemd/system/zfs-import-cache.service
dracut: *** Including module: rootfs-block ***
dracut: *** Including module: terminfo ***
dracut: *** Including module: udev-rules ***
dracut: Skipping udev rule: 40-redhat.rules
dracut: Skipping udev rule: 91-permissions.rules
dracut: Skipping udev rule: 80-drivers-modprobe.rules
dracut: *** Including module: dracut-systemd ***
dracut: *** Including module: usrmount ***
dracut: *** Including module: base ***
dracut: *** Including module: fs-lib ***
dracut: *** Including module: shutdown ***
dracut: *** Including modules done ***
dracut: *** Installing kernel module dependencies and firmware ***
dracut: *** Installing kernel module dependencies and firmware done ***
dracut: *** Resolving executable dependencies ***
dracut: *** Resolving executable dependencies done***
dracut: *** Stripping files ***
dracut: *** Stripping files done ***
dracut: *** Store current command line parameters ***
dracut: Stored kernel commandline:
dracut:  rd.md.uuid=544df037:43a64f61:6c7dcd97:fb843abc 
dracut:  root=/dev/block/ rootfstype=zfs rootflags=rw,relatime,xattr,noacl
dracut: *** Creating image file '/boot/initramfs-4.4.0-1-amd64.img' ***
dracut: *** Creating initramfs image file '/boot/initramfs-4.4.0-1-amd64.img' done ***

zfs-dracut 0.6.5.6-3 (from /var/lib/apt/lists/archive.zfsonlinux.org_debian_dists_jessie_main_binary-amd64_Packages) does not hang.

denizzzka commented 8 years ago

Note: this output contains a strange line:

dracut: zfsexpandknowledge: device /dev/disk/by-id of type
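
A quick way to find such entries in a longer log is to filter for zfsexpandknowledge device lines whose type is empty. This is only a sketch: the function name untyped_devices is made up for illustration, and the pattern assumes the log format shown in this report.

```shell
# Print zfsexpandknowledge "device ... of type" lines whose type is empty.
# Reads a dracut log from stdin, or from the files given as arguments.
untyped_devices() {
    grep -E 'zfsexpandknowledge: device [^ ]+ of type *$' "$@"
}
```

Given the dracut output from this report saved to a file, it prints only the /dev/disk/by-id lines.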

behlendorf commented 8 years ago

This appears to be related to the as yet unmerged patch in #4478.

denizzzka commented 8 years ago

Now dracut generation works. Thanks!

prometheanfire commented 8 years ago

So is it not the patch's fault, or did the patch fix it, or is it something else?

denizzzka commented 8 years ago

After I compiled ~master and installed the Dracut Debian package, this hang is gone.

denizzzka commented 8 years ago

But booting still does not work.

Where can I find a reference kernel cmdline string for booting from a separate /boot with rpool stored on a LUKS partition? Currently the initramfs hangs, waiting with no time limit for dev-block.device to start.

Dracut generates:

dracut: Stored kernel commandline:
dracut:  rd.md.uuid=4a99b623:38859955:38fd7be1:717a96dd 
dracut:  root=/dev/block/ rootfstype=zfs rootflags=rw,relatime,xattr,noacl

But update-grub puts this into grub.cfg:

menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-e420eec9869e2ea7' {
        load_video
        insmod gzio
        insmod part_gpt
        insmod diskfilter
        insmod mdraid1x
        insmod ext2
        set root='mduuid/4a99b6233885995538fd7be1717a96dd'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='mduuid/4a99b6233885995538fd7be1717a96dd'  dc4018b0-f79f-49c3-a9e3-0539e82d0b5f
        else
          search --no-floppy --fs-uuid --set=root dc4018b0-f79f-49c3-a9e3-0539e82d0b5f
        fi
        echo    'Loading Linux 4.4.0-1-amd64 ...'
        linux   /vmlinuz-4.4.0-1-amd64 root=ZFS=rpool/ROOT/debian ro boot=zfs $bootfs  quiet
        echo    'Loading initial ramdisk ...'
        initrd  /initramfs-4.4.0-1-amd64.img
}

# dpkg -l|grep dracut
ii  dracut                                                      044+38-1                               all          Low-level tool for generating an initramfs image (automation)
ii  dracut-core                                                 044+38-1                               amd64        Low-level tool for generating an initramfs image (core tools)
ii  zfs-dracut                                                  0.6.5-215                              amd64        Dracut module

# dpkg -l|grep grub
ii  grub-common                                                 2.02-beta2.9-ZOL11-7aa9f6              amd64        GRand Unified Bootloader (common files)
ii  grub-pc                                                     2.02-beta2.9-ZOL11-7aa9f6              amd64        GRand Unified Bootloader, version 2 (PC/BIOS version)
ii  grub-pc-bin                                                 2.02-beta2.9-ZOL11-7aa9f6              amd64        GRand Unified Bootloader, version 2 (PC/BIOS binaries)
ii  grub2-common                                                2.02-beta2.9-ZOL11-7aa9f6              amd64        GRand Unified Bootloader (common files for version 2)

prometheanfire commented 8 years ago

Keep in mind this is with the patch mentioned earlier, also on Gentoo.

initrd=\initramfs-4.4.6-gentoo.img rd.luks.uuid=UUID_GOES_HERE rd.luks.allow-discards=UUID_GOES_HERE rd.luks.crypttab=0 root=zfs:POOLNAME/DATASET zfs.zfs_arc_max=3221225472 init=/usr/lib/systemd/systemd ro

denizzzka commented 8 years ago

Hm, LUKS probably doesn't work with zfs-dracut installed from ~master.

Here is rdsosreport.txt: http://pastebin.com/MHb2LEF9
Result of lsinitrd: http://pastebin.com/e7T3BBVy

denizzzka commented 8 years ago

root=zfs:POOLNAME/DATASET

Which version of GRUB supports this syntax with a colon? Or should it be set manually?

prometheanfire commented 8 years ago

It's dracut (with the patch) that supports that syntax. Also, I don't use GRUB at all.
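
Three different root= forms appear in this thread: root=/dev/block/ (stored in the image by dracut), root=ZFS=rpool/ROOT/debian (written by update-grub) and root=zfs:POOLNAME/DATASET (the form used with the patch). To see which one a given command line actually carries, a small helper can pull out the root= value. A minimal sketch; root_arg is a hypothetical name, not part of dracut or GRUB.

```shell
# Print the value of the last root= argument in a kernel command line.
# With no argument, read the running kernel's command line from /proc/cmdline.
root_arg() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    value=""
    # Split on whitespace; rootfstype= and rootflags= do not match "root=*".
    for word in $cmdline; do
        case "$word" in
            root=*) value="${word#root=}" ;;
        esac
    done
    printf '%s\n' "$value"
}
```

Called without an argument from the dracut emergency shell, it reads /proc/cmdline, which shows what the bootloader really passed.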

denizzzka commented 8 years ago

@prometheanfire sorry, but I have bad news: this whole bug report relates to https://github.com/prometheanfire/zfs ~main, not to zfsonlinux/zfs ~main.

84e15cc21acb145bf460989c3e1e37186e575a5e is in prometheanfire/zfs. I just confused the repositories here: https://github.com/zfsonlinux/zfs/issues/4518#issuecomment-209706322

denizzzka commented 8 years ago

This appears to be related to the as yet unmerged patch in #4478.

yes

So is it not the patch's fault, or did the patch fix it, or is it something else?

fault

prometheanfire commented 8 years ago

Oh boo, are you by any chance on IRC? I'm in the #zfsonlinux channel on freenode.

denizzzka commented 8 years ago

IRC investigation results:

  1. The problem does not occur with Debian dracut package version 040+1-1 but does occur with version 044
  2. Both the @prometheanfire and @Rudd-O patches are affected by this

Also, @prometheanfire asked me to submit another report here: mounting / in the initramfs fails on Debian:

● sysroot.mount - /sysroot
   Loaded: loaded (/proc/cmdline; bad; vendor preset: enabled)
  Drop-In: /run/systemd/generator/sysroot.mount.d
           └─zfs-enhancement.conf
   Active: failed (Result: exit-code) since Thu 2016-04-14 05:25:27 UTC; 1min 15s ago
    Where: /sysroot
     What: rpool/ROOT/debian
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)
  Process: 660 ExecMount=/bin/mount rpool/ROOT/debian /sysroot -t zfs -o zfsutil (code=exited, status=1/FAILURE)

The mount error message (for some reason it is omitted by systemd) is "unable to open dataset".

"zpool import rpool -R /sysroot" typed manually in the dracut console works fine.

prometheanfire commented 8 years ago

to clarify

Rudd-O commented 8 years ago

Important to discover:

When this unit fails in this way, what is the status of zfs-import-scan.service and zfs-import-cache.service? The only way this could have failed is if both import units had failed to begin with (by the time sysroot.mount executes, those units should have long since succeeded in importing the root pool).
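
From the dracut emergency shell, the state of both import units can be checked with systemctl status, the same command whose output for sysroot.mount is quoted above. A sketch, assuming systemctl is available in the initramfs (the systemd module is included in the image generated in this report); report_import_units and the SYSTEMCTL override are illustrative names.

```shell
# Units named in the question above.
units="zfs-import-scan.service zfs-import-cache.service"

# SYSTEMCTL can be overridden (for example with "echo systemctl") for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Show the status of each import unit; a failed unit makes systemctl
# return non-zero, which is ignored so that both units are always shown.
report_import_units() {
    for unit in $units; do
        $SYSTEMCTL --no-pager status "$unit" || true
    done
}
```

Calling report_import_units after the failed boot would show whether either unit ran at all and, if so, how it exited.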

Rudd-O
http://rudd-o.com/

Rudd-O commented 8 years ago

dracut-install: ERROR: installing '/usr/lib/systemd/system/zfs-import-scan.service'

That obviously should not happen.

Rudd-O commented 8 years ago

I don't have access to Debian, but someone should obviously be able to reproduce this issue by installing Debian, upgrading to Dracut 44, then installing the patched RPM set (don't forget to fix GRUB configuration! the patch alone won't do that!), then regenerating the initrds, then rebooting.

I honestly do not see how the boot fix patch can just make Dracut outright HANG when generating the Dracut images, especially not after Including module: kernel-modules -- that is a silly failure mode which seems very unrelated. We need a good strace -f log here.

Rudd-O commented 8 years ago

I think the reporter should be able to try and repro with zfs-dracut patched, with zfs-dracut unpatched, and with no zfs-dracut package installed at all. Gives us three good data points to identify what is wrong.
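
One way to collect those three data points without watching the terminal is to run each variant under a time limit, using timeout from coreutils. This is a sketch only: run_with_deadline is a made-up helper, and the 600-second limit in the usage note is an arbitrary assumption, not a measured dracut run time.

```shell
# Run a command with a time limit and report whether it finished or hung.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    limit="$1"
    shift
    # timeout exits with status 124 when the limit is reached.
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        status=0
    else
        status=$?
    fi
    if [ "$status" -eq 124 ]; then
        echo "hung"
    else
        echo "finished (exit $status)"
    fi
}
```

For each package state (patched zfs-dracut, unpatched zfs-dracut, no zfs-dracut), something like `run_with_deadline 600 dracut --kver=4.4.0-1-amd64 --force` would print either "hung" or "finished (exit N)". Note that timeout only signals the top-level dracut process, so a stuck child may still need to be killed by hand afterwards.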

denizzzka commented 8 years ago

I think the reporter should be able to try and repro with zfs-dracut patched, with zfs-dracut unpatched, and with no zfs-dracut package installed at all.

The "hang bug" is reproducible only with patched versions of zfs-dracut. Output of dracut with the Rudd-O patch:

strace -f dracut --regenerate-all --force > ./Rudd-O_generation.log 2>&1

result: https://drive.google.com/open?id=0BxYv_ASJV7uuYnZpOHBOb01jOFE (44Mb)

Without strace:

dracut: Executing: /usr/bin/dracut --kver=4.4.0-1-amd64 --force
dracut: dracut module 'bootchart' will not be installed, because command '/sbin/bootchartd' could not be found!
dracut: zfsexpandknowledge: pool rpool has device /dev/mapper/mirrored-1
dracut: zfsexpandknowledge: pool rpool has device /dev/disk/by-id/
dracut: zfsexpandknowledge: block devices backing ZFS dataset /: /dev/dm-1
/dev/disk/by-id
dracut: zfsexpandknowledge: slave block device backing ZFS dataset /: /dev/sda3
device node not found
dracut: zfsexpandknowledge: host device /dev/md124
dracut: zfsexpandknowledge: host device /dev/dm-1
dracut: zfsexpandknowledge: host device /dev/sda3
dracut: zfsexpandknowledge: host device /dev/disk/by-id
dracut: zfsexpandknowledge: device /dev/dm-1 of type zfs_member
dracut: zfsexpandknowledge: device /dev/disk/by-id of type 
dracut: zfsexpandknowledge: device /dev/md124 of type ext2
dracut: zfsexpandknowledge: device /dev/sda3 of type ext4
dracut: zfsexpandknowledge: device /dev/sda2 of type linux_raid_member
dracut: dracut module 'modsign' will not be installed, because command 'keyctl' could not be found!
dracut: dracut module 'plymouth' will not be installed, because command 'plymouthd' could not be found!
dracut: dracut module 'plymouth' will not be installed, because command 'plymouth' could not be found!
dracut: dracut module 'plymouth' will not be installed, because command 'plymouth-set-default-theme' could not be found!
dracut: dracut module 'multipath' will not be installed, because command 'multipath' could not be found!
dracut: dracut module 'biosdevname' will not be installed, because command 'biosdevname' could not be found!
dracut: dracut module 'masterkey' will not be installed, because command 'keyctl' could not be found!
dracut: zfsexpandknowledge: pool rpool has device /dev/mapper/mirrored-1
dracut: zfsexpandknowledge: pool rpool has device /dev/disk/by-id/
dracut: zfsexpandknowledge: block devices backing ZFS dataset /: /dev/dm-1
/dev/disk/by-id
dracut: zfsexpandknowledge: slave block device backing ZFS dataset /: /dev/sda3
device node not found
dracut: zfsexpandknowledge: host device /dev/md124
dracut: zfsexpandknowledge: host device /dev/dm-1
dracut: zfsexpandknowledge: host device /dev/sda3
dracut: zfsexpandknowledge: host device /dev/disk/by-id
dracut: zfsexpandknowledge: device /dev/dm-1 of type zfs_member
dracut: zfsexpandknowledge: device /dev/disk/by-id of type 
dracut: zfsexpandknowledge: device /dev/md124 of type ext2
dracut: zfsexpandknowledge: device /dev/sda3 of type ext4
dracut: zfsexpandknowledge: device /dev/sda2 of type linux_raid_member
dracut: dracut module 'modsign' will not be installed, because command 'keyctl' could not be found!
dracut: dracut module 'multipath' will not be installed, because command 'multipath' could not be found!
dracut: dracut module 'masterkey' will not be installed, because command 'keyctl' could not be found!
dracut: *** Including module: bash ***
dracut: *** Including module: dash ***
dracut: *** Including module: systemd ***
dracut: *** Including module: systemd-initrd ***
dracut: *** Including module: caps ***
dracut: caps: does not work with systemd in the initramfs
dracut: *** Including module: console-setup ***
dracut: *** Including module: aufs ***
dracut: *** Including module: kernel-modules ***

denizzzka commented 8 years ago

dracut-install: ERROR: installing '/usr/lib/systemd/system/zfs-import-scan.service' That obviously should not happen.

@Rudd-O could it be a result of killing the dracut process?

denizzzka commented 8 years ago

and with no zfs-dracut package installed at all

It does not reproduce.

Rudd-O commented 8 years ago

On 04/14/2016 10:21 AM, Denis Feklushkin wrote:

and with no zfs-dracut package installed at all
It does not reproduce.


It's not the patch, it's the package, and something doesn't fit.

Please continue figuring out why it's hanging. This is not normal, and it does not hang here.

denizzzka commented 8 years ago

it's the package

dracut without zfs-dracut works and does not hang

Please continue figuring out why it's hanging.

I have no idea how to debug dracut
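
One place to start is the strace -f log linked above: a hung process usually shows up as a system call that was started and never returned. The sketch below prints, for each traced PID, its last logged line if that line ends in <unfinished ...>. It assumes the "[pid N] ..." line prefix that strace -f writes when output is not split per process; lines without that prefix are ignored. last_unfinished_calls is a hypothetical helper name.

```shell
# For each traced PID, remember its most recent line in the strace log and,
# at the end, print that line if the syscall never returned.
# Reads from stdin, or from the log files given as arguments.
last_unfinished_calls() {
    awk '
        /^\[pid +[0-9]+\]/ {
            pid = $2
            sub(/\]$/, "", pid)   # "12984]" -> "12984"
            last[pid] = $0
        }
        END {
            for (pid in last)
                if (last[pid] ~ /<unfinished \.\.\.>/)
                    print last[pid]
        }
    ' "$@"
}
```

Run as `last_unfinished_calls ./Rudd-O_generation.log`, the PIDs it lists could then be compared against the dracut child that has to be killed for the run to continue.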

Rudd-O commented 8 years ago

The question is not whether "dracut without zfs-dracut" hangs or not. The question that you need to debug is whether "dracut with zfs-dracut" OR "dracut with patched-zfs-dracut" hangs. The "without" case is unimportant.

Without zfs-dracut, sure, nothing hangs, but you can't use ZFS on boot. If you don't use ZFS on boot, good for you, but then the bug is pointless.

denizzzka commented 8 years ago

I am trying to use ZFS as /. I just don't know how to debug Dracut because it is like Chinese to me :(

denizzzka commented 8 years ago

It does not reproduce with the Debian packages zfs-dkms and zfs-dracut version 0.6.5.6-7.