zfsonlinux / pkg-zfs

Native ZFS packaging for Debian and Ubuntu
https://launchpad.net/~zfs-native/+archive/daily

Thrown into BusyBox on bootup when ZFS is root fs (0.6.5.6-1 Debian Jessie release) #200

Closed: azeemism closed this issue 8 years ago

azeemism commented 8 years ago

This was not the case for 0.6.5.2-2 Debian Jessie release.

@FransUrbo, could this be related to the commit for https://github.com/zfsonlinux/zfs/issues/4474? Note that under 0.6.5.2-2 I also had no trouble loading /usr, /var, /var/log, etc. on separate ZFS datasets (https://github.com/zfsonlinux/zfs/issues/4474#issuecomment-205467437).

On every startup/reboot I am dropped into the BusyBox shell and have to run the following:

First shell:

/# mount -o zfsutil -t zfs rpool/ROOT/debian-8 /root
/# exit

Second shell:

/# mount -o zfsutil -t zfs /root

Third shell:

/# exit

Jessie appears to load normally after this point:

root@vbox1:~# dmesg | grep ZFS
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.16.0-4-amd64 root=ZFS=rpool/ROOT/debian-8 ro boot=zfs boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8 quiet
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.16.0-4-amd64 root=ZFS=rpool/ROOT/debian-8 ro boot=zfs boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8 quiet
[    4.272120] ZFS: Loaded module v0.6.5.6-1, ZFS pool version 5000, ZFS filesystem version 5
root@vbox1:~#
root@vbox1:~# zpool status
  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        rpool                                      ONLINE       0     0     0
          mirror-0                                 ONLINE       0     0     0
            ata-VBOX_HARDDISK_VBbbb2e13d-f1007fb3  ONLINE       0     0     0
            ata-VBOX_HARDDISK_VB44950065-9dadc8f2  ONLINE       0     0     0

errors: No known data errors
root@vbox1:~# zfs mount
rpool/ROOT/debian-8             /
root@vbox1:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=504864,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,relatime,size=811672k,mode=755)
rpool/ROOT/debian-8 on / type zfs (rw,relatime,xattr,noacl)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=22,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
/dev/sda4 on /boot type ext4 (rw,noatime,data=ordered)
/dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=utf8,shortname=mixed,errors=remount-ro)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
none on /media/sf_deb8 type vboxsf (rw,nodev,relatime)
root@vbox1:~#

Note: /boot is on ext4 and both /boot and /boot/efi are on a different disk

root@vbox1:~# zdb
rpool:
    version: 5000
    name: 'rpool'
    state: 0
    txg: 115
    pool_guid: 2570617493699306475
    errata: 0
    hostid: 8323329
    hostname: 'vbox1'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 2570617493699306475
        children[0]:
            type: 'mirror'
            id: 0
            guid: 6773440328371405558
            metaslab_array: 34
            metaslab_shift: 34
            ashift: 13
            asize: 2199007002624
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 1345902431162178517
                path: '/dev/disk/by-id/ata-VBOX_HARDDISK_VBbbb2e13d-f1007fb3-part1'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 7771114504148432262
                path: '/dev/disk/by-id/ata-VBOX_HARDDISK_VB44950065-9dadc8f2-part1'
                whole_disk: 1
                create_txg: 4
    features_for_read:

STEPS to reproduce the issue:

see: http://zfsonlinux.org/debian.html

Install dependencies:

# apt-get install build-essential gawk alien fakeroot linux-headers-$(uname -r)
# apt-get install zlib1g-dev uuid-dev libblkid-dev parted lsscsi wget
# apt-get install git patch automake autoconf libtool init-system-helpers
# apt-get install libselinux1-dev

Then:

# apt-get install libnvpair1 libuutil1 libzfs2 libzpool2 spl zfsutils dkms spl-dkms zfs-dkms zfs-initramfs zfsonlinux
# apt-get update
# apt-get upgrade
# apt-get install debian-zfs

Prevent disk import as sdX, keep disk import as by-id:

# nano /etc/default/zfs
ZPOOL_IMPORT_PATH="/dev/disk/by-id"

Then remake initramfs:

# update-initramfs -c -k all
# update-grub
# reboot

Create rpool and other pools:

# zpool create -f -m none \
-o ashift=13 \
-o autoexpand=on \
-o autoreplace=on \
-o feature@lz4_compress=enabled \
-O compression=lz4 \
-O sync=disabled \
-O atime=off \
-O xattr=sa \
-O com.sun:auto-snapshot=true \
-O canmount=off \
-O overlay=on \
rpool \
mirror /dev/disk/by-id/ata-VBOX_HARDDISK_VBbbb2e13d-f1007fb3 /dev/disk/by-id/ata-VBOX_HARDDISK_VB44950065-9dadc8f2
# zpool export rpool
# zpool import -d /dev/disk/by-id -N rpool
# zfs create rpool/ROOT
# zfs set canmount=off rpool/ROOT
# zfs create rpool/ROOT/debian-8
# zfs set canmount=on rpool/ROOT/debian-8
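
To double-check the layout at this point, something like the following should list canmount and mountpoint per dataset (an illustrative sanity check, not part of the original walkthrough):

# zfs list -r -o name,canmount,mountpoint rpool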

Move files to rpool:

# zpool export rpool
# zpool import -f -d /dev/disk/by-id -o altroot=/sysroot -N rpool
# zfs set mountpoint=/ rpool/ROOT/debian-8
# modprobe efivars
# grub-probe /sysroot
# rsync -axv / /sysroot/

Recreate zpool.cache that goes missing when altroot is used:

# zpool set cachefile=/sysroot/etc/zfs/zpool.cache rpool
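
A quick way to confirm the cache file actually landed in the new root before chrooting (an illustrative check, not from the original guide):

# ls -l /sysroot/etc/zfs/zpool.cache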

Setup /sysroot:

# mount --bind /dev /sysroot/dev
# chroot /sysroot /bin/bash
# mount -t proc proc /proc
# mount -t sysfs sysfs /sys
# mount /boot
# mount /boot/efi
# grub-probe / 

Comment out the / entry in /etc/fstab.

Under 0.6.5.2-2 there was no need to add rpool/ROOT/debian-8 / zfs default 0 0 to /etc/fstab. Whether or not that line is added, the same issue occurs, even when using zfs set mountpoint=legacy rpool/ROOT/debian-8.
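
For reference, the legacy-mountpoint variant mentioned above would look roughly like this (a sketch using the dataset name from this setup; defaults is the standard fstab option keyword):

# zfs set mountpoint=legacy rpool/ROOT/debian-8
# echo 'rpool/ROOT/debian-8 / zfs defaults 0 0' >> /etc/fstab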

Update initramfs:

# cd /boot
# mv initrd.img-3.16.0-4-amd64 initrd.img-3.16.0-4-amd64.old-pre.zfs
# update-initramfs -c -k all

Update grub and exit chroot:

# nano /etc/default/grub
GRUB_CMDLINE_LINUX="boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8"
# zpool set bootfs=rpool/ROOT/debian-8 rpool
# update-grub
# cd /
# umount /boot/efi
# umount /boot
# umount /sys
# umount /proc
# umount /dev
# exit
# zpool export rpool
# reboot

lnxsrt commented 8 years ago

Also seeing this with the new version.

FransUrbo commented 8 years ago

I'm currently looking at this.

Try booting with zfsdebug=1 and let me know what you see.
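
For reference, one way to pass that is via the kernel command line, building on the GRUB_CMDLINE_LINUX already used in this setup (a sketch; adjust to your own pool and dataset names, or edit the kernel line at the grub menu for a one-off boot):

# nano /etc/default/grub
GRUB_CMDLINE_LINUX="boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8 zfsdebug=1"
# update-grub
# reboot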

Also, in the first shell you get, please run zpool status to verify that the pool is actually imported correctly.

Btw, zfsonlinux/zfs#4474 is NOT related. That happens way, way later in the whole startup procedure. You're hitting a problem in /usr/share/initramfs-tools/scripts/zfs, which is copied onto the initrd and then used during boot (finding and importing the pool, the root fs, etc.).
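
A quick way to confirm which copy of that script actually ended up in the initrd is initramfs-tools' lsinitramfs (a sketch using the kernel version from this report; regenerate the initrd after any edit to the script):

# lsinitramfs /boot/initrd.img-3.16.0-4-amd64 | grep scripts/zfs
# update-initramfs -u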

arcenik commented 8 years ago

Issue 4474 is about systemd, but this problem occurs before systemd, in the initramfs.

It appears that ZFS_BOOTFS is cleared after the pool import in the script /usr/share/initramfs-tools/scripts/zfs (line 278).

[attachment: zfs-initramfs-import-pool]

Fix: comment out the line and regenerate the initramfs:

# update-initramfs -u
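
Roughly, the change looks like this (a sketch reconstructed from the snippets quoted later in this thread, not an exact copy of the script around line 278):

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                import_pool "${ZFS_RPOOL}"
                # Commented out: this cleared ZFS_BOOTFS when find_rootfs printed nothing.
                #ZFS_BOOTFS="$(find_rootfs "${pool}")"
        fi
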
FransUrbo commented 8 years ago

Better yet, change the "${pool}" to "${ZFS_RPOOL}".

@arcenik Does that work?

arcenik commented 8 years ago

@FransUrbo: that does not work; the find_rootfs function still returns nothing.

arcenik commented 8 years ago

@FransUrbo: the bootfs value for my zfs pool is "-" (which is the default value). Therefore find_rootfs prints nothing and returns 1.

So the initramfs script should change ZFS_BOOTFS only if find_rootfs returns 0:

find_rootfs ${ZFS_RPOOL} && ZFS_BOOTFS="$(find_rootfs ${ZFS_RPOOL})"

FransUrbo commented 8 years ago

@arcenik Ok, thanx. That makes sense.

FransUrbo commented 8 years ago

How's this:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                if import_pool "${ZFS_RPOOL}"; then
                        root_fs="$(find_rootfs "${ZFS_RPOOL}")"
                        [ -n "${root_fs}" ] && ZFS_BOOTFS="${root_fs}"
                fi
        fi

Fabian-Gruenbichler commented 8 years ago

@FransUrbo this last variant fixes this issue for me!

FransUrbo commented 8 years ago

@Fabian-Gruenbichler, perfect thanx! I'll include that in the next update. I'll hold off a little to make sure there aren't any more issues lingering, and then I'll build new packages for both Wheezy and Jessie.

Expect them in five-six hours (with any luck).

arcenik commented 8 years ago

@FransUrbo instead of checking the content printed by find_rootfs, you should check its return code.

root_fs="$(find_rootfs "${ZFS_RPOOL}")"
$? && ZFS_BOOTFS="${root_fs}"

FransUrbo commented 8 years ago

@arcenik Even better, thanx!

FransUrbo commented 8 years ago

@arcenik Actually, that doesn't seem to work:

[celia.pts/6]$ var="$(file /etc/passwd > /dev/null 2>&1)"
[celia.pts/6]$ [ "$?" ] && echo whatever
whatever
[celia.pts/6]$ var="$(file /etc/passwdX > /dev/null 2>&1)"
[celia.pts/6]$ [ "$?" ] && echo whatever
whatever
[celia.pts/6]$ var="$(file /etc/passwdX > /dev/null 2>&1)"
[celia.pts/6]$ "$?" && echo whatever
bash: 1: command not found

However, this works:

[celia.pts/6]$ if var="$(file /etc/passwdX > /dev/null 2>&1)"; then echo whatever; fi
[celia.pts/6]$ if var="$(file /etc/passwd > /dev/null 2>&1)"; then echo whatever; fi
whatever

FransUrbo commented 8 years ago

So how's this:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                if import_pool "${ZFS_RPOOL}"; then
                        if root_fs="$(find_rootfs "${ZFS_RPOOL}")"; then
                                ZFS_BOOTFS="${root_fs}"
                        fi
                fi
        fi

FransUrbo commented 8 years ago

Or possibly even better:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                import_pool "${ZFS_RPOOL}" && \
                        root_fs="$(find_rootfs "${ZFS_RPOOL}")" && \
                                ZFS_BOOTFS="${root_fs}"
        fi

FransUrbo commented 8 years ago

This problem also exists a few lines above:

                OLD_IFS="${IFS}" ; IFS=";"
                for pool in ${POOLS}
                do
                        [ -z "${pool}" ] && continue

                        import_pool "${pool}"
                        ZFS_BOOTFS="$(find_rootfs "${pool}")"
                done
                IFS="${OLD_IFS}"

I'm suggesting:

                OLD_IFS="${IFS}" ; IFS=";"
                for pool in ${POOLS}
                do
                        [ -z "${pool}" ] && continue

                        import_pool "${pool}" && \
                                root_fs="$(find_rootfs "${pool}")" && \
                                        ZFS_BOOTFS="${root_fs}"
                        [ -n "${ZFS_BOOTFS}" ] && break
                done
                IFS="${OLD_IFS}"

arcenik commented 8 years ago

@FransUrbo

Here is a working example for bash:

$  a=$(true); [ $? -ne 0 ] && echo aaaaaaaaaaaa
$  a=$(false); [ $? -ne 0 ] && echo aaaaaaaaaaaa
aaaaaaaaaaaa

It also works with busybox sh, but not in the initramfs.

And here is the working script for the initramfs:

[attachment: zfs-initramfs-import-pool2]

FransUrbo commented 8 years ago

It still shouldn't set ZFS_BOOTFS if the import isn't successful, which my last examples fix.

arcenik commented 8 years ago

@FransUrbo: I'm trying your last proposal, but ${POOLS} is empty on my test system.

FransUrbo commented 8 years ago

It usually is... It's only used in a few rare configurations, when people have a root pool and a data pool (etc.), and when you specify boot=zfs zfs:AUTO and NOTHING else!

FransUrbo commented 8 years ago

Do note that there's TWO different places where almost the same code is used, and I showed solutions to both of them. Be sure to apply the correct code to the correct place.

FransUrbo commented 8 years ago

Closing this as fixed in 0.6.5.6-2, which is on its way up to the repo now.

arcenik commented 8 years ago

Did you really test it??

[attachment: zfs-boot-failed]

FransUrbo commented 8 years ago

Did you really test it??

No. I don't have any machine with root on zfs available at the moment.

I'll issue new debs right away.

arcenik commented 8 years ago

You could try this : https://github.com/arcenik/debian-zfs-root ;-)

FransUrbo commented 8 years ago

You could try this : https://github.com/arcenik/debian-zfs-root ;-)

About a year or so ago, I created a native Debian GNU/Linux ISO image with native ZFS support.

I usually use that, but I don't have time to set anything up now. I might even have the original VMs around, but I don't have time to find them either :).

Sorry about the mess-up.

FransUrbo commented 8 years ago

0.6.5.6-3 just pushed to the repo.
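
For anyone following along, picking it up should be roughly the usual apt workflow (a sketch; the package upgrade normally regenerates the initrd itself, the explicit update-initramfs is just in case it isn't triggered automatically):

# apt-get update
# apt-get upgrade
# update-initramfs -u -k all
# reboot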

arcenik commented 8 years ago

Version 0.6.5.6-3 is working with my install script. Debian with ZFS root is booting without problems.