openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Unable to specify the topmost dataset as rootfs #9107

Closed yacoob closed 5 years ago

yacoob commented 5 years ago

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  Buster
Linux Kernel          4.19.0-5-amd64
Architecture          amd64
ZFS Version           0.7.12
SPL Version           0.7.12

Describe the problem you're observing

I have a working Debian system with rootfs on ZFS. I'm trying to move / to a separate pool right now.

Old pool:

$ zfs list
NAME                    USED  AVAIL  REFER  MOUNTPOINT
lcl                    4.27T  2.75T    96K  none
lcl/sys                 254G  2.75T    96K  legacy
lcl/sys/home            168G  2.75T  6.34G  /home
lcl/sys/root           85.5G  2.75T  38.1G  /

and it boots with root=ZFS=lcl/sys/root as kernel parameter.

New pool:

lilith                 45.6M  4.77G  45.1M  /mnt/flash-oo/boot
magi                   45.4G   179G  1.68G  /mnt/flash-oo
magi/home              7.38G   179G  6.34G  /mnt/flash-oo/home
magi/root              5.64M   179G  5.64M  /mnt/flash-oo/root
magi/var               36.4G   179G  1.31G  /mnt/flash-oo/var
magi/var/lib           35.0G   179G  34.4G  /mnt/flash-oo/var/lib
magi/var/lib/docker     613M   179G   613M  /mnt/flash-oo/var/lib/docker

lilith is the boot pool, magi is the root pool. As you can see, I made the top-most dataset of each pool the filesystem itself (/boot and / respectively). This makes specifying the rootfs tricky: neither root=ZFS=magi nor root=ZFS=magi/ works. I'd rather not rely on root=zfs:AUTO to control which pool is booted, and bootfs has the same problem.

Is this configuration even supported? Or is there a (silent?) assumption that rootfs will not be the top-most dataset?

I do apologise for the horrible puns in the paths/names :D

yacoob commented 5 years ago

Judging by two threads on the mailing list (1, 2) a configuration like that is indeed discouraged. Luckily, I can still redo the new pool without much problem.

If this configuration is really discouraged, what would be a good place to add a note about it, to keep people from going down that path? :)

johnnyjacq16 commented 5 years ago

Could you run the command zfs get all magi and post the output? Also, root=ZFS=magi/ (with the trailing slash) is incorrect as a kernel parameter in GRUB.
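
To illustrate why the trailing slash matters, here is a minimal sketch of how a root=ZFS= argument could be split into pool and dataset. The function name parse_zfs_root is hypothetical and this is not the actual initramfs code; it only assumes that the pool name is everything before the first slash and that a ZFS dataset name cannot end in a slash.

```shell
#!/bin/sh
# Illustrative sketch only, NOT the real initramfs script.
# parse_zfs_root is a hypothetical helper name.
parse_zfs_root() {
    arg=$1
    case $arg in
        root=ZFS=*) dataset=${arg#root=ZFS=} ;;
        *) echo "not a root=ZFS= argument: $arg" >&2; return 1 ;;
    esac
    # An empty name or a trailing slash is not a valid dataset name.
    case $dataset in
        */|'') echo "invalid dataset name: '$dataset'" >&2; return 1 ;;
    esac
    # The pool is the first path component of the dataset name.
    pool=${dataset%%/*}
    echo "pool=$pool dataset=$dataset"
}

parse_zfs_root "root=ZFS=lcl/sys/root"
parse_zfs_root "root=ZFS=magi"
parse_zfs_root "root=ZFS=magi/" || true   # rejected: trailing slash
```

With a bare pool name such as magi, pool and dataset come out identical, which is the edge case this issue is about.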

yacoob commented 5 years ago

This is on a system booted from lcl, with magi imported via zpool import -R /mnt/flash-oo magi:

# zfs get all magi
NAME  PROPERTY              VALUE                  SOURCE
magi  type                  filesystem             -
magi  creation              Wed Jul 31 19:06 2019  -
magi  used                  45.4G                  -
magi  available             179G                   -
magi  referenced            1.68G                  -
magi  compressratio         1.59x                  -
magi  mounted               yes                    -
magi  quota                 none                   default
magi  reservation           none                   default
magi  recordsize            128K                   default
magi  mountpoint            /mnt/flash-oo          local
magi  sharenfs              off                    default
magi  checksum              on                     default
magi  compression           lz4                    local
magi  atime                 on                     default
magi  devices               on                     default
magi  exec                  on                     default
magi  setuid                on                     default
magi  readonly              off                    default
magi  zoned                 off                    default
magi  snapdir               hidden                 default
magi  aclinherit            restricted             default
magi  createtxg             1                      -
magi  canmount              noauto                 local
magi  xattr                 sa                     local
magi  copies                1                      default
magi  version               5                      -
magi  utf8only              on                     -
magi  normalization         formD                  -
magi  casesensitivity       sensitive              -
magi  vscan                 off                    default
magi  nbmand                off                    default
magi  sharesmb              off                    default
magi  refquota              none                   default
magi  refreservation        none                   default
magi  guid                  13077437144030891943   -
magi  primarycache          all                    default
magi  secondarycache        all                    default
magi  usedbysnapshots       0B                     -
magi  usedbydataset         1.68G                  -
magi  usedbychildren        43.8G                  -
magi  usedbyrefreservation  0B                     -
magi  logbias               latency                default
magi  dedup                 off                    default
magi  mlslabel              none                   default
magi  sync                  standard               default
magi  dnodesize             auto                   local
magi  refcompressratio      2.02x                  -
magi  written               1.68G                  -
magi  logicalused           71.1G                  -
magi  logicalreferenced     3.05G                  -
magi  volmode               default                default
magi  filesystem_limit      none                   default
magi  snapshot_limit        none                   default
magi  filesystem_count      none                   default
magi  snapshot_count        none                   default
magi  snapdev               hidden                 default
magi  acltype               posixacl               local
magi  context               none                   default
magi  fscontext             none                   default
magi  defcontext            none                   default
magi  rootcontext           none                   default
magi  relatime              on                     local
magi  redundant_metadata    all                    default
magi  overlay               off                    default

ghfields commented 5 years ago

(Without a bpool) someone complained on the mailing list that a root filesystem on a pool's root dataset broke after #8052. I wrote #8356 (merged May 6) to correct that situation.

It is possible your version doesn't have this fix.

Adding zfsdebug=1 to your kernel parameters might give some output that will help us track down the issue.

yacoob commented 5 years ago

@ghfields yup, I don't have that patch yet. All right, I'll lay out my new pool better, happy to see that even the edge cases are addressed :)

Thanks for the explanations!

GregorKopka commented 5 years ago

> If this configuration is really discouraged, what would be a good place to add a note about it, to prevent people from following down that path? :)

Certainly in any howto regarding ZFS, since using the root dataset of a pool as anything other than a container from which child datasets inherit properties is likely to cause problems later.
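
As a rough sketch of such a layout, the script below only prints the commands for review and runs nothing against a real pool. The function name plan_layout and the dataset name ROOT/debian are hypothetical, chosen for illustration. Note that mountpoint is an inherited property, so existing children that currently inherit their mountpoint from the pool's root dataset (magi/home, magi/var in this thread) would need explicit mountpoints afterwards.

```shell
#!/bin/sh
# Dry run: print commands for a layout where the pool's root dataset is
# only a property container. Nothing here touches a real pool.
# plan_layout and "ROOT/debian" are hypothetical names for illustration.
plan_layout() {
    pool=$1
    rootfs=$pool/ROOT/debian
    # Root dataset: never mounted, only passes properties to children.
    echo "zfs set canmount=off $pool"
    echo "zfs set mountpoint=none $pool"
    # Unmounted container for one or more root filesystems.
    echo "zfs create -o canmount=off -o mountpoint=none $pool/ROOT"
    # The actual root filesystem, mounted at / by the initramfs.
    echo "zfs create -o canmount=noauto -o mountpoint=/ $rootfs"
    echo "# kernel parameter: root=ZFS=$rootfs"
}

plan_layout magi
```

The root filesystem then has an unambiguous name below the pool, so the root=ZFS= parameter never has to refer to the pool's root dataset.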