openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Newer Ubuntu doesn't recognize pool. #14453

Closed wateenellende closed 1 year ago

wateenellende commented 1 year ago

I have a pool set up that works as expected in Ubuntu 18.04. The root filesystem is on a completely different device than the pool.

After replacing the system disk with a fresh install of Ubuntu 22.04, the system can no longer find the pool.

Whenever I put the old system disk back in, it recognizes the pool again without issue. Switch back to 22.04, and it's not visible again.

wateenellende commented 1 year ago

With zfs-fuse, there is at least an error message stating that the pool cannot be imported because it is formatted with a newer version. The recommended fix is to switch to zfsutils; however, that doesn't recognize the pool at all.

mabod commented 1 year ago

So what does a plain "zpool import" say? And what is the output of "zpool version"?

wateenellende commented 1 year ago

On the new, 22.04 system:

$ sudo zpool import -a
no pools available to import
$ zpool version
zfs-2.1.4-0ubuntu0.1
zfs-kmod-2.1.4-0ubuntu0.1

Thank you for the fast response. Let me know if I can do anything.

wateenellende commented 1 year ago

I've just upgraded the old system disk to 20.04, and the pool is still recognized and imported without issues. Output of "zpool version" on the old system is now:

zfs-0.8.3-1ubuntu12.14
zfs-kmod-0.8.3-1ubuntu12.14
mabod commented 1 year ago

> On the new, 22.04 system:
>
> $ sudo zpool import -a
> no pools available to import
> $ zpool version
> zfs-2.1.4-0ubuntu0.1
> zfs-kmod-2.1.4-0ubuntu0.1
>
> Thank you for the fast response. Let me know if I can do anything.

Are you sure that the drives are ready and available? Do the zfs drives show up with this command: lsblk -f ?

wateenellende commented 1 year ago

The pool disk is not moving; I'm just plugging the new/old system disks in and out. Just in case, I checked, and yes, it looks like it's there (sdb):

# lsblk -f
NAME            FSTYPE            FSVER    LABEL      UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                                         
├─sda1          ext4              1.0                 d5a5374d-2053-4fdc-917a-01459fbd3787    971.5M    11% /boot
└─sda2          LVM2_member       LVM2 001            5sSXMf-zl3m-vv5J-EUXN-iONF-zR0C-bDRdNi                
  ├─data-root   ext4              1.0                 ece34756-4cc1-4ded-99ad-a07c6d18c9c3    208.7G     4% /
  └─data-swap   swap              1                   785552a5-54bd-4f40-ae65-af2d92f0cd5f                  
    └─cryptswap swap              1        cryptswap  317b3dda-9561-424b-8f43-6e0ad1278049                  [SWAP]
sdb                                                                                                         
├─sdb1          linux_raid_member 1.2      XWing:data b2a241ef-5f2c-45a0-baec-6b66365d1d88                  
└─sdb9             
wateenellende commented 1 year ago

Note: this was a 2-disk mirrored pool, and one disk died.

wateenellende commented 1 year ago

Following the upgrade of the "old" system to 20.04, the pool was imported fine on boot. I then did a "zpool export". The pool can now no longer be imported on the 20.04 system either. I am at a loss.

mabod commented 1 year ago

I assume sdb is the device in question. It has the filesystem tag linux_raid_member, version 1.2, which looks like an mdadm RAID member. If it were a plain ZFS filesystem, the output would look like this:

NAME   FSTYPE     FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                               
├─sda1 zfs_member 5000  zData 7982893580627098372                                 
└─sda9                                                                            
sdb                                                                               
├─sdb1 zfs_member 5000  zData 7982893580627098372                                 
└─sdb9                                                                            

Could it be that your sdb device was part of an mdadm array? What is the output of "fdisk -l /dev/sdb"?

wateenellende commented 1 year ago
# fdisk -l /dev/sdb
Disk /dev/sdb: 9.1 TiB, 10000831348736 bytes, 19532873728 sectors
Disk model: ST10000VN0004-1Z
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: FE285391-62D5-914A-8EC3-BF337CACE07D

Device           Start         End     Sectors  Size Type
/dev/sdb1         2048 19532855295 19532853248  9.1T Solaris /usr & Apple ZFS
/dev/sdb9  19532855296 19532871679       16384    8M Solaris reserved 1
rincebrain commented 1 year ago

What does pvs show, and cat /proc/mdstat?

I would also suspect, since lsblk is saying linux_raid_member, that something has noticed an otherwise stale or unused label on the disk, and that ZFS is ignoring the device because it's being held by someone else.
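
For reference, two non-destructive ways to check whether that linux_raid_member signature is a stale md superblock might be (wipefs with no options only lists signatures, and mdadm --examine only reads metadata):

# wipefs /dev/sdb1            # list every filesystem/raid signature found on the partition
# mdadm --examine /dev/sdb1   # dump the md superblock, if one is really there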

You could check this if you like by echoing 1 to /sys/module/zfs/parameters/zfs_flags, then trying zpool import and looking at /proc/spl/kstat/zfs/dbgmsg's output - it probably has a note about trying to open that disk and why it decided not to.

(Or if not, you could try zpool import -d /dev/sdb1 and zpool import -d /dev/sdb1 -D and see if either of those knows anything...)
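
Spelled out, that debugging sequence might look roughly like this (using the zfs_flags path mentioned above, run as root):

# echo 1 > /sys/module/zfs/parameters/zfs_flags   # enable extra ZFS debug logging
# zpool import                                    # trigger a scan so something gets logged
# cat /proc/spl/kstat/zfs/dbgmsg                  # look for notes about /dev/sdb
# zpool import -d /dev/sdb1                       # scan only that partition
# zpool import -d /dev/sdb1 -D                    # also look for destroyed pools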

Also, just to be clear for anyone who stumbles in here: my supposition is that the cache-based import was sidestepping whatever reason zpool now has for not seeing the pool, so exporting it means it's now biting you on both setups; it's not a version difference. :)
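
If anyone wants to test that theory, the cache file can be inspected and used explicitly; a minimal sketch, assuming the default Ubuntu location of /etc/zfs/zpool.cache:

# zdb -C -U /etc/zfs/zpool.cache            # dump the pool configs recorded in the cache file
# zpool import -c /etc/zfs/zpool.cache -a   # import only what the cache file knows about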

wateenellende commented 1 year ago
# pvs
  PV         VG   Fmt  Attr PSize    PFree
  /dev/sda2  data lvm2 a--  <237.26g    0 
# cat /proc/mdstat 
Personalities : 
unused devices: <none>
# zpool import
no pools available to import
# tail /proc/spl/kstat/zfs/dbgmsg
timestamp    message 
# zpool import -d /dev/sdb1
   pool: NAS
     id: 11106050819019747463
  state: DEGRADED
status: One or more devices contains corrupted data.
 action: The pool can be imported despite missing or damaged devices.  The
    fault tolerance of the pool may be compromised if imported.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
 config:

    NAS                       DEGRADED
      mirror-0                DEGRADED
        16604657609174159965  UNAVAIL
        sdb                   ONLINE

IT WORKED! THANK YOU!!!

Well, that was a bit premature; it's still not showing. It says the pool can be imported, but how?

wateenellende commented 1 year ago

ANSWER: the following command not only listed but actually imported the pool NAS:

# zpool import -a -d /dev/sdb1
# df 
Filesystem             1K-blocks       Used  Available Use% Mounted on
...
NAS                   9596547200 4856834560 4739712640  51% /NAS

Thank you all!!
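
A follow-up note: to have the pool come back automatically on the next boot, it should be enough to record it in the cache file after the successful import, since Ubuntu's zfs-import-cache.service imports from there; a minimal sketch, assuming the default cache file path:

# zpool set cachefile=/etc/zfs/zpool.cache NAS   # record the imported pool in the cache file
# systemctl enable zfs-import-cache.service      # normally enabled by default on Ubuntu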

rincebrain commented 1 year ago

You would attempt to import it with either -a or by putting the pool name after the command, but I'd strongly suggest figuring out why it wasn't listing it before, before doing that.
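
For example, using the device path and pool name from the output above:

# zpool import -d /dev/sdb1 NAS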

You'd need to do echo 1 | sudo tee /sys/module/zfs/parameters/zfs_flags before I'd expect it to log much in dbgmsg - and even then, I might be wrong and it'd only log it once you actually try to import.