Hi. Can someone please point me in the right direction if I am perhaps doing this the wrong way? 🤔 Any help would be much appreciated. Thank you.
While this could be a bug, the mailing list is for user support questions.
@drescherjm this looks like a bug, which is why I posted it as an issue :) This works nicely on FreeBSD, and it broke for me when making the shift to ZoL.
It sounds like this only occurs for the `by-partuuid` paths, and not the other `by-*` paths. What I'd suggest checking is that the `uuid_of_partition` remains the same after running `sgdisk` and `partprobe`. When doing the `zpool online -e` the device will need to be closed and reopened. If the path to the device has changed due to a new uuid it will be considered unavailable.
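For example, something along these lines should confirm the uuid survives (a minimal sketch, assuming the pool sits on partition 2 of `/dev/sdb`):

```sh
# Record the partuuid before the resize.
blkid -s PARTUUID -o value /dev/sdb2

# Delete and recreate the partition with the same uuid, then re-read the table.
sgdisk -d 2 -n 2:0:0 -t 2:BF01 -u 2:existing_uuid_of_partition /dev/sdb
partprobe /dev/sdb

# The value printed here should match the one recorded above.
blkid -s PARTUUID -o value /dev/sdb2
```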
@behlendorf I ensured that the uuid of the partition in question remains the same before and after the `sgdisk` operation; indeed, it wouldn't be available to ZFS otherwise. If there's anything else you would like me to check, please feel free to ask. Thank you for your response.
@sonicaj after the failure can you please check the `/proc/spl/kstat/zfs/dbgmsg` log. It should provide some additional information about why the device could not be reopened.
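For example (a minimal sketch; on some builds the `zfs_dbgmsg_enable` module parameter must be set first):

```sh
# Enable the internal debug log if it isn't already, then dump the tail of it.
echo 1 > /sys/module/zfs/parameters/zfs_dbgmsg_enable
tail -n 50 /proc/spl/kstat/zfs/dbgmsg
```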
```
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 create pool version 5000; software version unknown; uts localhost 5.2.0-3-amd64 #1 SMP Debian 5.2.17-1 (2019-09-26) x86_64
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@async_destroy=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@empty_bpobj=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@lz4_compress=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@multi_vdev_crash_dump=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@spacemap_histogram=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@enabled_txg=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@hole_birth=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@extensible_dataset=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@embedded_data=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@bookmarks=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@filesystem_limits=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@large_blocks=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@large_dnode=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@sha512=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@skein=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@edonr=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@userobj_accounting=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@encryption=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@project_quota=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@device_removal=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@obsolete_counts=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@zpool_checkpoint=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@spacemap_v2=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@allocation_classes=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@resilver_defer=enabled
1579278631 spa_history.c:319:spa_history_log_sync(): txg 4 set feature@bookmark_v2=enabled
1579278631 mmp.c:249:mmp_thread_start(): MMP thread started pool 'failed_pool' gethrtime 25110035470858
1579278631 spa_history.c:306:spa_history_log_sync(): command: zpool create -f failed_pool /dev/disk/by-partuuid/6b1673f3-1b0a-42ca-9d31-97bdede50948
1579278687 metaslab.c:2667:metaslab_condense(): condensing: txg 29161, msp[8] ffff9a2bcb0b1800, vdev id 0, spa freenas-boot, smp size 16488, segments 503, forcing condense=FALSE
1579278706 vdev.c:124:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/6b1673f3-1b0a-42ca-9d31-97bdede50948': open error=2 count=100
1579278706 spa.c:7577:spa_async_request(): spa=failed_pool async request task=1
1579278706 zio.c:3479:zio_dva_allocate(): failed_pool: metaslab allocation failure: zio ffff9a2bd8fbea90, size 1024, error 28
1579278706 zio.c:3479:zio_dva_allocate(): failed_pool: metaslab allocation failure: zio ffff9a2bd2b3cd80, size 2048, error 28
1579278706 zio.c:3479:zio_dva_allocate(): failed_pool: metaslab allocation failure: zio ffff9a2bd2b3a6c0, size 1024, error 28
1579278706 zio.c:3479:zio_dva_allocate(): failed_pool: metaslab allocation failure: zio ffff9a2bd2b3c3d0, size 512, error 28
```
Kindly let me know if you would like more output. This is from the end of the log; the entries before this point looked older, from before I tried the reproduction case again.
I think `sgdisk` deletes the partition entry and then recreates it with the same partition uuid. Do you think the delay in between causes this ZFS behaviour?
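One way to check that hypothesis might be to poll the symlink while `sgdisk` runs in another terminal (a rough sketch, using the partuuid from my pool above):

```sh
# If the link disappears for longer than ZFS is willing to wait,
# that window would explain the failed reopen.
while true; do
    ls -l /dev/disk/by-partuuid/6b1673f3-1b0a-42ca-9d31-97bdede50948 2>&1
    sleep 0.1
done
```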
It's definitely possible. The reopen logic expects the symlink to be removed and recreated, and will wait for up to a full second for this to happen if needed. Here's the relevant line from the log. After trying to open `/dev/disk/by-partuuid/6b1673f3-1b0a-42ca-9d31-97bdede50948` repeatedly for a full second, and always failing with ENOENT, the reopen was allowed to fail.

```
1579278706 vdev.c:124:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/6b1673f3-1b0a-42ca-9d31-97bdede50948': open error=2 count=100
```
But why doesn't that happen with the other `by-*` paths? I mean, when `sgdisk` removes the partition entry, shouldn't say `/dev/sdb2` be removed as well, since it no longer exists, or does Linux hold on to that link for a while? Btw, on FreeBSD this works nicely. Do you think this is something ZoL can incorporate (handling this case gracefully)?
> But why doesn't that happen with other by-* paths?

That's a good question. ZoL doesn't treat the other `by-*` paths any differently, so I'm not sure why you would only see this with `by-partuuid`. It may be that there's something different about the udev rules which generate these links which causes the issue. You may be able to get a better idea of what's going on by using `udevadm monitor` to get better visibility into how the rules are executed.
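For example, one possible invocation:

```sh
# Print kernel uevents and udev events, with their properties, while the
# partition table is rewritten in another terminal.
udevadm monitor --kernel --udev --property
```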
If you're comfortable rebuilding ZFS from source you could increase the allowed timeout to see if waiting longer would help, or whether there's some other issue. If you clone the master source from GitHub you would need to increase the `zfs_vdev_open_timeout_ms` value in `vdev_disk.c`.
```c
/*
 * Wait up to zfs_vdev_open_timeout_ms milliseconds before determining the
 * device is missing. The missing path may be transient since the links
 * can be briefly removed and recreated in response to udev events.
 */
static unsigned zfs_vdev_open_timeout_ms = 1000;
```
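For instance, a local test build could simply bump the value (a hypothetical patch, not an upstream change; ~4 seconds roughly covers the `sgdisk`/`partprobe` window discussed above):

```c
/* Extend the retry window so a slower symlink recreate is tolerated. */
static unsigned zfs_vdev_open_timeout_ms = 4000;
```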
The way each platform handles block devices is very different. We've made the code as similar as possible, but FreeBSD and Linux each have their own challenges.
@behlendorf I'll try the suggested fix this weekend and see if I still run into the issue. I am thinking of changing it to 3-4 seconds, which is more or less the time taken by `sgdisk`. Will let you know how it goes, thank you.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
@behlendorf thank you for looking into it. I was testing this on the latest Debian Bullseye version and I have not been able to reproduce it anymore. Closing it.
### System information
### Describe the problem you're observing
Basically my objective is to increase the partition size for disks I have in a pool and make sure ZFS registers the new size nicely. I first followed the steps below, but created the pool with the partition path, and that worked very nicely: ZFS successfully picked up the updated partition size. Moving on, I tried the following steps, which basically only differ in how I created the pool.
The last `zpool online` command gets stuck forever, and if I run `zpool status` on another tab I see it hang as well. Can someone please advise if I am handling this the wrong way? It works nicely when I specify the partition path, but doesn't when I use the partition uuid. Also, I see similar results when `autoexpand` is `on`.
### Describe how to reproduce the problem
1) Format the disk with GPT partitions
2) `zpool create failed_pool /dev/disk/by-partuuid/uuid_of_partition`
3) `sgdisk -d 2 -n 2:0:0 -t 2:BF01 -u 2:existing_uuid_of_partition /dev/sdb` (please edit the disk path and partition number to reflect yours)
4) `partprobe /dev/sdb`
5) `zpool online -e failed_pool GUID_of_device` (the whole sequence is consolidated in the script below)
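As a single script, the reproduction might look like this (a sketch; the disk and partition number are assumptions to adapt to your system, and `zpool status -g`, which prints vdev GUIDs instead of device names, is one way to look up the leaf vdev GUID):

```sh
#!/bin/sh
DISK=/dev/sdb

# Create the pool against the by-partuuid path of partition 2.
PARTUUID=$(blkid -s PARTUUID -o value "${DISK}2")
zpool create failed_pool "/dev/disk/by-partuuid/${PARTUUID}"

# Delete and recreate partition 2 at maximum size, keeping the same uuid.
sgdisk -d 2 -n 2:0:0 -t 2:BF01 -u 2:"${PARTUUID}" "${DISK}"
partprobe "${DISK}"

# Look up the leaf vdev GUID for the partition.
zpool status -g failed_pool

# Expand using the GUID printed above; this is the step that hangs.
zpool online -e failed_pool <GUID_of_device>
```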
Looking forward to hearing from you guys, thank you!