grahamc opened this issue 3 years ago
As a bit more detail, I am curious about https://github.com/openzfs/zfs/commit/d441e85dd754ecc15659322b4d36796cbd3838de. It says:
... The ZED, which is monitoring udev events, passes the change event along to zfs_deliver_dle() if the disk or partition contains a zfs_member as identified by blkid.
However, the disk that belongs to ZFS and receives the growth notification doesn't identify as a zfs_member, although its partition does:
[root@ip-172-31-43-243:~]# blkid /dev/nvme1n1
/dev/nvme1n1: PTUUID="bee5d98c-a046-cc49-9cac-026f8320fffb" PTTYPE="gpt"
[root@ip-172-31-43-243:~]# blkid /dev/nvme1n1p1
/dev/nvme1n1p1: LABEL="tank" UUID="17154194770612704342" UUID_SUB="11351383998605785512" BLOCK_SIZE="4096" TYPE="zfs_member" PARTLABEL="zfs-7483cdb691eca6d9" PARTUUID="1b48f286-278a-e04d-8a50-46e43221766d"
It isn't clear to me (my own ignorance, my apologies) whether that commit handles this case.
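For illustration only (this is my guess at the kind of check involved, not what the ZED actually does), something like the following would treat a disk as ZFS-owned if it or any of its partitions reports TYPE=zfs_member:
#!/bin/sh
# Rough sketch, not the ZED's actual logic: given a disk that fired a change
# event, check the disk itself and each of its partitions for a zfs_member
# signature, since here only the partition carries it.
disk=/dev/nvme1n1   # example device from this report
lsblk -nrpo NAME "$disk" | while read -r dev; do
    if blkid -o value -s TYPE "$dev" 2>/dev/null | grep -qx zfs_member; then
        echo "$dev is a zfs_member"
    fi
done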
It looks like my best bet for getting a list of devices to pass to zpool online -e is:
$ zpool list -vH "$pool" | awk '($10 == "ONLINE") {print $1;}'
rpool
raidz1
ata-ST12000NM0008-2H3101_ZL001AMF
ata-ST2000DM001-1ER164_W4Z16WVT-part1
ata-ST2000DM001-1ER164_W4Z16WWV-part1
raidz1
ata-WDC_WD100EMAZ-00WJTA0_2YK1581D
ata-WDC_WD100EMAZ-00WJTA0_JEHR59DZ
ata-WDC_WD100EMAZ-00WJTA0_JEHRNYUZ
This is going to pass invalid entries (raidz1, etc.), but at least it properly passes every whole disk and partition.
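A possible refinement, strictly as a sketch (the vdev-type patterns below are my own assumption, not an exhaustive list), would be to drop the pool row and the grouping rows so only leaf disks and partitions remain:
pool=rpool   # hypothetical pool name for the example
zpool list -vH "$pool" | awk -v pool="$pool" '
  $1 == pool { next }                                                   # skip the pool summary row
  $1 ~ /^(mirror|raidz[123]?|draid|spare|log|cache|special)/ { next }   # skip grouping rows
  $10 == "ONLINE" { print $1 }                                          # keep leaf disks/partitions
'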
I've found another approach:
[grahamc@kif:~/.zpool.d]$ zpool status -vp -c zzzbogus rpool | awk '($2 == "ONLINE" && $6 == "THIS-IS-A-DEVICE-37108bec-aff6-4b58-9e5e-53c7c9766f05") {print $1;}'
ata-ST12000NM0008-2H3101_ZL001AMF
ata-ST2000DM001-1ER164_W4Z16WVT-part1
ata-ST2000DM001-1ER164_W4Z16WWV-part1
ata-WDC_WD100EMAZ-00WJTA0_2YK1581D
ata-WDC_WD100EMAZ-00WJTA0_JEHR59DZ
ata-WDC_WD100EMAZ-00WJTA0_JEHRNYUZ
[grahamc@kif:~/.zpool.d]$ cat zzzbogus
#!/bin/sh
echo "THIS-IS-A-DEVICE-37108bec-aff6-4b58-9e5e-53c7c9766f05"
I think what you want is:
for pool in $(zpool list -H | awk '{print $1}'); do
  for vdev in $(zpool list -H -vg "$pool" | awk '($10 == "ONLINE") {print $1;}'); do
    zpool online -e "$pool" "$vdev"
  done
done
Using the vdev GUID will avoid any ambiguity about the device name.
I don't think that works out. For example:
[grahamc@kif:~]$ zpool list -H -vg "rpool"
rpool 32.7T 22.4T 10.3T - - 53% 68% 1.00x ONLINE -
13394846054374141118 5.44T 5.06T 386G - - 68% 93.1% - ONLINE
3919657051718159816 - - - - - - - - ONLINE
3484453173638860084 - - - - - - - - ONLINE
7241866703652226212 - - - - - - - - ONLINE
2902602422871981357 27.3T 17.4T 9.90T - - 51% 63.7% - ONLINE
15011291225760097991 - - - - - - - - ONLINE
3017261739652091422 - - - - - - - - ONLINE
3794566743564545168 - - - - - - - - ONLINE
[grahamc@kif:~]$ sudo zpool online -e rpool 2902602422871981357
cannot expand 2902602422871981357: operation not supported on this type of pool
[grahamc@kif:~]$ sudo zpool online -e rpool 15011291225760097991
cannot expand 15011291225760097991: no such device in pool
On IRC, AllanJude suggested ZPOOL_VDEV_NAME_GUID=YES, but that doesn't appear to do it either:
[root@kif:~]# ZPOOL_VDEV_NAME_GUID=YES zpool online -e rpool 15011291225760097991
cannot expand 15011291225760097991: no such device in pool
[root@kif:~]# ZPOOL_VDEV_NAME_GUID=YES zpool online -e rpool 2902602422871981357
cannot expand 2902602422871981357: operation not supported on this type of pool
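The closest thing to a workaround I can see, sketched here without having verified it end to end, is to ask the CLI for full device paths instead of GUIDs; ZPOOL_VDEV_NAME_PATH is documented in zpool(8), though whether online -e then actually expands a whole-disk vdev is still the open question:
#!/bin/sh
# Sketch, not verified here: ZPOOL_VDEV_NAME_PATH makes zpool print full
# device paths for vdev names, which can be passed to `zpool online` by name,
# unlike the raw GUIDs above.
pool=rpool
ZPOOL_VDEV_NAME_PATH=YES zpool list -vH "$pool" | awk '
  $1 ~ /^\// && $10 == "ONLINE" { print $1 }   # leaf devices as /dev paths
'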
Note that the kernel and ZFS versions have bumped slightly since this ticket was opened:
[root@kif:~]# uname -a
Linux kif 5.14.14 #1-NixOS SMP Wed Oct 20 09:57:59 UTC 2021 x86_64 GNU/Linux
[root@kif:~]# zfs version
zfs-2.1.1-1
zfs-kmod-2.1.1-1
System information
Describe the problem you're observing
My apologies for this issue: it started as an attempt to report that resizing an EBS volume online does not cause autoexpand to kick in, and it seems I've potentially uncovered a handful of bugs in the process.
I have an AWS EC2 AMI with two EBS volumes: one is /boot, a partitioned disk formatted as FAT for /boot/; the other is the root volume holding the ZFS pool.
The AMI is created with a 2G root disk, and then spawned as an EC2 instance with a 9G root:
I set autoexpand on the pool as part of creating the AMI:
I expected the zpool to autoexpand on startup / import, but it doesn't seem to do that.
Having looked into this before with the zpool on a partition, I realized it might be because udev and zed typically trigger autoexpand. Hoping this was the case, I live-expanded the EBS volume while the instance was running.
After I initiated the expansion in the AWS console, I saw kernel events in journalctl -f. I also monitored udev and saw events there as well.
However, my pool is still not expanded.
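A small diagnostic sketch (not output from this instance): expandsize and autoexpand are pool properties, so zpool list can at least show whether ZFS sees the extra space.
# If EXPANDSZ stays "-" after the EBS resize, ZFS has not noticed the larger
# device yet; if it shows a size, only the expansion step itself is missing.
zpool list -o name,size,expandsize,autoexpand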
My next inclination was to have a oneshot systemd unit expand the disks on startup; however, this doesn't seem to be straightforward either: I think I need to manually enumerate every disk that may be expandable.
I have this snippet that I've used in the past when hosting ZFS pools on partitions:
but this fails when entire disks are dedicated, because -P seems to erroneously resolve the path of a whole disk to its first partition, and zpool online -e chokes on that. Now I'm feeling stuck and unsure of a reliable, generic way to make autoexpand happen on these machines.
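For completeness, the rough shape of the oneshot idea, assuming the leaf-vdev enumeration above works; the script path and all names here are hypothetical, and it would be run from a oneshot unit's ExecStart at boot:
#!/bin/sh
# Hypothetical /usr/local/bin/zpool-expand-all: walk every imported pool and
# attempt `zpool online -e` on each ONLINE leaf vdev, printed as a /dev path.
set -eu
for pool in $(zpool list -H -o name); do
  ZPOOL_VDEV_NAME_PATH=YES zpool list -vH "$pool" | awk '
    $1 ~ /^\// && $10 == "ONLINE" { print $1 }   # leaf devices only
  ' | while read -r dev; do
    zpool online -e "$pool" "$dev" || echo "could not expand $dev in $pool" >&2
  done
done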
Describe how to reproduce the problem