Open danderson opened 2 years ago
Another way to screw yourself is to dd one disk to another. In that case the labels will be the same, but the serial numbers will be different. I wonder how ZFS would handle that case as well :)
This raises a question: which data should be the primary indicator for ZFS, the ZFS labels or the device serial numbers? IMHO serial numbers should usually be more reliable than labels, and I wouldn't be surprised if ZFS prefers them in cases of uncertainty.
System information
Describe the problem you're observing
My offsite backup storage is a VM that has whole physical drives mapped into it. Due to (I presume) a hypervisor configuration error on the host machine, all 3 drives in my array report the same device serial number:
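For context on the presumed hypervisor misconfiguration: libvirt exposes a disk serial to the guest via the serial element inside each disk definition in the domain XML. A sketch of how all three drives could end up reporting one serial (file paths, device names, and the serial string are illustrative, not taken from my actual config):

```xml
<!-- Hypothetical libvirt domain XML fragment: one of three disks,
     each carrying the same <serial> value -->
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/disk1.qcow2'/>
  <target dev='vdb' bus='virtio'/>
  <serial>SHARED_SERIAL</serial>
</disk>
<!-- vdc and vdd defined the same way, also with <serial>SHARED_SERIAL</serial> -->
```

If the same serial string was copy-pasted across all three disk stanzas (or templated without being varied), the guest would see exactly the collision described above.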
I can create a zpool with these devices just fine:
The dataset works fine at this point: reads, writes, scrubs, all happy. However, upon reboot:
Still fine, but note that vdd has now taken on its serial-number name in the zpool output. In the race to own the /dev/disk/by-id symlink for the shared serial, vdd won on this particular boot. Even more worryingly, Linux seems to have let different drives win the race for different partition symlinks:
Notice that the device and part1 symlink point at vdd, but the part9 symlink points at vdb.
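The mechanism here appears to be udev's last-writer-wins behavior: each device claims the same by-id name, and whichever udev processes last ends up owning the symlink, independently per partition. A minimal simulation with plain files (directory and symlink names are made up; this only illustrates the race outcome, not real udev):

```shell
# Simulate three devices all claiming one by-id symlink name.
demo=$(mktemp -d)
cd "$demo"
touch vdb vdc vdd                      # stand-ins for the three block devices
# Each "device" writes the same symlink; the last one processed wins.
for dev in vdb vdc vdd; do
    ln -sf "$dev" virtio-SHARED_SERIAL
done
readlink virtio-SHARED_SERIAL          # here: vdd, the last in the loop
```

On a real boot the processing order is effectively nondeterministic, which is why a different drive can win on each reboot, and why part1 and part9 can end up pointing at different disks.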
Finally, a few more reboots until a different drive wins the serial number ownership battle, and:
With even more reboots, I can get vdc to win the race, at which point the pool goes FAULTED and all hope is lost.
Now, this is obviously a very silly way to run a zpool, and the immediate fix is "don't share serial numbers between drives". However, I was surprised that this confusion was enough to break ZFS, as I'd have expected each device to have a label that would have let ZFS tell them apart and untangle the renaming confusion.
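My expectation was that the on-disk label, which records a per-vdev GUID, would let ZFS recover each device's identity regardless of what name udev hands it. A toy illustration of that idea using plain files (file names and GUID values are made up; real labels are read with zdb -l, not grep):

```shell
# Sketch: identity stored *on* the device survives a node rename.
tmp=$(mktemp -d)
printf 'guid=111\n' > "$tmp/vdb"       # "label" written at pool creation
printf 'guid=222\n' > "$tmp/vdc"
# After reboot, udev hands the first device out under a different name:
mv "$tmp/vdb" "$tmp/vdd"
# Matching on the label contents, not the node name, still finds the vdev:
cat "$tmp/vdd"                          # prints guid=111
```

If import worked this way here, the by-id symlink shuffle would be cosmetic; instead the pool degrades and eventually faults.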
Describe how to reproduce the problem
Unsure as to how exactly the VM is configured (trying to get the libvirt config from my host now), but roughly:

1. Create a VM with three virtio drives that all report the same device serial number.
2. Create a zpool across the three drives.
3. Reboot and run zpool status, repeating until the array goes DEGRADED due to device confusion.

Include any warning/errors/backtraces from the system logs
Checked dmesg and the system journal; ZFS logged nothing in either.