Open webdock-io opened 1 week ago
According to https://docs.oracle.com/cd/E19253-01/819-5461/gbcve/:
The state can be one of the following: ONLINE, FAULTED, DEGRADED, UNAVAIL, or OFFLINE. If the state is anything but ONLINE, the fault tolerance of the pool has been compromised.
The offending error is from here:
And, as the OP says, it is caused by this filter for ONLINE:
@simondeziel this function is used to get the parent blocks in order to calculate shared disk limits to apply.
In this case I think it's safe to allow degraded pools to still be considered as parents for disk limits.
What do you think? Are there any other states we should also consider?
> In this case I think it's safe to allow degraded pools to still be considered as parents for disk limits.
Agreed.
> What do you think? Are there any other states we should also consider?
Looking at the other possible states on https://openzfs.github.io/openzfs-docs/man/master/7/zpoolconcepts.7.html#Device_Failure_and_Recovery, I think you are right that only ONLINE and DEGRADED should be considered OK.
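As a rough illustration of the change being discussed, here is a minimal Go sketch of a state filter that accepts both ONLINE and DEGRADED. It is not LXD's actual code: `usableState` and `parentDevices` are hypothetical names, and the sample text only approximates the config table printed by `zpool status -P -L` (assumed here to list leaf devices as absolute paths).

```go
package main

import (
	"fmt"
	"strings"
)

// usableState reports whether a ZFS state string still describes a working
// pool or vdev. Only ONLINE and DEGRADED are accepted, as agreed above.
func usableState(state string) bool {
	switch state {
	case "ONLINE", "DEGRADED":
		return true
	}

	return false
}

// parentDevices is a hypothetical helper, not LXD's real function. It scans
// the config table of zpool status output and returns the absolute device
// paths whose STATE column is usable.
func parentDevices(status string) []string {
	var devs []string

	for _, line := range strings.Split(status, "\n") {
		fields := strings.Fields(line)

		// Config rows have at least five columns: NAME STATE READ WRITE CKSUM.
		if len(fields) < 5 {
			continue
		}

		// Skips the header row as well as FAULTED, UNAVAIL and OFFLINE entries.
		if !usableState(fields[1]) {
			continue
		}

		// Assumption: leaf devices are printed as absolute paths, while the
		// pool and vdev rows (e.g. "tank", "mirror-0") are not.
		if strings.HasPrefix(fields[0], "/") {
			devs = append(devs, fields[0])
		}
	}

	return devs
}

func main() {
	// Illustrative sample only, not captured from a real system.
	sample := `  pool: tank
 state: DEGRADED
config:

	NAME           STATE     READ WRITE CKSUM
	tank           DEGRADED     0     0     0
	  mirror-0     DEGRADED     0     0     0
	    /dev/sda1  ONLINE       0     0     0
	    /dev/sdb1  UNAVAIL      0     0     0
`

	fmt.Println(parentDevices(sample))
}
```

With a filter like this, a degraded mirror still yields its healthy member as a parent block device, so the disk limits can be applied instead of failing outright.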
LXD v5.21.1 LTS Ubuntu Noble
If zpool status reports your pool as anything but "ONLINE", LXD will fail with this error. This does not seem to have been the case in the past, as I'm sure we've had degraded pools and did not see this before. Looking at the LXD source, it seems you are explicitly matching the string "ONLINE" against the output of zpool status and hard-failing if it is not found. This is incorrect behavior. A degraded pool is still functional, and LXD should proceed.