I think this should be relatively simple to implement; most of the changes will be in blivet (the storage library the role uses).
The proposal is to add a new shared: (true|false) option for volumes to abstract this functionality in the storage role.
So would this be for the volume (in the storage role this is the LV) or for the pool (for us this is the VG)? If I understand it correctly, the VG itself is shared, so we'll simply create all volumes (LVs) with the --activate sy option if the VG is shared, so I think it makes sense to add the shared option on the pool (VG) level. Or is it possible to have some LVs in the VG "not shared"?
A few additional questions:

1. What should we do if the VG already exists, but the shared option doesn't "match" the existing VG -- so if the user sets shared: true but the VG was not created as shared (or vice versa)? Is this simply a user error, or can we "convert" the VG to shared (and do we want to support this case)?
2. Do we also need to call vgchange --lock-stop when deactivating the VG (for example before removing it)?

Or is it possible to have some LVs in the VG "not shared"?
It is possible for an LV in a shared VG to be activated in exclusive mode so that only the first activating node in the cluster can use it. It might make sense to have a shared option for the pool and an optional activation option for the volume which defaults to ay or sy depending on whether the VG is shared. For the gfs2 role, we only use shared LVs.
The "LV activation" section in this doc describes the options: https://man7.org/linux/man-pages/man8/lvmlockd.8.html
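For illustration only, a rough sketch of how that split might look to a user of the storage role. The shared and activation keys are the options proposed in this thread (they do not exist in the role at this point), and the inventory group, pool, volume, and disk names are invented:

```yaml
- hosts: cluster                       # hypothetical inventory group
  roles:
    - role: linux-system-roles.storage
      vars:
        storage_pools:
          - name: vg_shared            # hypothetical VG (pool) name
            type: lvm
            disks: [sdb]
            shared: true               # proposed pool-level option -> vgcreate --shared
            volumes:
              - name: lv_gfs2          # hypothetical LV (volume) name
                size: 10g
                # proposed per-volume option; could default to shared ("-asy") when
                # the pool is shared and to exclusive ("-ay") otherwise
                activation: shared
```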
What should we do if the VG already exists, but the shared option doesn't "match" the existing VG -- so if the user sets shared: true but the VG was not created as shared (or vice versa)? Is this simply a user error, or can we "convert" the VG to shared (and do we want to support this case)?
In that case there's an assumption in the playbook that doesn't match how the shared storage is being used in the cluster. I would treat it as an error to be safe.
Do we also need to call vgchange --lock-stop when deactivating the VG (for example before removing it)?
As far as I know, removal has to be done in this order:

1. vgchange --activate n VG/LV on all cluster members [edit: this should have been lvchange, sorry]
2. lvremove VG/LV on one member
3. vgchange --lock-stop VG on all-but-one members
4. vgremove VG on the remaining member

Perhaps we could loop in @teigland on this to check my assertions.
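A minimal playbook sketch of that sequence, assuming the default linear strategy so each task finishes on all nodes before the next one starts. The inventory group, VG, and LV names are made up, and the first step uses lvchange per the correction above:

```yaml
- hosts: cluster                       # hypothetical inventory group
  become: true
  tasks:
    - name: Deactivate the shared LV on every cluster member
      ansible.builtin.command: lvchange --activate n vg_shared/lv_gfs2

    - name: Remove the LV on one member only
      ansible.builtin.command: lvremove -y vg_shared/lv_gfs2
      run_once: true

    - name: Stop the VG lockspace on every member except the one doing the removal
      ansible.builtin.command: vgchange --lock-stop vg_shared
      when: inventory_hostname != ansible_play_hosts | first

    - name: Remove the VG on the remaining member
      ansible.builtin.command: vgremove -y vg_shared
      run_once: true
```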
It is possible for an LV in a shared VG to be activated in exclusive mode so that only the first activating node in the cluster can use it. It might make sense to have a shared option for the pool and an optional activation option for the volume which defaults to ay or sy depending on whether the VG is shared. For the gfs2 role, we only use shared LVs.
All the possible options: in a local VG (not shared), the e|s characters are ignored, and all activation is -ay. In a shared VG, both -ay and -aey mean exclusive, and only -asy means shared.
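If the role grew such a per-volume activation option, a hypothetical mapping from option values to LVM activation flags might look like this (the value names are invented; only the flag semantics come from the description above):

```yaml
# Hypothetical "activation" value -> lvcreate/lvchange flag
activation_to_flag:
  shared: "-asy"      # shared activation; only meaningful in a shared VG
  exclusive: "-aey"   # exclusive; in a shared VG plain "-ay" behaves the same way
  default: "-ay"      # in a local (non-shared) VG everything is effectively "-ay"
```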
As far as I know, removal has to be done in this order:

1. vgchange --activate n VG/LV on all cluster members
2. lvremove VG/LV on one member
3. vgchange --lock-stop VG on all-but-one members
4. vgremove VG on the remaining member
Right, pick one node to do lvremove and vgremove. That node would skip the lockstop which is built into vgremove.
Right, pick one node to do lvremove and vgremove. That node would skip the lockstop which is built into vgremove.
So I guess we'll just assume that "someone else" did the lockstop calls and we are only going to do standard lvremove/vgremove calls.
For the gfs2 role, that sounds fine. The HA cluster resources will manage the locks in the normal case and we don't support removing the volume groups in the role because that's a destructive operation that the user should consider carefully.
In the new gfs2 role we idempotently create LVs and set them up for shared storage using community.general.* modules to set up PVs as normal and then:

1. the --shared option to vgcreate,
2. vgchange --lock-start <VG>, and
3. the --activate sy option to lvcreate (sketched below).
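For reference, a minimal sketch of those three steps, assuming community.general.lvg accepts extra vgcreate flags via vg_options and community.general.lvol passes extra lvcreate flags via opts. The VG, LV, and device names are made up, and the real gfs2 role's tasks may differ:

```yaml
- name: Create the shared VG (vgcreate --shared)
  community.general.lvg:
    vg: vg_shared                  # hypothetical VG name
    pvs: /dev/sdb                  # hypothetical PV
    vg_options: --shared

- name: Start the lockspace on this node (vgchange --lock-start)
  ansible.builtin.command: vgchange --lock-start vg_shared
  changed_when: false              # illustration only; real idempotence handling omitted

- name: Create the LV with shared activation (lvcreate --activate sy)
  community.general.lvol:
    vg: vg_shared
    lv: lv_gfs2                    # hypothetical LV name
    size: 10g
    opts: --activate sy
```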
We would like to use the storage role for this purpose instead, to avoid bundling modules from community.general into linux-system-roles. The storage role currently does not provide a way to use these options.
The proposal is to add a new shared: (true|false) option for volumes to abstract this functionality in the storage role.

Step 2 is required for step 3 to work, but if step 2 cannot be implemented in the storage role, it should be sufficient for steps 1 and 3 to be supported separately so that the gfs2 role can run step 2 itself.
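Purely as an illustration of this proposal, one way the option might surface to users of the storage role (shared does not exist in the role yet; the pool, volume, and disk names are invented):

```yaml
storage_pools:
  - name: vg_shared                # hypothetical shared VG
    type: lvm
    disks: [sdb]
    volumes:
      - name: lv_gfs2
        size: 10g
        shared: true               # proposed option: create/activate the LV for shared use
```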