Closed: megakid closed this 6 months ago.
Hi @megakid! Thank you for your contribution of this much sought after feature! đŸ¥‡
It will take some time for the team to test and review this PR. If everything checks out we're going to have you sign the PR as per the DCO failure. https://github.com/hpe-storage/csi-driver/pull/388/checks?check_run_id=22447297357
The next release of the CSI driver is most likely 4-6 months out and we have our hands full right now with customer-facing features, so don't get discouraged by the PR sitting here for a while.
Yep no worries - worth noting that we've been running this fork utilising the topology features in our test clusters for about 5-6 months now - we've mounted over 100k clones in that time!
Ok, I've spent some time on this and have a couple of questions. I'm delighted to see your branch builds and passes both manual testing and the CSI e2e tests, so hopefully we can get this merged sooner rather than later.
this does not add full StatefulSet automatic provisioning across multiple HPE devices which is a bigger task
I'm curious if you have given any thought to how this could be implemented? I have my own vague ideas of a "topology aware secret" or a "meta CSP" that acts like a proxy. I'd love to hear your thoughts/opinions on how this would ideally be designed and implemented.
We also require the PR to pass the DCO and have the commit signed.
Just checking in here, did you get our request to sign the DCO and review the comment above? @megakid
Hi @datamattsson - thanks for reviewing, I think I've fixed up all the items you have flagged.
In terms of thoughts on fuller topology support:
I think adding a top level to the secret structure (e.g. 1 key per device) is probably the best way, adding a topology label selector to each object. The driver could read this single secret (as it does now) and the top level key names would indicate the name of the devices. This would mean keeping the CSP svc simple 1-to-1 with devices (as it is now).
Region/zone can be layered; ideally we can just use some YAML-defined match label expressions to define these in a flexible way, but I think there probably needs to be a hardcoded two- or three-tier set of labels (or at least ones that rarely change) given they need to be copied from Node -> CSINode by the driver.
You can then have a single StorageClass that refers to this secret and allows multi-device deployments such as StatefulSets (they would node anti-affinitize on csi.hpe.com/zone; unsure if this is assumed behaviour across k8s/other CSI drivers or needs to be specified in the StatefulSet spec). I suspect you might want to have a default device specified either in the StorageClass spec or as an extra property within each element of the Secret (as below).
Current
apiVersion: v1
kind: Secret
metadata:
  name: hpe-backend
  namespace: hpe-storage
stringData:
  serviceName: alletrastoragemp-csp-svc
  servicePort: "8080"
  backend: 10.10.0.20
  username: 3paradm
  password: 3pardata
Proposed
apiVersion: v1
kind: Secret
metadata:
  name: hpe-backend
  namespace: hpe-storage
stringData:
  dc1Device:
    serviceName: alletrastoragemp-csp-dc1-svc
    servicePort: "8080"
    backend: 10.10.0.20
    username: 3paradm
    password: 3pardata
    defaultDevice: true
    topologySelector:
      matchLabelExpressions:
        - key: csi.hpe.com/zone
          values:
            - dc1
  dc2Device:
    serviceName: alletrastoragemp-csp-dc2-svc
    servicePort: "8080"
    backend: 10.20.0.20
    username: 3paradm
    password: 3pardata
    topologySelector:
      matchLabelExpressions:
        - key: csi.hpe.com/zone
          values:
            - dc2
I think adding a top level to the secret structure (e.g. 1 key per device) is probably the best way, adding a topology label selector to each object. The driver could read this single secret (as it does now) and the top level key names would indicate the name of the devices. This would mean keeping the CSP svc simple 1-to-1 with devices (as it is now).
Thanks for your thoughts on this. I like it. Although, I think Secrets and ConfigMaps only allow flat string key/value pairs, so we'd need to parse the Secret differently in the driver.
apiVersion: v1
kind: Secret
metadata:
  name: hpe-backend
stringData:
  topology: |
    dc1Device:
      serviceName: alletrastoragemp-csp-dc1-svc
      servicePort: "8080"
      backend: 10.10.0.20
      username: 3paradm
      password: 3pardata
      defaultDevice: true
      topologySelector:
        matchLabelExpressions:
          - key: csi.hpe.com/zone
            values:
              - dc1
    dc2Device:
      serviceName: alletrastoragemp-csp-dc2-svc
      servicePort: "8080"
      backend: 10.20.0.20
      username: 3paradm
      password: 3pardata
      topologySelector:
        matchLabelExpressions:
          - key: csi.hpe.com/zone
            values:
              - dc2
volumeBindingMode: WaitForFirstConsumer will allow the CSI driver to filter out a suitable backend in the request, and somehow the "device" key needs to be tacked onto the PV so subsequent operations on the PV can be mapped back to the Secret.
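To make that mapping concrete, here is a hypothetical, trimmed sketch of how the selected device key could be recorded on a provisioned PV; the "device" attribute name, PV name and volume handle are assumptions, while the driver name and the dc1Device key come from the examples above:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-example                       # placeholder PV name
spec:
  csi:
    driver: csi.hpe.com
    volumeHandle: <backend-volume-id>     # placeholder, returned by the CSP
    volumeAttributes:
      device: dc1Device                   # hypothetical key pointing back at the Secret entry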
I've seen other vendors use CRs for their backend definitions and that would surely be a cleaner approach.
@dileepds and @sijeesh could you have a look?
These changes add basic topology support to the HPE CSI Driver.
They expose a new plugin capability, PluginCapability_Service_VOLUME_ACCESSIBILITY_CONSTRAINTS.
This allows you to attach allowedTopologies to a StorageClass and it will handle the rest, including ensuring that Kubernetes schedules your pods onto the correct nodes. The benefit here is you can avoid setting pod affinities and just rely on the StorageClass choice to schedule your pods correctly. We have a cross-datacenter cluster with access to two HPE devices, one per DC.
Note: this does not add full StatefulSet automatic provisioning across multiple HPE devices which is a bigger task.
I should add that to make this work, it also needs the Topology feature gate added to the csi-provisioner container args, e.g.
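For reference, a minimal sketch of that on the csi-provisioner sidecar in the controller Deployment; the container name and image version here are illustrative, the relevant part is the feature gate:

- name: csi-provisioner
  image: registry.k8s.io/sig-storage/csi-provisioner:v3.5.0   # illustrative version
  args:
    - "--csi-address=$(ADDRESS)"
    - "--v=5"
    - "--feature-gates=Topology=true"                         # enables topology-aware provisioning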
A storage class:
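The exact manifest from the PR isn't captured above, but a minimal sketch along these lines, assuming the standard CSI secret parameters and the zone key used earlier in the thread:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hpe-standard-dc1                                          # illustrative name
provisioner: csi.hpe.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: hpe-backend
  csi.storage.k8s.io/provisioner-secret-namespace: hpe-storage
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: csi.hpe.com/zone
        values:
          - dc1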
A node label:
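Illustrative as well; the node name is an assumption and the zone key comes from the discussion above (the same label can be applied with kubectl label node worker-dc1-01 csi.hpe.com/zone=dc1):

apiVersion: v1
kind: Node
metadata:
  name: worker-dc1-01                 # illustrative node name
  labels:
    csi.hpe.com/zone: dc1             # topology label the driver copies to the CSINode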
After this is set up, you get the topologyKeys set on the CSINode automatically:
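Roughly what that looks like, as trimmed output of kubectl get csinode -o yaml (the node name is an assumption):

apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: worker-dc1-01
spec:
  drivers:
    - name: csi.hpe.com
      nodeID: worker-dc1-01
      topologyKeys:
        - csi.hpe.com/zone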