olljanat / csi-plugins-for-docker-swarm

MIT License
46 stars 8 forks source link

Add DigitalOcean CSI #15

Closed olljanat closed 1 year ago

olljanat commented 1 year ago

Draft version of adding DigitalOcean CSI packaging example to here.

Currently fails on https://github.com/digitalocean/csi-digitalocean/blob/v4.5.0/driver/controller.go#L267-L269 is which is most probably bug in Swarmkit because based on CSI spec NodeID is mandatory parameter on PublishVolume requests https://github.com/container-storage-interface/spec/blob/release-1.7/csi.proto#L730-L732

Reason why Swarmkit does not handle this correctly most probably is that most of the CSI drivers takes NodeID as parameter like this https://github.com/olljanat/csi-plugins-for-docker-swarm/blob/f05fbab91c32a29dcf14f93d107501ab499af97a/csi-driver-smb/entrypoint.sh#L8 which is not the case in DigitalOcean implementation.

s4ke commented 1 year ago

Interestingly enough i had errors like this when packaging for hetzner as well that seem to have resolved themselves by restarting/reinstalling and "starting from scratch". Can you try that as well? Seems like there is some state in docker csi that is not cleaned up.

Afaik the nodes identify themselves via the identity service. These node ids should then be passed to this Controller just fine iirc.

olljanat commented 1 year ago

Good tip but does not look to be about that.

s4ke commented 1 year ago

Okay, I guess you have tested this already, I just wanted to give some insight. I was already hunting through the code to debug things and I got it working somehow.

Another thing that definitely has caused me issues is not having the plugin with the controller/node installed on all managers.

olljanat commented 1 year ago

I mean, I tested that immediately after your tip by removing /var/lib/docker and rebuilding plugin. Currently using one node only so that is not also the case.

Anyway as far I can see at least https://github.com/moby/swarmkit/blob/2ad26e5531a20bcbdc099bf3bdbabd01c1406e84/manager/csi/plugin.go#L200-L219 should have check that if nodeID is empty request should not even been send to CSI plugin but still need to figure out why that is the case.

s4ke commented 1 year ago

I agree with your idea of having swarmkit checking this. This will remove some guess work

olljanat commented 1 year ago

@s4ke btw. Does those plugins which with you have tested have PUBLISH_UNPUBLISH_VOLUME capability? Looks that at least those which I have used earlier don't have that so PublishVolume function does not actually do anything: https://github.com/moby/swarmkit/blob/2ad26e5531a20bcbdc099bf3bdbabd01c1406e84/manager/csi/plugin.go#L203-L206

DigitalOcean plugin have that capability and that logic in PublishVolume function is broken.

s4ke commented 1 year ago

The plugin I worked on has these capabilities:

controller: https://github.com/hetznercloud/csi-driver/blob/main/driver/controller.go#L305

node: https://github.com/hetznercloud/csi-driver/blob/main/driver/node.go#L159

does this help?