container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0
1.34k stars 373 forks source link

NodeUnpublishVolumeRequest - Why is there no "secrets" parameter, as there is in NodePublishVolumeRequest? #568

Closed koreno closed 2 months ago

koreno commented 2 months ago

https://github.com/container-storage-interface/spec/blob/2f340622302a8d26950aba0f73f240ea1132a67b/spec.md?plain=1#L2501

There's a suprising and difficult asymmetry between the NodePublish/NodeUnpublishVolumeRequest - The Publish RPC can get a secrets, which use to gain access to the CSI driver's backend, however the Unpublish does not. The Unpublish is where we want to send a request to the backend to delete the data. All of this is in the context of EphemeralVolumes, which means that there is no call to ControllerUnpublishVolume. Is there a canonical solution for this that we're missing?

Any help would be appreciated.

bswartz commented 2 months ago

Not including secrets for unpublish/unstage is intentional because the CO expects to be able to clean up its volume attachment state even if it loses access to the necessary secrets.

SPs that require the secrets in order to clean up should write them to the filesystem at publish/stage time so they can be retrieved at unpublish/unstage time. For security reasons, a tmpfs accessible only to root should be used.

koreno commented 2 months ago

is there a known tmpfs volume mounted into the CSI driver that should be used?

bswartz commented 2 months ago

It's possible to create a subdir under /tmp, which is usually a tmpfs. But node plugins have root so they can mount their own tmpfs in a place of their choosing. I've done it both ways.

vlboiko commented 2 months ago

@bswartz what about host machine restarts? any folder under /tmp doesn't survive restarts right?

bswartz commented 2 months ago

Correct, and when the host machine restarts there won't be any attached volumes or mounts to worry about. The node-plugin just needs to be able to verify this fact by noticing all its mounts are gone, and return successfully to NodeUnpublishVolume and NodeUnstageVolume.

vlboiko commented 2 months ago

@bswartz Yes. But our driver still need to delete orphaned resources on nfs server side. That is why we need to pass secret from NodePublishVolume to NodeUnpublishVolume. Despite of mount exist or not we need to ensure we deleted all remote resources

bswartz commented 2 months ago

I would suggest that the node should not be responsible for cleaning up anything on the server side. It's called the node plugin for a reason. If you have resources that must be cleaned up on the server I strongly suggest doing that work on the controller side. ControllerUnpublishVolume is the right place to clean up remote resources.

vlboiko commented 2 months ago

I'm working on Ephemeral Volumes. Unfortunately, EV doesn't trigger any of the controller's endpoints.

bswartz commented 2 months ago

Okay I see why that's a problem. Sounds like you maybe need to redesign the plugin. If it's truly essential to maintain state on the NFS server, perhaps lazy garbage collection is the way to go.