container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0

Runtime Assisted Mount and Management enhancements #526

Open ddebroy opened 2 years ago

ddebroy commented 2 years ago

CSI spec enhancements to support deferral of file system mount and management operations from CSI plugins to container runtime handlers (detailed in https://github.com/kubernetes/enhancements/pull/2893).

This does not specify how CSI plugins defer file system operations to a container runtime handler. Please see https://github.com/kubernetes/enhancements/pull/2893/files#diff-d29b488a06a8e4aa819829e0634c0affc40f4c4a35fa22738a7d02039ca50128R1281 for an outline of a common protocol that will allow CSI plugins and container runtime handlers to coordinate mount and management operations for file systems.

The main use case right now is microVM-based runtimes like Kata, which benefit (from a performance and security perspective) from mounting the file systems on volumes (backed by block devices) themselves, rather than having the mount managed on the host and projected into the sandbox environment (using 9p or virtio-fs).

Existing CSI plugins in Kubernetes environments are achieving the above by performing API server lookups of the pod spec and associated runtime class (e.g. https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/ddef5e99b24da6e07a21184803bd37c16ade5686/pkg/disk/nodeserver.go#L241). This enhancement will allow the relevant information to be passed to the plugin without requiring the API server lookup of pod details.
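
As a rough illustration of the plugin-side decision once the runtime information arrives with the request instead of via an API server lookup, here is a minimal Go sketch. The context key `example.com/runtime-handler`, the `deferringHandlers` table, and `shouldDeferMount` are hypothetical names invented for this example; none of them come from the CSI spec or the KEP.

```go
package main

import "fmt"

// runtimeHandlerKey is a hypothetical key under which the CO would pass the
// runtime handler name to the plugin. It is not defined by the CSI spec.
const runtimeHandlerKey = "example.com/runtime-handler"

// deferringHandlers lists runtime handlers that prefer to mount the file
// system inside the sandbox themselves (assumption for illustration).
var deferringHandlers = map[string]bool{"kata": true}

// shouldDeferMount reports whether the plugin should leave the file system
// mount to the runtime handler, based only on data passed in the request.
func shouldDeferMount(requestContext map[string]string) bool {
	return deferringHandlers[requestContext[runtimeHandlerKey]]
}

func main() {
	ctx := map[string]string{runtimeHandlerKey: "kata"}
	fmt.Println("defer mount:", shouldDeferMount(ctx))
}
```

The point of the sketch is only that the decision needs no pod or runtime class lookup; how the handler name would actually be conveyed is what the enhancement has to define.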

saad-ali commented 1 year ago

@jdef @bswartz Can you help review this?

ddebroy commented 1 year ago

@bswartz the overall flow for this will involve:

  1. For NodePublish, SPs calling APIs on a common proxy (CRuSt proxy) to deposit metadata files about the file system operations for the volume in a known location on the host (that runtime handlers can process)
  2. For NodeGetVolumeStats and NodeExpandVolume, SPs calling APIs on the common proxy (CRuSt proxy) to retrieve the data from a runtime handler.

Does the common proxy model work? If so, should I clarify the details about that in the CSI spec? The specifics of the proxy and how CSI plugins and runtimes will interact with it are described in the KEP here. Happy to incorporate a summary of it here if that is acceptable.

bswartz commented 1 year ago

> @bswartz the overall flow for this will involve:
>
> 1. For NodePublish, SPs calling APIs on a common proxy ([CRuSt](https://github.com/kubernetes/enhancements/pull/2893/files#diff-d29b488a06a8e4aa819829e0634c0affc40f4c4a35fa22738a7d02039ca50128R1081) proxy) to deposit metadata files about the file system operations for the volume in a known location on the host (that runtime handlers can process)
> 2. For NodeGetVolumeStats and NodeExpandVolume, SPs calling APIs on the common proxy (CRuSt proxy) to retrieve the data from a runtime handler.
>
> Does the common proxy model work? If so, should I clarify the details about that in the CSI spec? The specifics about the proxy and how CSI plugins and runtimes will interact with it is described in the KEP here. Happy to incorporate a summary of it here if it is acceptable.

Is it not possible to limit the communication that the SP has to do to just the CSI RPCs? I'm concerned that by mandating reliance on another standard, this feature ends up being limiting rather than empowering. For example, rather than the SP talking to CRuSt, couldn't the SP return the relevant info to the CO and then the CO could pass it to CRuSt? Such a model would enable other COs to do smarter things and possibly evolve beyond CRuSt. If we tie this feature to CRuSt and CRuSt stalls at some point in the future, then CSI is tied to a dead or dying thing. I'm thinking 5-10 years into the future here.

I realize that it might add to the complexity of the CSI RPCs to have special flavors of the relevant functions for the deferred-FS calls, but extra RPCs are less of a liability than mandating usage of another standard. I've only started reading the KEP, but this is my main concern, and I'm hoping there's a way to structure the layers so that CSI remains self-contained and the new interactions are pushed up into the CO (kubelet in this case).
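
A minimal Go sketch of the layering suggested above, where the SP only returns information through the RPC response and the CO decides how to hand it to the runtime. Every type and function here (`DeferredMount`, `NodePublishResult`, `RuntimeHandler`, `publish`, `recordingHandler`) is hypothetical and invented for illustration; the real message shapes would be whatever the spec change ends up defining.

```go
package main

import (
	"errors"
	"fmt"
)

// DeferredMount is a hypothetical payload an SP could return instead of
// mounting the file system itself. It is not part of the CSI spec.
type DeferredMount struct {
	VolumeID   string
	DevicePath string
	FSType     string
	MountFlags []string
}

// NodePublishResult is a hypothetical response shape. Deferred is nil when
// the SP performed the mount itself.
type NodePublishResult struct {
	Deferred *DeferredMount
}

// RuntimeHandler stands in for whatever the CO uses to reach the runtime:
// CRuSt in the KEP, or something else in another CO.
type RuntimeHandler interface {
	Mount(m DeferredMount, targetPath string) error
}

// publish is CO-side logic: the SP never talks to the runtime directly, it
// only returns data, and the CO forwards that data when a mount was deferred.
func publish(res NodePublishResult, h RuntimeHandler, targetPath string) error {
	if res.Deferred == nil {
		return nil // the SP already mounted the volume
	}
	if h == nil {
		return errors.New("SP deferred the mount but no runtime handler is configured")
	}
	return h.Mount(*res.Deferred, targetPath)
}

// recordingHandler is a fake handler that records the mounts it was asked for.
type recordingHandler struct{ calls []string }

func (r *recordingHandler) Mount(m DeferredMount, targetPath string) error {
	r.calls = append(r.calls, fmt.Sprintf("%s:%s->%s", m.FSType, m.DevicePath, targetPath))
	return nil
}

func main() {
	h := &recordingHandler{}
	res := NodePublishResult{Deferred: &DeferredMount{
		VolumeID: "vol-1", DevicePath: "/dev/sdb", FSType: "ext4",
	}}
	if err := publish(res, h, "/pods/p1/volumes/vol-1"); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Println(h.calls)
}
```

Because `RuntimeHandler` lives on the CO side, swapping CRuSt for another mechanism would not touch the SP or the CSI messages in this sketch.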

ddebroy commented 1 year ago

> rather than the SP talking to CRuSt, couldn't the SP return the relevant info to the CO and then the CO could pass it to CRuSt? Such a model would enable other COs to do smarter things and possibly evolve beyond CRuSt.

This is a great suggestion and makes complete sense! Thanks, @bswartz. Let me explore that and adapt this PR (and the KEP) to the suggested flow.

ameade commented 1 year ago

I almost had the same question as bswartz, but it looks like y'all are thinking about it. Instead of the SP "deferring" the FS operations by calling a proxy that performs them, perhaps the SP could simply "skip" the FS operations and the CO could then do them, perhaps by way of CRuSt. That might work, since I assume the FS operations would be the last step in any of these operations anyway (NodeExpandVolume, NodeStageVolume, etc.).
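
To make the "skip" idea concrete from the SP side, here is a small Go sketch. The `step` type, `runSteps`, and the step names are hypothetical and only illustrate the assumption that FS operations come last and can be left out when the CO asks for that.

```go
package main

import "fmt"

// step is one unit of work inside a node RPC (hypothetical model).
type step struct {
	name string
	fs   bool // true for file system operations such as format, mount, resize
	run  func() error
}

// runSteps executes the steps in order. When skipFS is set, file system steps
// are not run and their names are returned so the CO knows what is left to do.
func runSteps(steps []step, skipFS bool) (skipped []string, err error) {
	for _, s := range steps {
		if skipFS && s.fs {
			skipped = append(skipped, s.name)
			continue
		}
		if err := s.run(); err != nil {
			return skipped, fmt.Errorf("%s: %w", s.name, err)
		}
	}
	return skipped, nil
}

func main() {
	noop := func() error { return nil }
	steps := []step{
		{name: "attach-device", run: noop},
		{name: "mkfs", fs: true, run: noop},
		{name: "mount", fs: true, run: noop},
	}
	skipped, err := runSteps(steps, true)
	fmt.Println("left for the CO:", skipped, "error:", err)
}
```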

xing-yang commented 1 year ago

This file is missing: lib/go/csi/csi.pb.go. It should be generated automatically.