container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0

Secrets field in ListSnapshotsRequest #370

Closed hakanmemisoglu closed 4 years ago

hakanmemisoglu commented 5 years ago

Right now, the CSI specification defines a secrets field in CreateSnapshotRequest to pass credentials around.

However, ListSnapshotsRequest does not have this field, yet listing snapshots might require credentials with read access to work properly.

I think we need an optional secrets field in ListSnapshotsRequest.
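For illustration, a minimal Go sketch of the driver side of this proposal, written against bindings that include such a field; the error handling and the expectation that secrets are non-empty are assumptions, not anything the spec mandates:

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// listSnapshots sketches the driver side of the proposal: read credentials
// arrive with the request itself, mirroring CreateSnapshotRequest.Secrets.
func listSnapshots(ctx context.Context, req *csi.ListSnapshotsRequest) (*csi.ListSnapshotsResponse, error) {
	secrets := req.GetSecrets() // the proposed field (it later merged in CSI v1.2+)
	if len(secrets) == 0 {
		// A backend that gates even read access needs these to proceed.
		return nil, status.Error(codes.InvalidArgument, "secrets required to list snapshots")
	}
	// ... authenticate against the backend with `secrets`, then enumerate
	// snapshots, filtered by req.GetSourceVolumeId() / req.GetSnapshotId().
	return &csi.ListSnapshotsResponse{}, nil
}
```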

j-griffith commented 5 years ago

Do you have an example use case of why list snapshots would need/want secrets? We have the parameter for create volume and snapshot as it's pretty common to need special credentials to create resources on a plugin. I'm not too familiar with restrictive list operations though. Would it be possible to use a reference in the plugin from object creation?

If this is something that's needed it has to be considered for ListVolumesRequest as well.

travisghansen commented 5 years ago

I'm currently planning/designing a csi driver for FreeNAS (and likely various other zfs based storage systems). In an effort to remain as stateless as possible (hopefully totally stateless) I think secrets should be passed not only to this method but any other controller methods that don't currently receive them.

I need to know which appliance I'm dealing with and how to communicate with it (ie: creds).

GetCapacityRequest appears to be missing secrets. ListVolumesRequest appears to be missing secrets.

ControllerGetCapabilitiesRequest is probably the only exception.
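To make the stateless idea concrete, a rough Go sketch of the kind of per-request client construction I have in mind; the secret key names (apiEndpoint, username, password) are just my own convention, not anything standardized:

```go
package driver

import (
	"fmt"
	"net/http"
	"time"
)

// applianceClient is a hypothetical wrapper around a FreeNAS-style REST API.
type applianceClient struct {
	baseURL  string
	username string
	password string
	http     *http.Client
}

// clientFromSecrets builds a fresh API client from per-request secrets, so
// the driver itself stores no appliance details at all.
func clientFromSecrets(secrets map[string]string) (*applianceClient, error) {
	for _, k := range []string{"apiEndpoint", "username", "password"} {
		if secrets[k] == "" {
			return nil, fmt.Errorf("secret key %q is required", k)
		}
	}
	return &applianceClient{
		baseURL:  secrets["apiEndpoint"],
		username: secrets["username"],
		password: secrets["password"],
		http:     &http.Client{Timeout: 30 * time.Second},
	}, nil
}
```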

hakanmemisoglu commented 5 years ago

The Ceph CSI driver is another example. It doesn't support ListSnapshots because there is no easy way to pass cluster credentials to the method since they switched to a stateless implementation.

xing-yang commented 5 years ago

@hakanmemisoglu Do you also need secrets to support ListVolumes for the Ceph CSI driver? If secret is needed to list snapshots, it should be needed to list volumes as well.

xing-yang commented 5 years ago

I thought only write operations need secrets. List snapshots/volumes are read operations. I didn't expect that they need secrets as well.

hakanmemisoglu commented 5 years ago

@xing-yang, yes you are right. We can include ListVolumes too.

I think all of the read operations might need read credentials, depending on the SP. That includes ListVolumes, as you noted, and maybe others, like GetCapacity, as @travisghansen commented.

shay-berman commented 5 years ago

Yes, IBM storage also requires a secret to list snapshots, like every CSI controller operation that communicates with the storage.

j-griffith commented 5 years ago

> I'm currently planning/designing a csi driver for FreeNAS (and likely various other zfs based storage systems). In an effort to remain as stateless as possible (hopefully totally stateless) I think secrets should be passed not only to this method but any other controller methods that don't currently receive them.
>
> I need to know which appliance I'm dealing with and how to communicate with it (ie: creds).
>
> GetCapacityRequest appears to be missing secrets. ListVolumesRequest appears to be missing secrets.
>
> ControllerGetCapabilitiesRequest is probably the only exception.

So your proposal is that a secrets parameter be added to every CSI request? I see what you're getting at now: storage/cluster creds. I think a number of folks are doing that now by providing credentials at plugin startup, then relying on namespaces to filter access. Anyway, I see what you're aiming at now, I think.

travisghansen commented 5 years ago

@j-griffith I'm proposing they be added to all requests for the Controller service. Other use cases may depend on them for the Node services, but I'm guessing that's less of an issue for most people.

I think you're onto the design idea, yes. I don't really want to deploy the creds with the driver at all, as that requires me to know in advance what appliances may be used, etc. In Kubernetes parlance, I prefer to store all those details in the StorageClass; each class includes the API endpoint/IP address(es), username, password, SSH key, etc. that may be required to integrate the given appliance.

Ideally the appliances are fully dynamic from the perspective of the csi driver and do not require re-deploying the driver to support new appliances. The driver simply becomes a conduit for various/appropriate calls but is provided all the details from 'external' configuration.

I do this currently with the crude non-CSI provisioner I built over a weekend a year ago (be gentle if you review the code, as I also taught myself Go over the same weekend). You may be able to get the general idea by reviewing 2 files:

Each class references a given secret, and each secret holds specific server details. To reiterate, no appliance-specific configuration is deployed with the provisioner itself. When the various requests reach the provisioner, it simply parses the secret name from the class (which is part of the various method inputs), uses the Kubernetes API to retrieve the secret, and then builds an HTTP client with those details for that specific request. In the CSI world, all of the class parameters and secrets will need to be combined into CSI secrets to support the same general idea, since there is no notion of storage classes, but the approach should be feasible (if secrets are always passed along).
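Roughly, that lookup looks like this (a sketch using client-go; the parameter names secretName/secretNamespace are my own convention, not part of anything standard):

```go
package provisioner

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// secretForClass retrieves the secret named by the storage class parameters
// and flattens it into a string map for building a per-request client.
func secretForClass(ctx context.Context, cs kubernetes.Interface, params map[string]string) (map[string]string, error) {
	sec, err := cs.CoreV1().Secrets(params["secretNamespace"]).Get(ctx, params["secretName"], metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	out := make(map[string]string, len(sec.Data))
	for k, v := range sec.Data {
		out[k] = string(v)
	}
	return out, nil
}
```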

Happy to provide more details where helpful.

j-griffith commented 5 years ago

This makes sense. I'm happy to propose it, or if you'd like to, that's fine as well. Go ahead and assign it to me if you want, or to yourself, etc.

travisghansen commented 5 years ago

@j-griffith I'm happy to do whatever... not sure what you mean by propose in this context though. Open a new issue?

ShyamsundarR commented 5 years ago

Ceph CSI, as noted, requires secrets to read and list the volumes provisioned.

Further, as discussed by @travisghansen, it also needs the CreateVolume parameters (in Kubernetes parlance, the StorageClass parameters). These are needed to identify which storage endpoint should be listed, and to pass and use the appropriate secrets belonging to that endpoint. This matters because a single Ceph CSI plugin instance can serve storage from different Ceph pools or filesystems, or even different Ceph clusters, based on the CreateVolume parameters.

IMHO, using the secrets to carry the same information as the CreateVolume parameters is a workaround, not a solution to the problem. If the parameters were passed to the respective calls along with the secrets, it would be a cleaner interface. Although, hearsay has it that when the CO invokes ListVolumes/ListSnapshots it may not have a copy of the parameters to pass.

Alternatives that were considered and dropped look similar to what others have posted.

Both were found inadequate: resource intensiveness (more controller and node plugins in a cluster), the possibility of sharing multiple endpoints' secrets within the same namespace, and/or having to use coarse-grained secrets (ones with wider ACLs, so to speak) for the provisioning needs.
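A small Go sketch of the parameters-versus-secrets split being described (key names like clusterID/pool/userID/userKey are illustrative only):

```go
package ceph

import "fmt"

// cephTarget captures the split: parameters (StorageClass-derived) say
// *which* cluster/pool to talk to, secrets say *how* to authenticate.
type cephTarget struct {
	ClusterID string
	Pool      string
	UserID    string
	UserKey   string
}

// targetFor combines both inputs; with secrets alone the plugin cannot
// determine where to list from, which is the limitation described above.
func targetFor(params, secrets map[string]string) (cephTarget, error) {
	t := cephTarget{
		ClusterID: params["clusterID"], // where to list from
		Pool:      params["pool"],
		UserID:    secrets["userID"], // credentials only
		UserKey:   secrets["userKey"],
	}
	if t.ClusterID == "" {
		return t, fmt.Errorf("parameters are required to select a cluster")
	}
	return t, nil
}
```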

travisghansen commented 5 years ago

@ShyamsundarR I think my understanding is the same as yours...especially if you're referring to the volume_context.

It appears our storage designs/moving parts are very similar (different pools, clusters, etc, etc) and as a result the design goals of the respective drivers seem similar as well.

Arbitrary volume metadata NOT being stored by the CO and passed around to DeleteVolume etc. is a giant PITA with CSI for the type of architecture/design we're coming from (where a bunch of details of the storage system(s) are not implicit, as they are in the various clouds).

This is especially true if you want the driver to remain stateless, less so if not. However, user expectations are generally that the driver(s) will be deployed in the/a cluster, which results in a classic chicken/egg scenario:

Driver: hey cluster, I need storage so I can be stateful
Cluster: hey driver, I'll let you know when I've got a driver that can provide storage

This has been discussed previously but if it could be revisited that would be fantastic.

For my particular purposes I came up with a harebrained idea to store metadata like this by setting ZFS native properties on the applicable datasets/volumes. This is all theory at this point, and for some operations it's incredibly inefficient, since I'll have to scan all child datasets in some scenarios and then 'query' the results in memory. It's not elegant, but it keeps the driver stateless.
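A sketch of that idea, shelling out to the zfs CLI (ZFS user properties need a namespaced name containing a colon; the property name here is made up):

```go
package zfsmeta

import (
	"os/exec"
	"strings"
)

// setMeta stores driver metadata as a ZFS user property on a dataset,
// e.g. setMeta("tank/vols/pvc-123", "mydriver:volume_context", "...").
func setMeta(dataset, prop, value string) error {
	return exec.Command("zfs", "set", prop+"="+value, dataset).Run()
}

// getMeta reads the property back; scanning many child datasets this way
// is the inefficiency mentioned above, but no state lives in the driver.
func getMeta(dataset, prop string) (string, error) {
	out, err := exec.Command("zfs", "get", "-H", "-o", "value", prop, dataset).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}
```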

j-griffith commented 5 years ago

@ShyamsundarR

> These are needed to identify which storage endpoint should be listed, and to pass and use the appropriate secrets belonging to that endpoint. This matters because a single Ceph CSI plugin instance can serve storage from different Ceph pools or filesystems, or even different Ceph clusters, based on the CreateVolume parameters.

I'm not sure I agree that it's up to the CSI spec or the CO to know how to deal with aggregate plugins. If you choose to aggregate multiple endpoints/devices behind your plugin, the details of how that works are up to you, IMHO. Storage class constructs seem to work, but that's outside the scope of the CSI spec here, IMHO.

> IMHO, using the secrets to carry the same information as the CreateVolume parameters is a workaround, not a solution to the problem.

To be fair, the problem statement in this Github issue was precisely a request to add acceptance of the secrets parameter to the list request calls in the spec.

> If the parameters were passed to the respective calls along with the secrets, it would be a cleaner interface. Although, hearsay has it that when the CO invokes ListVolumes/ListSnapshots it may not have a copy of the parameters to pass.

So what you'd like to see is the credentials used at create time stored by the CO, with the CO responsible for including those credentials in any request operation involving that object (volume or snapshot). I don't necessarily know that this is any different in terms of the interface; you're still passing this information in for the other calls. In other words, it would have to be added to the spec and the interface regardless, no?

It seems to me that this is more about catering to a stateless solution for plugins that aggregate multiple storage devices or endpoints, so I guess it needs a good deal more discussion. It seems storage classes aren't doing what you want here.

j-griffith commented 5 years ago

> @j-griffith I'm happy to do whatever... not sure what you mean by propose in this context though. Open a new issue?

I meant propose a change to the spec via PR, but with the further context it sounds like this isn't what you really want anyway. Instead it seems that you want the CO to store creds from the create requests and pass them through to subsequent calls to your plugin. That's still going to require something passed to the plugin, although there are other ways to do that besides the API interface (not saying that's the right or wrong answer here).

travisghansen commented 5 years ago

@j-griffith I'd agree that generally it needs a good deal more discussion. Regarding the purpose of the CSI spec, I'd hope its goal is to facilitate easy-to-use storage for end users across COs. It seems clear to me there are some gaps here that are painful for folks who aren't dealing with cloud storage solutions (ie: AWS, GCP, Azure, etc) where many of the details are unified and implicit to the driver.

I could be off here, but I don't think anyone would think it's a good idea to store credentials on a per-volume or per-snapshot basis. I suspect part of the fundamental issue is that the spec's use of secrets is too general (as opposed to controller_secrets, volume_secrets, etc). I'm not sure what others would say, but for me the idea would be (a rough sketch of these shapes follows the list):

  1. the notion of controller_secrets is created in the spec, which would essentially tie to what Kubernetes considers a StorageClass (ie: each storage class, or CO equivalent, would/could have its own set of secrets)
  2. every call to either the Controller or Node services includes controller_secrets (with perhaps a very few exceptions)
  3. in addition, all (volume-related) endpoints of the Node service send volume_secrets
  4. COs are required to store volume/snapshot metadata (foo_context) as sent back by the driver in the respective create calls, and the respective post-create calls always send that data along. Actions on 'sub' resources would also include 'parent' metadata; for example, creating a snapshot would include volume_context, and deleting a snapshot would include both volume_context and snapshot_context (in addition to the various foo_secrets)... if that makes any sense
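Purely hypothetical Go shapes for item 4; none of these fields exist in the CSI spec, they only illustrate what a DeleteSnapshot call could carry under this idea:

```go
package sketch

// DeleteSnapshotArgs is a hypothetical request shape: the CO stores the
// per-object context returned at create time and replays it (plus the
// scoped secrets) on every later call touching that object.
type DeleteSnapshotArgs struct {
	SnapshotID        string
	ControllerSecrets map[string]string // per-class credentials (items 1/2)
	VolumeSecrets     map[string]string // per-volume credentials (item 3)
	VolumeContext     map[string]string // parent volume metadata (item 4)
	SnapshotContext   map[string]string // snapshot metadata (item 4)
}
```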

We've got IBM, Ceph, OpenStack/Cinder, lowly me, and probably others via other issues, all seemingly trying to accomplish the same goal of stateless operation without the implicit context afforded to some of the 'cloud' drivers. Some may even be dealing with blackbox solutions, further complicating the matter.

I, for example, have scratched my head a few times at the NetApp Trident install process and management/configuration approach. After diving into the CSI spec I have a better idea why they designed some of it the way they did. They may have had other reasons, but I suspect much (if not all) of it is driven exactly by the challenges being expressed in this thread. The chicken/egg dance I mentioned earlier... they do that... and it's (far) less than ideal as an end user, for multiple reasons.

My voice isn't worth much as I'm sure people far smarter than I are dealing with this stuff so take the above with a grain of salt. I'm just an over-zealous user/programmer :) Having said that, Colorado is a short flight for me, I'd be happy to hop on a plane and whiteboard/brainstorm some of this stuff if it'd be helpful.

j-griffith commented 5 years ago

@travisghansen I think there might be some confusion here after the earlier posting. I think I understand the use case you're looking for here and I think it still aligns with the initial statement in the issue about suggesting the addition of a parameter (secrets map) in the remaining requests.

In addition, the details that you and @ShyamsundarR have pointed out would also need to be implemented/proposed in the CO (in this case Kubernetes). The CSI spec just defines what information it accepts; there's a need for an entire discussion around the implementation needed by Kubernetes. The point being, however: if you want the CO to manage and issue these credentials on controller calls (and/or node calls), that seems perfectly legit. So step one, I think, is you need something in the CSI interface to express that. Figuring out the implementation and changes in the CO (Kubernetes) is another step.

By the way, YOUR VOICE is worth a lot! I think what you're proposing is perfectly sane. This is an open source community, the more participation and input and the more storage devices that work well with the CSI spec the better! To be clear, I just saw the initial GH issue and thought "hey, that seems like a good point maybe we should look at updating the spec to enable that".

travisghansen commented 5 years ago

@j-griffith I was approaching this issue as a "I'll take what I can get" situation :) If the door really is open to re-thinking some of this stuff more deeply then let's go for it!

saad-ali commented 5 years ago

We discussed this at the CSI community meeting. Initially I was opposed to the proposal, because in a traditional storage system any credential required to auth with the backend should be populated into the driver out of band from the CSI spec (e.g. for Kubernetes, the CSI driver pod has secrets injected at deployment time). However, someone pointed out that some storage systems (like Ceph) may have multiple "storage pools" where each storage pool has different credentials. In that case, operations like ListSnapshots would require credentials (for the storage pool). Which makes sense to me. @hakanmemisoglu will implement this.

shay-berman commented 5 years ago

I would like to raise a suggestion related to this secrets-keys discussion.

The best practice for a CSI driver is to get the secrets from the storage class by setting the relevant secret keys, like csi.storage.k8s.io/provisioner-secret-name: fast-storage-provision-key and csi.storage.k8s.io/controller-publish-secret-name, plus the snapshot secrets and the other keys related to the controller side of things.

There are CSI drivers that require a single secret for all the controller APIs, because the CSI driver needs the same credential to communicate with the backend storage in order to create/delete/publish/unpublish/list/snapshot...

What do you think about having a new key that sets a secret for all CSI controller APIs?

Thoughts?

ShyamsundarR commented 5 years ago

In the case of Ceph and the way its CSI plugins are structured, we would need the parameters (passed in during CreateVolume requests) in addition to the secrets to determine where to list volumes from.

The secrets just carry the credentials for use with a cluster, whereas the parameters define which cluster (and pool, or other such details).

One of the arguments in the past has been to pass the cluster/pool and related details as part of the secret to avoid this limitation (across other CSI requests as well). Given what secrets are defined for, stashing cluster details within them is not the right way to pass these details to the plugin. IOW, which cluster or pool to use is not sensitive information that belongs in a secret; doing so is a workaround to accommodate requests that pass only secrets as arguments.

In Ceph CSI we have chosen not to obfuscate this distinction just to support such (secrets-only-in-arguments) RPC requests. I believe this may prove true for other storage providers as well that do not already have a provisioner that CSI is merely fronting (i.e. where CSI is just a pass-through to an existing multi-cluster/multi-user provisioner, like a cloud provider API/endpoint).

In the case of Kubernetes, for example, the parameters come (predominantly) from the StorageClass, and the required cluster/pool/fs details are specified there.

Requirement 1: to effectively use the secrets passed into the List functions, we would also need the parameters added to those calls.

This brings us to ListSnapshots, whose corresponding CreateSnapshot request is driven by a source_volume_id. In that case the snapshot resides in the same cluster as the source volume, and its location details are derived from the source_volume_id during the create request. When a list request with secrets is made, and since source_volume_id is optional in the ListSnapshots request, the Ceph CSI drivers are again clueless about where to list snapshots from, given just the secrets.

Soft requirement 2: there is no hard requirement here yet, as we can choose to return an error when these optional parameters (namely, either of source_volume_id or snapshot_id) are not present in the ListSnapshots request, but it would be nice to consider calling out capabilities that let the CO know such requests are not supported.
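To illustrate the fallback in soft requirement 2, a hedged Go sketch of a handler that refuses a list it cannot scope (the choice of error code here is illustrative, not something the spec prescribes):

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// listSnapshotsScoped rejects requests it cannot scope: without a
// snapshot_id or source_volume_id there is no way to pick a cluster from
// secrets alone, so the driver refuses rather than guessing.
func listSnapshotsScoped(ctx context.Context, req *csi.ListSnapshotsRequest) (*csi.ListSnapshotsResponse, error) {
	if req.GetSnapshotId() == "" && req.GetSourceVolumeId() == "" {
		return nil, status.Error(codes.InvalidArgument, "a snapshot_id or source_volume_id is required to locate the backing cluster")
	}
	// ... derive cluster/pool from the ID, combine with the request secrets,
	// and enumerate snapshots from that one backend.
	return &csi.ListSnapshotsResponse{}, nil
}
```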

humblec commented 5 years ago

> I would like to raise a suggestion related to this secrets-keys discussion.
>
> The best practice for a CSI driver is to get the secrets from the storage class by setting the relevant secret keys, like csi.storage.k8s.io/provisioner-secret-name: fast-storage-provision-key and csi.storage.k8s.io/controller-publish-secret-name, plus the snapshot secrets and the other keys related to the controller side of things.
>
> There are CSI drivers that require a single secret for all the controller APIs, because the CSI driver needs the same credential to communicate with the backend storage in order to create/delete/publish/unpublish/list/snapshot...
>
> What do you think about having a new key that sets a secret for all CSI controller APIs?
>
> Thoughts?

@shay-berman @saad-ali @j-griffith I actually like this proposal. Maybe we can improve or enhance this thought and get to a better design. Having the credentials come from the SC, and also specifying them at a group level (ex: controller or snapshot, etc.), could help us remove the complexity of passing the credentials field down in different RPC calls separately, or of validating how many RPCs in a specific operation need credentials, etc. At present, in most cases, we end up debating whether a specific RPC call really needs the credential, or chasing the use cases from different storage vendors, etc. If we make this more flexible, as above, maybe we can sort out the credential issues to an extent in each or some RPC calls. It is also possible to have keys specific to an operation in the higher-level secret. For example, if the controller-specific actions have a secret, the SP can instruct users to carve out that secret with snapshot and createvolume sections/keys, and read the data accordingly (if required).

Just some random thoughts, please feel free to point out if I missed some obvious thing.

lpabon commented 5 years ago

If I can throw in this idea again: https://github.com/container-storage-interface/spec/issues/354#issuecomment-470178817

If gRPC metadata were used instead of a field in each object, then all objects (current and future) would get the ability to carry secrets.
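For example, a hedged Go sketch of a server-side helper reading secrets from incoming gRPC metadata; the "csi.secret." key prefix is invented purely for illustration:

```go
package driver

import (
	"context"
	"strings"

	"google.golang.org/grpc/metadata"
)

// secretsFromMetadata extracts credentials from gRPC metadata rather than
// from a field in each request message, so every RPC (current and future)
// can carry them without further spec changes.
func secretsFromMetadata(ctx context.Context) map[string]string {
	secrets := map[string]string{}
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return secrets
	}
	for key, vals := range md {
		if strings.HasPrefix(key, "csi.secret.") && len(vals) > 0 {
			secrets[strings.TrimPrefix(key, "csi.secret.")] = vals[0]
		}
	}
	return secrets
}
```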

bells17 commented 4 years ago

Should this issue be moved to closed status? https://github.com/container-storage-interface/spec/pull/372 was already merged.
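For reference, against Go bindings generated after that PR (CSI v1.2+, assuming the field landed as proposed), a CO-side caller can now populate the field directly; a minimal sketch:

```go
package co

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
)

// listWithSecrets issues ListSnapshots with per-request credentials, which
// the merged change makes possible without out-of-band secret injection.
func listWithSecrets(ctx context.Context, conn *grpc.ClientConn, secrets map[string]string) (*csi.ListSnapshotsResponse, error) {
	client := csi.NewControllerClient(conn)
	return client.ListSnapshots(ctx, &csi.ListSnapshotsRequest{
		MaxEntries: 100,
		Secrets:    secrets, // added by the PR referenced above
	})
}
```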

xing-yang commented 4 years ago

Yes, this issue should be closed now.