container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0
1.32k stars 371 forks source link

spec: ListVolumesRequest should enable getting single/filtered volumes #54

Open akutz opened 7 years ago

akutz commented 7 years ago

I just realized the spec doesn't enable the retrieval of a single volume. To me this is a huge performance problem. The client shouldn't always be forced to retrieve an entire dataset for only one or more desired volumes.

I propose updating the ListVolumesRequest message like so:

message ListVolumesRequest {
  // The API version assumed by the CO. This field is REQUIRED.
  Version version = 1;

  // If specified, the Plugin MUST NOT return more entries than this
  // number in the response. If the actual number of entries is more
  // than this number, the Plugin MUST set `next_token` in the response
  // which can be used to get the next page of entries in the subsequent
  // `ListVolumes` call. This field is OPTIONAL. If not specified, it
  // means there is no restriction on the number of entries that can be
  // returned.
  uint32 max_entries = 2;

  // A token to specify where to start paginating. Set this field to 
  // `next_token` returned by a previous `ListVolumes` call to get the
  // next page of entries.
  string starting_token = 3; 

  // If specified, the Plugin MUST return ONLY the requested volumes.
  // This field is optional.
  repeated VolumeID volume_ids = 4;
}

This update would enable COs to request one or more specific volumes and enable the filtering to occur on the remote side, reducing the size of the transported data and compute resources necessary to filter the list on the client side.

jdef commented 7 years ago

What's the use case for retrieving a subset of vols based on volume ID (which is opaque to a CO)?

If a CO knows the vol ID already, shouldn't it have the rest of the information associated with the volume?

Is the assumption that the CO only has partial knowledge of a particular subset of volumes and wants to ensure that it has a refreshed copy of volume info before making decisions?

On Tue, Jun 27, 2017 at 10:38 AM, Schley Andrew Kutz < notifications@github.com> wrote:

I just realized the spec doesn't enable the retrieval of a single volume. To me this is a huge performance problem. The client shouldn't always be forced to retrieve an entire dataset for only one or more desired volumes.

I propose updating the ListVolumesRequest message like so:

message ListVolumesRequest { // The API version assumed by the CO. This field is REQUIRED. Version version = 1;

// If specified, the Plugin MUST NOT return more entries than this // number in the response. If the actual number of entries is more // than this number, the Plugin MUST set next_token in the response // which can be used to get the next page of entries in the subsequent // ListVolumes call. This field is OPTIONAL. If not specified, it // means there is no restriction on the number of entries that can be // returned. uint32 max_entries = 2;

// A token to specify where to start paginating. Set this field to // next_token returned by a previous ListVolumes call to get the // next page of entries. string starting_token = 3;

// If specified, the Plugin MUST return ONLY the requested volumes. // This field is optional. repeated VolumeID volume_ids = 4; }

This update would enable COs to request one or more specific volumes and enable the filtering to occur on the remote side, reducing the size of the transported data and compute resources necessary to filter the list on the client side.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/container-storage-interface/spec/issues/54, or mute the thread https://github.com/notifications/unsubscribe-auth/ACPVLJQaL2TaAkjkUWiV0jXc5Ty9ePMuks5sIRPxgaJpZM4OGujT .

akutz commented 7 years ago

Hi @jdef,

After I posted this issue it did occur to me that the idempotent nature of CSI's mutative RPCs means looking up a single volume may not be necessary, but that doesn't mean it won't occur.

As for your question:

If a CO knows the vol ID already, shouldn't it have the rest of the information associated with the volume?

Not necessarily. And this is also where I may be conflating two things or at best misunderstanding intent. As I understand things a VolumeID is nothing more than a map[string]string, meaning that a VolumeID could be this:

volumeID := &csi.VolumeID{
    Values: map[string]string{
        "id": "vol-000",
    },
}

A volume ID could also be this:

volumeID := &csi.VolumeID{
    Values: map[string]string{
        "name": "My New Volume",
    },
}

The nature of a volume ID, with respect to CSI, is by design completely opaque and up to the storage provider. The CSI specification doesn't even declare that a volume ID should or must be unique. Instead that task is left to the volume info of all things:

      // Contains all attributes of the newly created volume that are
      // relevant to the CO along with information required by the Plugin
      // to uniquely identifying the volume. This field is REQUIRED.
      VolumeInfo volume_info = 1;

Therefore, from the perspective of a CO a user could attempt some operation that requires inspecting a volume to retrieve its information. The only natural assumption or course of action is that a user might enter the name of a volume (assuming the CO enforces some level of uniqueness) and then use that information to obtain the additional volume information.

It seems to me that the CSI specification is trying to treat the VolumeInfo type as a sort of truth-table that can be used to uniquely identify a volume. In fact, a simple text search of spec.md shows the word unique is used only twice: once with respect to a NodeID and once with respect to VolumeInfo. If that is the case then it begs the question why the VolumeID is multi-valued itself. Because if it's the VolumeInfo that is used to determine uniqueness then IMO the volume the ID should be simply a string.

tl;dr - A VolumeID should uniquely identify a volume, but the CSI specification indicates that is the job of a VolumeInfo object. If that's the case then the VolumeID should be a single-valued type, not a map[string]string. Otherwise it's confusing.