container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0
1.34k stars 373 forks source link

Add PublishedNodes field to ListVolumes Response #374

Closed davidz627 closed 5 years ago

davidz627 commented 5 years ago

Helps the CO reconcile the actual state when the volume may have been Unpublished from the node out of band from the CO.

/assign @saad-ali /cc @xing-yang @gnufied @julian-hj /cc @msau42

PTAL Discussion: https://docs.google.com/document/d/1-oiNg5V_GtS_JBAEViVBhZ3BYVFlbSz70hreyaD7c5Y/edit#heading=h.z24362cngqjs

saad-ali commented 5 years ago

CC @julian-hj @ddebroy

jieyu commented 5 years ago

Can you make sure the protobuf file is also updated accordingly. There's a makefile target for the generation.

davidz627 commented 5 years ago

How do we feel about adding something like:

List nodes published MAY return volumes not published or created by the driver. The CO MUST be resilient to that
jieyu commented 5 years ago

@davidz627 that sounds reasonable to me

davidz627 commented 5 years ago

@saad-ali @jieyu resolved comments, checks passing, improved wording, PTAL

davidz627 commented 5 years ago

One (tangential) q I have around this: Is there a reason NodeGetInfoResponse does not report the list of volumes that has been published to the node?

I don't think that there's a particular reason it doesn't. But for this proposal IMO it is more efficient to have the one list volumes call give back all volumes and where they're attached. Also for some reason it states specifically: The result of this call will be used by CO in ControllerPublishVolume. which is not what we'll be using it for.

gnufied commented 5 years ago

LGTM apart from the minor nit above.

davidz627 commented 5 years ago

@jdef @jieyu @julian-hj @ddebroy @saad-ali Anything else needed for approval/review?

gnufied commented 5 years ago

One thing worth mentioning here is - depending on SP, the published_node_ids field value might be out of date(stale). For example - on AWS because of eventual consistent nature of read api calls, the describe_volume API calls could return stale attachment information. In kubernetes - we handle this by processing out of band attach/detaches as "uncertain", because only a mutable api call guarantees that volume information was accurate.

So, if a CSI driver returns stale data in published_node_ids, is that non-compliant/buggy? What actions CO should take in that case?