container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0
1.34k stars 373 forks source link

Clarify staging and target paths for blocks #335

Closed bswartz closed 5 years ago

bswartz commented 5 years ago

The vague wording around the staging and target paths left open too many possible implementations for raw block volumes, and implementors ended up doing non-obvious things. This change clarifies the contract between the CO and the SP when staging and publishing raw block volumes on the node.

jdef commented 5 years ago

related to #287

jdef commented 5 years ago

I'm not crazy about the specificity of this PR. Though I understand that it does help to clarify things for SP authors. I'm hopeful that we can move to something more generic (in a non-breaking way) once we support something like what's been proposed in #96, where the CO has a bit more control over mount operations.

bswartz commented 5 years ago

@jdef I'm trying to understand the problems that the solution proposed in that issue is trying to solve. The discussion seemed to argue for what we have now, which is node plugins that do the mounts themselves. The only specific use case I could find (sshfs) strongly favors the current design. What would be gained by modifying the publish interface to offer another way?

saad-ali commented 5 years ago

@bswartz this is related to another discussion where some one proposed potentially allowing CO to control the final mount. See https://github.com/container-storage-interface/spec/issues/96#issuecomment-435147893

That would be an additive change we can consider in the future (post v1). I believe @jdef is arguing that would be a way to introduce the flexibility that is potentially lost with this PR.

jdef commented 5 years ago

@bswartz AFAICT the proposal in #96 would push CSI in the direction of the CO executing the mount syscall vs. the plugin. Also suggested by the OP of the proposal was that the same mechanism could be extended to support more exotic "mounting" of volumes, such as what might be needed for VM disk images - something that #166 would like to see supported as well.

Current thinking is that we could probably make backwards compatible changes to CSI, post-v1, such that the target_staging_path and target_path mechanism could be the default/fallback, and that we could add a new "capability" somewhere that lets the CO discover plugin support for more exotic/flexible volume "mounting".

The specificity of this PR, as it's currently written, cuts both ways: it probably simplifies things for most plugins that will be written against CSI v1.0.0; it also dictates bits of implementation and I think that's where the CO owners have expressed concern. Ideally, we'd avoid over-specifying implementation in the CSI spec as long as the interface between plugin and CO is clear (which is also the point of contention here, right?). From the community discussion today, I'm not 100% convinced that this PR is the best, long-term approach.

bswartz commented 5 years ago

@saad-ali I'm just trying to figure out why anyone would want to do that. What's the advantage of allowing the CO to control the final mount vs the status quo?

Also I don't see how the version of the spec prior to my change makes it any easier to do that. The existing spec provides NO mechanism for the SP to return anything to the CO on publish, so it's implied that something like what we current do is what MUST be done.

In the future we can add an optional mechanism to return something that the CO could use to perform the mount, but SPs could never assume the CO would want to use that mechanism and therefore would have to support the current model of mounting inside the SP as the default. This language change doesn't make an addition like that any more difficult.

jdef commented 5 years ago

That would be an additive change we can consider in the future (post v1). I believe @jdef is arguing that would be a way to introduce the flexibility that is potentially lost with this PR.

To be clear, I don't think we lose flexibility to do something for #96 with the introduction of this PR. I was trying to say that we still have an escape hatch if we find out (as I think we will) that this PR is too restrictive for all the use cases that we want to support.

saad-ali commented 5 years ago

Let's stay focused on this PR. We can discuss the merits of #96 on that issue.

bswartz commented 5 years ago

@jdef Okay I think I understand. As long as we agree that this PR doesn't make the problem you want to solve any harder. The staging path stuff shouldn't affect that future enhancement because it's really just a clarification of what SPs are already depending on, and doesn't impose a new burden on COs. The publish target path stuff could all be overridden by a hypothetical future enhancement to the node publish call.

bswartz commented 5 years ago

@julian-hj For filesystem (mount access type) volumes I absolutely agree, and will change the wording. For block access type volumes, I'm okay loosening the language as long as it's still clear what the CO should or should not create, so it's clear what's left for the SP to do.

bswartz commented 5 years ago

@lpabon I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod. In this case we could simply require that the CO ensures the parent directory exists. I can't imagine any problems with that approach, but @mkimuram will need to modify his kubernetes patch to accomodate that.

mkimura commented 5 years ago

You’re using the wrong tag

On Nov 14, 2018, at 5:50 PM, Ben Swartzlander notifications@github.com wrote:

@lpabon I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod. In this case we could simply require that the CO ensures the parent directory exists. I can't imagine any problems with that approach, but @mkimura will need to modify his kubernetes patch to accomodate that.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

saad-ali commented 5 years ago

@mkimuram not mkimura

bswartz commented 5 years ago

Apologies for misspelling the name

mkimuram commented 5 years ago

@bswartz @lpabon

I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod.

If you are talking about NodePublishVolume/NodeUnpublishVolume, won't this change affect the assumption that there is no difference in NodeUnpublishVolume logic between block and mount access type? I understand that we originally assume that NodeUnpublishVolume requires just unmounting from target path in both case. However, if NodeUnpublishVolume is required to do something different from unmount, it might need information on which access type is requested, which is not passed to NodeUnpublishVolume, to decide what to do for the path. (The decision to choose other than mount/unmount is up to SP, so it could be at their own risk, though.)

bswartz commented 5 years ago

@mkimuram SPs only need to do something different if they create different things at the publish stage. If they just create a mounted directory or a bind mounted file, then blindly calling umount on the target path would be relatively safe. If on the other hand the stuff that gets created can vary significantly, then the SP would need to detect what got created by calling stat() on the target path at unpublish time or use an external mechanism to remember what kind of volume it is.