Closed carlbraganza closed 7 months ago
What about this PR https://github.com/container-storage-interface/spec/pull/522?
What about this PR #522?
It has been closed.
cc @bswartz
cc @jdef
@carlbraganza can you make sure you've signed the CLA. See https://github.com/container-storage-interface/spec/blob/master/CONTRIBUTING.md#licence-and-cla
@carlbraganza can you make sure you've signed the CLA. See https://github.com/container-storage-interface/spec/blob/master/CONTRIBUTING.md#licence-and-cla
Thanks. Looking into getting this signed.
@jdef This comment was in an outdated part of the document so I'm not sure if it was addressed:
why aren't backend secrets configured at the plugin level? why do they need to be part of the RPC here? is this an exercise in future-proofing?
This was brought to our attention by @nixpanic, because the Ceph CSI plugin supports multiple sets of secrets (per Pool, I believe).
thanks for being patient. completed another round of review
Thanks @jdef. I tried to address all the items you surfaced. PTAL.
After responding to @bswartz's previous comment, it occurred to me that requiring the starting_offset
to be present in the first tuple range is not correct, because there may not be allocated/changed blocks in the range including that offset. What we really want is that the SP should _consider this startingoffset in its computation when generating the data stream. Accordingly, I changed the text to the following:
// The computation of the data range in the first BlockMetadata // message in the stream of GetMetadataAllocatedResponse messages // returned MUST consider the starting_offset requested in the // GetMetadataAllocatedRequest message, though it is not required // that the first data range tuple returned actually include the // starting_offset, as the first allocated block range may start // after this offset.
The language is likewise modified in the description of the GetMetadataDeltaResponse
message.
I don't know if this negatively impact's @bswartz's request for precise language that can be tested, but it is more accurate.
I'm thinking somewhere in the spec.md file (not the proto file) in the section about change-block tracking we need to mention in plain English how the SP is responsible for ensuring EVERY changed block gets communicated to the CO before reporting the end of stream, and that the CO is responsible for picking the correct starting_offset
when restarting an interrupted connection, in order to ensure that it doesn't miss any blocks. The stakes are high for not screwing up this metadata transmission, so it's worth recommending caution to implementers not to be careless.
@bswartz wrote:
I'm thinking somewhere in the spec.md file (not the proto file) in the section about change-block tracking we need to mention in plain English how the SP is responsible for ensuring EVERY changed block gets communicated to the CO before reporting the end of stream, and that the CO is responsible for picking the correct
starting_offset
when restarting an interrupted connection, in order to ensure that it doesn't miss any blocks. The stakes are high for not screwing up this metadata transmission, so it's worth recommending caution to implementers not to be careless.
I think this is a great idea. I'm adding the following section between the description of the metadata format and the RPC definitions:
### Special considerations for processing a metadata gRPC stream
The remote procedure calls of this service return snapshot metadata within a gRPC stream of response messages.
There are some important things for the SP and CO to consider here:
- In normal operation an SP MUST terminate the stream only after ALL metadata is transmitted,
using the language specific idiom for terminating a gRPC stream source.
This results in the CO receiving its language specific notification idiom for the end of a gRPC
stream, which provides a **definitive indication that all available metadata has been received**.
- If the SP encounters an error while recovering the metadata it MUST abort transmission of the
stream with its language specific error idiom.
This results in the CO receiving the error in its language specific idiom, which will be different
from the language specific idiom for the end of a gRPC stream.
- It is possible that the gRPC stream gets interrupted for arbitrary reasons beyond the control
of either the SP or the CO.
The SP will get an error when writing to the stream and MUST abort its transmission.
The CO will receive an error in its language specific idiom, which will be different from the
language specific idiom for the end of a gRPC stream.
- In all circumstances where the CO receives an error when reading from the gRPC stream it
MAY attempt to **continue** the operation by re-sending its request message but with a
`starting_offset` adjusted to the NEXT byte position beyond that of any metadata already received.
The SP MUST always ensure that the `starting_offset` requested be considered in the
computation of the data range for the first message in the returned gRPC stream, though the
data range of the first message is not required to actually include the `starting_offset` if there
is no applicable data between the `starting_offset` and the start of the data range returned by the first message.
(Updated after @jdef's feedback and formatted for display above).
skimmed through the changes yesterday, LGTM
only remaining nit is the needless newline whitespaces that pad the new protobuf blocks
@carlbraganza Can you squash the commits?
@carlbraganza I can't find evidence of a signed CLA. Did that end up happening?
BTW I can squash the commits too, but we should add some more detail to the squashed commit message.
Rebased on master and squashed comments.
As soon as we get a signed CLA from the author(s) we can merge this.
I can confirm we've received CLA from Veeam including Carl Braganza
I looked at the build test failure - my guess is that I used the wrong version of protoc.
You probably need to rebase on HEAD and rebuild. We merged a patch that changed the Makefile significantly and also updated the protoc compiler version
What type of PR is this? This is a proposed addition to the CSI specification.
What this PR does / why we need it: It adds a SnapshotMetadata service through which one can obtain allocated or changed block metadata for snapshots.
Which issue(s) this PR fixes: n/a
Special notes for your reviewer: Derived from https://github.com/kubernetes/enhancements/pull/4082
Does this PR introduce an API-breaking change?: