Open jordanhendricks opened 6 years ago
There are a couple ways I have thought about doing this:
# committing zero-byte object intentionally
$ mmpu commit bde8ee50-dcdb-6d09-a2e8-afcfcee440d3
mmpu: commit empty object? y
# committing zero-byte object unintentionally
$ mmpu commit bde8ee50-dcdb-6d09-a2e8-afcfcee440d3
mmpu: commit empty object? n
The upside of this approach is that it would likely prevent all users from ever making the mistake above. The downside is that we would be converting the CLI from non-interactive to interactive, which I am not really a fan of (and may require a version bump).
mmpu commit
that indicates you want the CLI to fetch the etags of an MPU's parts as they currently exist in Manta so that you don't have to specify them yourself.Something like:
$ mmpu commit -e bde8ee50-dcdb-6d09-a2e8-afcfcee440d3
This approach adds some latency to the CLI invocation, as it will need to make additional requests to fetch the upload directory entries. (With an API max of 10000 parts per multipart upload, and 1000 results max per directory listing, this means at worst 10 additional requests for a given commit.)
It does not necessarily prevent first-time users from accidentally committing an empty object, but a flag would appear in the flag list in the help message, which would hopefully mean it's more likely to be noticed by a user running mmpu commit -h
. Additionally, it would not change the CLI from non-interactive to interactive, as option 1 does.
With those tradeoffs in mind, I'm inclined to choose option 2. I am open to other ideas as well!
Is it just as important for the CLI that the user be very explicit about which parts they want? I wonder if we could make your suggested flag the default behavior and only require the full list of parts if the list that's on the server is ambiguous in some way. We could still allow people to specify specific etags.
@davepacheco: I don't think it's as important in most cases for the CLI. Having the flag on by default would be a breaking change, though.
I would also say I think it makes the most sense to have the flag on by default in the case that no etags are given (mmpu commit <id>
), and if etags are specified, (mmpu commit <id> <etag0> <etag1> ...
), then to assume those are the etags the user wants to use. This would prevent breakage for consumers doing non-zero-byte commits.
Yes, I agree that if the user specified etags, we should just use those. Given that, I think the only cases where this would come up are cases where the command would have failed anyway, so I don't think any correct usage would be affected. Right?
After some discussion offline, we came up with a scheme we think makes sense:
mmpu commit <id>
will look up the etags of all parts uploaded for the MPU and use those in the commit request. This behavior makes sense as the default, as it's what users seem to expect, and it has the added benefit of preventing user error when keeping track of etags of parts manually. It will also reflect the current behavior of creating a zero-byte object for MPUs that have no parts uploaded, as mmpu
will see that no parts have been uploaded, and commit no parts accordingly.mmpu commit <id> <etag0> <etag1>
), then mmpu
will not look up the etags for them and simply submit the ones provided. This is consistent with the current behavior of mmpu
, allowing scripts that commit non-empty subsets of parts uploaded to an MPU to work.mmpu commit --commit-empty
. This will break users that expect invoking mmpu commit <id>
to commit a zero-byte object even if parts are uploaded to create an empty object.This scheme requires a major version bump because of item (3): Users expecting mmpu commit <id>
to commit an empty object for an MPU that has parts uploaded to it will be broken by this change.
Committing a zero-byte object after uploading parts to the MPU is a rare case as far as I know. Additionally, this scheme seems like it will improve the usability for the vast majority of new users without breaking the most common uses of mmpu
--- committing parts uploaded to an MPU by specifying their etags manually. As such, I think it's a good approach to solving this.
A frequent mistake new users of
mmpu
make is not specifying part etags when committing the object:The MPU API requires part etags because parts are normal Manta objects and can thus be overwritten. Specifying the etag ensures the client is committing the correct object. To mirror the API, the
mmpu
CLI requires the etags the user wants to commit.Using the CLI is well-documented (in both the
mmpu commit
help message and themmpu
man page). Still, given that so many users expectmmpu commit
to commit parts they've uploaded without the additional etag specification step, it would be good to improve the usability of this command.