TritonDataCenter / node-manta

Node.js SDK for Manta
75 stars 54 forks source link

joyent/node-manta#301 Add an option to that will upload a file using the Manta multipart upload API #371

Open joyent-automation opened 4 years ago

joyent-automation commented 4 years ago

joyent/node-manta#301 Add an option to that will upload a file using the Manta multipart upload API

This PR was migrated-from-gerrit, https://cr.joyent.us/#/c/3107/. The raw archive of this CR is here. See MANTA-4594 for info on Joyent Eng's migration from Gerrit.

CR discussion

Patch Set 1 code comments
bin/mmpu#515 @joyent-automation

warning: identifer ctx hides an identifier in a parent scope

bin/mmpu#517 @joyent-automation

warning: identifer filename hides an identifier in a parent scope

bin/mmpu#524 @joyent-automation

warning: undeclared identifier: size

bin/mmpu#529 @joyent-automation

warning: undeclared identifier: size

bin/mmpu#529 @joyent-automation

warning: undeclared identifier: size

bin/mmpu#535 @joyent-automation

warning: empty statement or extra semicolon

bin/mmpu#537 @joyent-automation

warning: identifer ctx hides an identifier in a parent scope

bin/mmpu#560 @joyent-automation

warning: identifer ctx hides an identifier in a parent scope

bin/mmpu#562 @joyent-automation

warning: undeclared identifier: parts

bin/mmpu#565 @joyent-automation

warning: undeclared identifier: funcs

bin/mmpu#569 @joyent-automation

warning: undeclared identifier: upload_part

bin/mmpu#569 @joyent-automation

warning: identifer callback hides an identifier in a parent scope

bin/mmpu#581 @joyent-automation

warning: undeclared identifier: upload_part

bin/mmpu#581 @joyent-automation

warning: undeclared identifier: funcs

bin/mmpu#584 @joyent-automation

warning: undeclared identifier: funcs

bin/mmpu#592 @joyent-automation

warning: empty statement or extra semicolon

bin/mmpu#594 @joyent-automation

warning: identifer ctx hides an identifier in a parent scope

bin/mmpu#609 @joyent-automation

warning: empty statement or extra semicolon

Patch Set 4 code comments
bin/mmpu#272 @jordanhendricks

This seems like an unintentional deletion.

Patch Set 5 code comments
test/mmpu.test.js#683 @joyent-automation

warning: variable is declared but never referenced: largeFileMd5Digest

test/mmpu.test.js#683 @joyent-automation

warning: missing semicolon

@jordanhendricks commented at 2018-01-02T18:04:52

Patch Set 6:

(13 comments)

Thanks for taking this on! Sorry for the delay getting to reviewing it.

Out of curiosity, was there a reason you wanted to do this under mmpu (instead of mput as I suggested in the ticket)? I think it could make sense under either command, but just wanted to know what your thoughts were.

Patch Set 6 code comments
bin/mmpu#272 @jordanhendricks

This seems unintentional.

bin/mmpu#272 @timkordas

indeed. I probably accidentally stripped it out when I pulled some other debugging out. thanks!

bin/mmpu#541 @jordanhendricks

As I noted below, I think we would probably want the program to calculate this for the user.

bin/mmpu#541 @timkordas

sure! I didn't realize I was doing this wrong. So by re-ordering the steps in the pipeline below (calling splitFile first) I think we get this "for free."

bin/mmpu#547 @jordanhendricks

If you get an error from create-mpu, it's expected that you don't need to abort the MPU (because it doesn't exist).

bin/mmpu#547 @timkordas

great!

bin/mmpu#564 @jordanhendricks

This is tricky. To deal with transient errors, I think we would want retry uploading some reasonable number of times, like mput does. In the case of a non-transient error, I think we would probably want to let the user deal with it. If, for instance, all but 1 parts of a large file uploaded successfully, it would be inconvenient of the program aborted the MPU. On the other hand, as it is now, I think it would be difficult to retry the upload by hand as a user, since you would need to manually split the file based on which parts failed.

What do you think about having an option to retry uploading parts for a given MPU ID? A really basic implementation would do the entire split/upload process again, only without initiating the MPU. (This could be under another ticket too.)

bin/mmpu#564 @timkordas

the retries make sense to me, I'll try to put something together which does that.

bin/mmpu#574 @jordanhendricks

Would it be better to use the vasync.parallel construct here, so we can upload many parts at once?

bin/mmpu#574 @timkordas

I guess I assumed we'd be I/O bound, so didn't see the advantage. but sure!

bin/mmpu#593 @jordanhendricks

I think my comments on the similar situation in doUploads() apply here, too.

bin/mmpu#593 @timkordas

this one is maybe tricker, and depends on what the nature of the failure is. I'll have to think about it a bit.

bin/mmpu#625 @jordanhendricks

I think we need to close the client and invoke the callback in either case, right?

bin/mmpu#625 @timkordas

that sounds right to me.

bin/mmpu#653 @jordanhendricks

It seems like this option wouldn't be needed since the program can stat the file.

bin/mmpu#653 @timkordas

Done

bin/mmpu#659 @jordanhendricks

Would it be better to call this something like "PARTSIZE" to better mirror the API terminology? Or is that confusing?

bin/mmpu#659 @timkordas

Done

bin/mmpu#669 @jordanhendricks

mput has an option to calculate the MD5 for the user. I think that's probably what we would want to do here, too. What do you think?

bin/mmpu#669 @timkordas

ah, I was mirroring the API of the mmpu-create call; but your suggestion makes more sense.

I'll make this a boolean and do the calculation (and will upate this help text).

bin/mmpu#680 @jordanhendricks

I think I would do a more use-case oriented summary here. Something like:

"Given a local file and a path in Manta, upload the file to Manta as a multipart upload."

bin/mmpu#680 @timkordas

Done

bin/mmpu#685 @jordanhendricks

s/mpu/mmpu

bin/mmpu#685 @timkordas

Done

@timkordas commented at 2018-01-02T22:37:19

Patch Set 7:

(7 comments)

@jordanhendricks commented at 2018-01-12T00:39:34

Patch Set 7:

PS 6 looks good! I have a few comments from the first round of review I don't see responses on -- I wasn't sure if you were working on feedback for those or not.

Thanks again for doing this!

@timkordas commented at 2018-01-12T18:19:55

Patch Set 7:

(5 comments)

@jordanhendricks commented at 2018-01-22T18:07:05

Patch Set 9:

(9 comments)

Patch Set 9 code comments
bin/mmpu#496 @jordanhendricks

nit: pull this code into a helper function (since do_create uses it as well)?

bin/mmpu#496 @timkordas

that somewhat breaks the parallel structure (all of the commands in this file have quite similar pre-ambles), and is not super-great looking when the error-callback stuff is added in. I'll try.

bin/mmpu#496 @jordanhendricks

Sure. It's not a big deal if it's not easy to do.

bin/mmpu#546 @jordanhendricks

nit: I like to use brackets for if/else blocks to improve readability.

bin/mmpu#546 @timkordas

sure. I'll fix all similar occurrences. I prefer the braces, but BSD KNF requires their absence, so I've been trying to do it their way.

bin/mmpu#546 @jordanhendricks

Hah. I think I've just been bitten by one too many "forgot to add brackets when adding code to an else block" bugs!

bin/mmpu#562 @jordanhendricks

Assert that ctx.calculatedMd5 exists in this case?

bin/mmpu#562 @timkordas

Done

bin/mmpu#585 @jordanhendricks

Let's print a warning in this case notifying the user they may want to abort the MPU?

bin/mmpu#585 @timkordas

Done

bin/mmpu#586 @jordanhendricks

Should we be calling partCallback here?

bin/mmpu#586 @timkordas

yes, thanks.

bin/mmpu#595 @jordanhendricks

Should this be a vasync.parallel call?

bin/mmpu#595 @timkordas

it isn't clear to me that that's better or necessary. Is it ?

bin/mmpu#595 @jordanhendricks

I'm not sure how it would make it worse, and I assumed it would be the same level of code complexity to change it. But I haven't measured it, so I could be wrong.

bin/mmpu#615 @jordanhendricks

I would probably print a warning here telling the user they might want to abort the MPU if it can't be retried.

bin/mmpu#615 @timkordas

Done

bin/mmpu#693 @jordanhendricks

"Manta"

bin/mmpu#693 @timkordas

Done

bin/mmpu#699 @jordanhendricks

The "--md5" option differs for mmpu create and mmpu put.

bin/mmpu#699 @timkordas

Done

@timkordas commented at 2018-01-22T23:32:00

Patch Set 10:

(9 comments)

@jordanhendricks commented at 2018-01-25T17:16:04

Patch Set 10:

(5 comments)

Patch Set 10 code comments
bin/mmpu#593 @jordanhendricks

I think this should probably be logged to stderr. Same with the other warning(s).

bin/mmpu#672 @jordanhendricks

I think cmdln will print the error for you if it's passed to the callback, as it is here.