greghendershott / aws

Racket support for Amazon Web Services.
BSD 2-Clause "Simplified" License

multipart-put and multipart-put/file: Better error handling #48

Closed: greghendershott closed this issue 8 years ago

greghendershott commented 9 years ago

As discussed in #46, the convenience functions multipart-put and multipart-put/file ought to handle things like exn:fail:network?. After all, being able to resume an interrupted upload is one of the main advantages of multipart uploads. Also, as a default in case the user doesn't want to deal with attempting to resume, the functions should automatically use abort-multipart-upload to clean up (so the user isn't paying for parts sitting on S3).

Quick/rough brain dump:
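
For example, the abort-by-default behavior might look roughly like this (a rough sketch, not actual internals; `initiate`, `put-parts`, and `complete` are placeholder callbacks standing in for the real steps):

```racket
#lang racket
(require aws/s3) ; for abort-multipart-upload

;; Sketch only: wrap the part uploads so that a network failure aborts
;; the upload by default before re-raising the exception.
(define (put-with-cleanup bucket+path initiate put-parts complete)
  (define upload-id (initiate bucket+path))
  (with-handlers ([exn:fail:network?
                   (λ (e)
                     ;; Don't leave abandoned parts on S3, where the
                     ;; user would keep paying for their storage.
                     (abort-multipart-upload bucket+path upload-id)
                     (raise e))])
    (complete bucket+path upload-id (put-parts upload-id))))
```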

krrrcks commented 9 years ago

I think a failure-proc would be helpful, to get the upload-id and parts list and decide how to proceed. A function for suspending an upload would be nice, but it's nothing I really need at the moment; I couldn't think of a situation where I would need it.
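
Roughly the shape I have in mind (hypothetical -- multipart-put/file has no such #:failure-proc keyword today; this is just the suggestion):

```racket
#lang racket
(require aws/s3)

;; Hypothetical API: on failure, the callback gets the upload-id and the
;; parts uploaded so far, and the caller decides how to go on
;; (persist them, abort, retry, ...).
(multipart-put/file "my-bucket/big-file.bin"
                    (string->path "big-file.bin")
                    #:failure-proc
                    (λ (upload-id parts-done)
                      (log-info "upload ~a stopped after ~a parts"
                                upload-id (length parts-done))))
```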

greghendershott commented 9 years ago

A suspend function will be needed internally, to use when handling e.g. exn:fail exceptions. Multipart uploads use a pool of 4 threads; if one fails unrecoverably, the whole pool needs to be stopped gracefully.
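
As a sketch of that pool-shutdown idea (an assumed design, not the library's actual code): workers report failures on a channel, and a shared custodian lets the supervisor stop everything at once.

```racket
#lang racket

;; Sketch: run jobs on a small worker pool; on the first exn:fail,
;; shut the whole pool down via the custodian and re-raise.
(define (run-pool do-job jobs #:workers [n 4])
  (define cust    (make-custodian))
  (define job-ch  (make-channel))
  (define fail-ch (make-channel))
  (define done    (make-semaphore 0)) ; posted once per finished job
  (parameterize ([current-custodian cust])
    ;; Workers: take a job, run it, report exn:fail on fail-ch.
    (for ([_ (in-range n)])
      (thread
       (λ ()
         (let loop ()
           (with-handlers ([exn:fail? (λ (e) (channel-put fail-ch e))])
             (do-job (channel-get job-ch))
             (semaphore-post done))
           (loop)))))
    ;; Feeder: hand out the jobs one at a time.
    (thread (λ () (for ([j (in-list jobs)]) (channel-put job-ch j)))))
  ;; Supervisor: count completions, or stop everything on first failure.
  (let wait ([remaining (length jobs)])
    (if (zero? remaining)
        (custodian-shutdown-all cust) ; all done: reap idle workers
        (sync (handle-evt done (λ (_) (wait (sub1 remaining))))
              (handle-evt fail-ch
                          (λ (e)
                            (custodian-shutdown-all cust) ; stop the pool
                            (raise e)))))))
```

That way an unrecoverable exn:fail in one part upload stops the other workers instead of leaving them running.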

Once that's figured out correctly, I think it would be helpful to provide it publicly, too.

Example: Racket presents break, kill, and hangup signals as exn:break exceptions. If the aws client is going to be killed intentionally, it would be helpful for it to catch these using with-handlers or call-with-exception-handler and suspend the multipart upload in a way that could be resumed later. (For example, in many places residential broadband is not so fast, especially for uploads. Having to start over from scratch isn't great.)

Unlike exn:fail, I feel breaks should be left to the client of the aws library to handle; the client might want to use a suspend function. Although I suppose I could handle breaks and re-raise them for the client. Just thinking out loud here.
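
For instance, a client could do something like the following (a sketch; note-interrupted-upload! is a hypothetical client-side hook, not part of this library):

```racket
#lang racket
(require aws/s3)

;; Hypothetical client-side hook (a stub, just for the sketch):
(define (note-interrupted-upload! where)
  (log-info "interrupted upload of ~a; plan to resume" where))

;; Catch exn:break around the upload, do whatever suspend/bookkeeping
;; is wanted, then re-raise so the break still takes effect.
(with-handlers ([exn:break?
                 (λ (e)
                   (note-interrupted-upload! "my-bucket/big-file.bin")
                   (raise e))])
  (multipart-put/file "my-bucket/big-file.bin"
                      (string->path "big-file.bin")))
```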


In any case, it may be a while before I can work on this item at all....

krrrcks commented 9 years ago

Yeah, that sounds interesting and like a really nice feature. For me it wouldn't be a high priority, though; more of a "nice to have" enhancement.

Edit: For my part, I try to monitor my S3 usage with CloudWatch, and if something looks weird I can dive into the multipart uploads with the AWS CLI.

greghendershott commented 9 years ago

@krrrcks Thanks for letting me know that -- it helps me prioritize.

I shouldn't do this, at least not soon. [Unless I have time and want to work on it for fun. :)]

greghendershott commented 9 years ago

So this was bothering me and I kept working on it.

I spent some time exploring how to make the worker pool handle exn:break cleanly, and return lists of "done" and "to-do" parts. Then I realized it didn't matter. I could focus on resuming, regardless of how cleanly it got interrupted (and without the need to persist a list of done/to-do parts locally).

So I pushed a commit with a couple "experimental" functions: incomplete-multipart-put/file and resume-multipart-put/file. Although the package docs haven't rebuilt yet, the commit message and aws.scrbl changes should explain it pretty well?
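
Roughly, the intended usage is along these lines (a hedged sketch -- the exact arguments and contracts are whatever aws.scrbl in the commit says; these are approximations):

```racket
#lang racket
(require aws/s3)

;; If an incomplete upload for this file exists, resume it; otherwise
;; start a fresh multipart put. Argument details are approximate.
(define b+p "my-bucket/big-file.bin")
(define f   (string->path "big-file.bin"))
(cond [(incomplete-multipart-put/file b+p f)
       => (λ (upload-id) (resume-multipart-put/file b+p f upload-id))]
      [else (multipart-put/file b+p f)])
```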

I marked them experimental because my testing worked but was fairly limited. And although I'm fairly confident it's OK to use List Parts this way -- since I'm providing Content-MD5 checksums on the uploaded parts and ensuring they match -- I'm not 100% sure.

greghendershott commented 8 years ago

This has been open for a while, and I'm satisfied the commit closes this issue, until/unless someone feels otherwise.