janko / uppy-s3_multipart

Provides Ruby endpoints for AWS S3 multipart uploads in Uppy
https://uppy.io/docs/aws-s3/
MIT License
65 stars 21 forks source link

Incompatibility with last uppy release regarding batching #23

Closed Intrepidd closed 3 years ago

Intrepidd commented 3 years ago

Since uppy 2.x I got the following error :

 ArgumentError (expected params[:part_number] to be an Integer, got value "batch" (class: String) instead.):

Looks like it comes from this commit : https://github.com/transloadit/uppy/commit/d613b849a6591083f8a0968aa8d66537e231bbcd#diff-bfb35efb13daef9d39e5d9a560b0a4ac5b0ad1c4c76bb461dc3724d5e5d7da7cL108-L109

What I understand is that uppy switched to batch uploading the attachment parts in one HTTP call which is not supported by this gem yet

I am not familiar with the S3 api but would love to help with a PR if you give me some pointers on how to implement it

hms commented 3 years ago

@janko Just a friendly bump as I just ran into this issue too/. Like Intrepidd, I'm willing to take a crack at fixing with a little guidance.

janko commented 3 years ago

Thanks for brining this to my attention. I would definitely appreciate a pull request. The things that are needed:

  1. adding the new GET /:upload_id/batch endpoint to Uppy::S3Multipart::App
    • the implementation should follow the uppy companion's implementation
    • the corresponding OPTIONS endpoint might be needed as well, I haven't checked if the new Uppy version sends that request as well
  2. the logic for generating the presigned URLs should be implemented in Uppy::S3Multipart::Client
    • you could probably call to #prepare_upload_part, it seems the logic for generating the signed URL is the same
    • generating the presigned URL for each part in a single request might be slow (there can be thousands of parts AFAIK); I see that the Node implementation parallelizes them, but I don't think we'll gain anything from parallelizing them in Ruby, because MRI cannot execute ruby code in parallel

Hopefully that's enough to get you started, let me know if you'll have any further questions.

jrochkind commented 3 years ago

Unfortunately the ruby AWS SDK is particularly slow at signing S3 URLs. But I guess that's no different than what it's always been doing, it shouldn't be a performance regression to do it in only one HTTP request, should be the same as ever?

I did write a faster S3 signing implementation. https://github.com/jrochkind/faster_s3_url

It's too bad to use uppy with direct to S3 uploads, we're stuck reverse engineering an ever-changing uppy server-side app, there's no standard for this. :(

janko commented 3 years ago

But I guess that's no different than what it's always been doing, it shouldn't be a performance regression to do it in only one HTTP request, should be the same as ever?

If the upload of the first chunk will start only after the request for fetching signed URLs for all parts finishes, then that could be a UX issue, because for the end user it would mean bigger delay before the progress bar starts moving. Previously a chunk would start uploading as soon as its signed URL was retrieved (via the endpoint for retrieving signed URL for a single part).

I did write a faster S3 signing implementation. https://github.com/jrochkind/faster_s3_url

It seems to me this is only for object URLs, not for custom operations such as upload_part.

It's too bad to use uppy with direct to S3 uploads, we're stuck reverse engineering an ever-changing uppy server-side app, there's no standard for this. :(

Well, anyone can run the uppy companion server instead of using this gem, which will be up-to-date. This gem exists mainly for convenience if you don't want to run a separate node process. Also, this is the first time there was a backwards incompatible change, and it was only introduced in a new major version of Uppy.

janko commented 3 years ago

According to the official blog post, it seems the new endpoint is not going to receive all parts, but batches of parts (like the endpoint name suggests). So I don't think batch URL signing will cause a performance issue.

janko commented 3 years ago

Thanks to @Intrepidd, version 1.2.0 now ships with the new endpoint 🎉

jrochkind commented 3 years ago

Thanks for the quick action y'all, really appreciate it!