alexmojaki / s3-stream-upload

Manages streaming of data to AWS S3 without knowing the size beforehand and without keeping it all in memory or writing to disk.
MIT License
208 stars 62 forks

Support AWS SDK version 2 #23

Open danfink opened 4 years ago

danfink commented 4 years ago

s3upload currently depends on AWS SDK version 1. AWS SDK version 2 is more flexible and is what will be supported going forward. As more applications, like ours, upgrade to version 2, it becomes more important to support the current version from AWS.

alexmojaki commented 4 years ago

Do you think I should make the library support (but not require installing) both, and if so, how do I specify that in terms of maven dependencies and build?

danfink commented 4 years ago

That's an interesting question.

I can see that both versions can be supported at the same time (https://docs.aws.amazon.com/sdk-for-java/v2/migration-guide/aws-sdk-java-mg.pdf under "Using the SDK for Java 1.x and 2.x Side by Side") so it is possible to include support for both. That said, it seems that the following would support both older and newer users:

  1. Upgrade s3upload to only use SDK 2.
  2. If someone is using SDK 1, then both libraries will be included and they won't have a problem.
  3. If someone is using SDK 2, they will just have the SDK 2 library included.

How does this sound?

alexmojaki commented 4 years ago

OK, I started on this in #24 but it's incomplete. I'm not planning on finishing it off myself unless I hear a real use case where the v2 SDK is actually needed. If you could help that'd be great.

oldirtybasti commented 4 years ago

Thank you for this great piece of software! I have upgraded my application to the v2 SDK but unfortunately still need to have v1 on the classpath. Just wanted to let you know that there are other people out there having the same problem.

alexmojaki commented 4 years ago

@oldirtybasti but what is the problem exactly with having v1 on the classpath? Is it just a bit more memory usage? Are there naming conflicts?

I'm glad you like the software but I work on a lot of open source software and this one is low on the list of priorities. If you really want this, consider helping with #24.

oldirtybasti commented 4 years ago

The technical impact indeed is minor. It is more about aesthetics: Keeping the classpath as tidy as possible and all artefacts small. I understand that this feature is of low priority.

ericpapaluca commented 2 years ago

> OK, I started on this in #24 but it's incomplete. I'm not planning on finishing it off myself unless I hear a real use case where the v2 SDK is actually needed. If you could help that'd be great.

We're facing an issue where support for this would be fantastic, as we are trying to use AWS X-Ray together with your library. This doesn't work with the V1 SDK, which is marked as a Won't Fix here: https://github.com/aws/aws-sdk-java/issues/1572

I will have a look at #24 above and see if I can get it working in the meantime and contribute something back.

P.S. Love your tool :)

Kevin-Luke commented 4 months ago

Is there a timeline for when V2 support will be merged? Is there an alternative library for this?

alexmojaki commented 4 months ago

It requires someone else to do the necessary work that I started in https://github.com/alexmojaki/s3-stream-upload/pull/24. This library is too old and niche nowadays for me to spend time on it personally.

> Is there an alternative library for this?

The v2 SDK itself is supposed to support it nowadays: https://github.com/aws/aws-sdk-java-v2/issues/139

alexmojaki commented 2 months ago

I'm still confused about this - doesn't the AWS v2 SDK now support this feature directly?

https://docs.aws.amazon.com/AmazonS3/latest/userguide/example_s3_Scenario_UploadStream_section.html

Why is this library still needed for v2 users?

cc @skiyooka

skiyooka commented 2 months ago

Unfortunately no, the AWS v2 SDK's putObject requires the length of the input stream up front. This is of course problematic in the use case of creating a .zip on the fly (from S3 objects) and streaming it back to S3, as the size of the .zip is unknown until it is fully created.

alexmojaki commented 2 months ago

Not putObject, but S3TransferManager.upload, like in the link above. The title says "Upload a stream of unknown size to an Amazon S3 object using an AWS SDK"

skiyooka commented 2 months ago

Ah you're correct. I was not aware of S3TransferManager and I somehow completely missed it when I was viewing the S3 docs.

In subsequent testing I was able to accomplish what I need by setting up a PipedInputStream/PipedOutputStream pair and tying the .zip file creation to the S3TransferManager's upload of a stream of unknown size.
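
For anyone wanting to try the same thing, here is a minimal sketch of the pipe setup using only the JDK. It is not code from this thread: the class name `PipedZipSketch`, the helper `zipThroughPipe`, and the stand-in consumer `drain` are hypothetical names made up for illustration, and the actual S3 upload is deliberately left out. The .zip is written on a background thread while the calling thread reads the other end of the pipe, so the whole archive never has to sit in memory on the writing side.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class PipedZipSketch {

    /**
     * Writes the given entries as a .zip into a pipe on a background thread,
     * while `consumer` reads the other end of the pipe on the calling thread.
     * In a real setup the consumer would be the upload; here it is a parameter.
     */
    static <T> T zipThroughPipe(Map<String, byte[]> entries,
                                Function<InputStream, T> consumer) throws IOException {
        PipedInputStream in = new PipedInputStream(64 * 1024); // pipe buffer size
        PipedOutputStream out = new PipedOutputStream(in);

        // Producer: closing the ZipOutputStream also closes the pipe's write
        // end, which is what signals end-of-stream to the reader.
        CompletableFuture<Void> writer = CompletableFuture.runAsync(() -> {
            try (ZipOutputStream zip = new ZipOutputStream(out)) {
                for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                    zip.putNextEntry(new ZipEntry(e.getKey()));
                    zip.write(e.getValue());
                    zip.closeEntry();
                }
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        });

        // Consumer: reads until the producer closes its end of the pipe.
        try (InputStream pipe = in) {
            T result = consumer.apply(pipe);
            writer.join(); // surface any failure from the producer thread
            return result;
        }
    }

    /** Stand-in for the upload (assumption): drains the stream into a byte array. */
    static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
            return sink.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("a.txt", "hello".getBytes(StandardCharsets.UTF_8));
        entries.put("b.txt", "world".getBytes(StandardCharsets.UTF_8));
        byte[] zipped = zipThroughPipe(entries, PipedZipSketch::drain);
        System.out.println("received " + zipped.length + " bytes of zip data");
    }
}
```

In the setup described above, the consumer would hand the `InputStream` to the S3TransferManager upload from the linked AWS example instead of draining it locally. The producer and consumer need to run on different threads, since a pipe read and written from a single thread can deadlock once its buffer fills.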

So perhaps it makes sense to keep this library on AWS SDK v1 and point anyone using v2 to the AWS S3TransferManager.

alugowski commented 1 month ago

The example link is now: https://docs.aws.amazon.com/AmazonS3/latest/API/s3_example_s3_Scenario_UploadStream_section.html