axman6 / amazonka-s3-streaming

Provides a conduit based interface to uploading data to S3 using the Multipart API
MIT License

Request to support amazonka 2.0 #27

Closed jhrcek closed 1 year ago

jhrcek commented 2 years ago

Hello @axman6, thank you very much for your work on this library, I'm one of its happy users. The amazonka 2.0 release candidate is out (https://github.com/brendanhay/amazonka/discussions/716). I'd like to upgrade to it, but that will require this library to migrate too. Are you planning to update this library to support it?

If you're too busy, I could try opening a PR (not sure yet how much effort it would be).

ghost commented 2 years ago

Hey @jhrcek, thanks for getting in touch. Yes, I do plan to make an update for Amazonka 2.0 when I have some time. The first thing that I would like to happen is to get #25 merged and released (probably as v2.0 of this library), and then port that to Amazonka 2.0 as v3.0 of this library. If you are using the library, I would greatly appreciate any input you have on that PR - @endgame has done a great job, but neither of us is actually using the package in anger.

endgame commented 2 years ago

That sounds like a good plan to me. @-me when you have the amazonka-1.x->amazonka-2.0 PR up for review?

jhrcek commented 2 years ago

Alright, I'll test out https://github.com/axman6/amazonka-s3-streaming/pull/25 with our application and let you know if things are working for us. I expect to get to it today or tomorrow.

jhrcek commented 2 years ago

As promised, I tested 1.2.0.0 and found some issues. Let me know if you'd like some more help with tracking down the bug.

bitc commented 2 years ago

For anyone who stumbles upon this issue, it appears that the bug that @jhrcek encountered was corrupt uploads (described here: https://github.com/axman6/amazonka-s3-streaming/pull/25#pullrequestreview-820127399), and it appears that it has been already fixed here: https://github.com/axman6/amazonka-s3-streaming/commit/aa9f3f357eb4d92650b136cd2a18001156353142

jhrcek commented 2 years ago

The support for amazonka 2.0 has been implemented in https://github.com/axman6/amazonka-s3-streaming/pull/28

@axman6 are you planning to release this on Hackage (probably after amazonka 2.0 is released)? I'm not in a hurry, I just want to get a rough idea of what to expect.

axman6 commented 2 years ago

Correct, I won’t be releasing to Hackage until Amazonka 2.0 is out, which hopefully isn’t too far away (@endgame is doing an amazing job with @brendanhay on pushing towards its release). If you are able to test master against the latest Amazonka main branch, I would appreciate any feedback. I have one change I will probably make internally, but it should be basically ready.

jhrcek commented 2 years ago

Great. I already migrated our app to amazonka 2.0 (the latest master) + this repo (also master) and tested it by uploading a bunch of files of different sizes. So far everything seems to be working fine, though I must say I only use this library in a very rudimentary way:

streamUploadFileImpl ::
    forall m env.
    ( WithS3 env m
    , WithLog env m
    ) =>
    S3Bucket ->
    S3Key ->
    FilePath ->
    m ()
streamUploadFileImpl (S3Bucket bucket) (S3Key key) filePath = do
    awsInfo <- grab @AwsInfo
    result <-
        runConduitRes $
            sourceFile filePath
                .| streamUpload
                    (awsEnv awsInfo)
                    Nothing
                    (newCreateMultipartUpload (BucketName bucket) (ObjectKey key))
    case result of
        Left (_abortMultipartUploadResponse, someException) ->
            logW $ "Failed to upload file " <> toText filePath <> " to " <> bucket <> " and key " <> key <> " due to " <> toText (displayException someException)
        Right _completeMultipartUploadResponse ->
            pure ()
axman6 commented 2 years ago

Can you try using the concurrentUpload function instead? Even if you decide to go with the lower resource usage of the streaming version, it would be good to validate it in someone else’s code. Please make sure you fetch some files and check their hashes - I would hate to lose people’s data!

jhrcek commented 2 years ago

Sure. I modified the code as in the following snippet and tried to upload a bunch of zip archives containing images, ranging in size from 1 to 100 MB. I could unzip the files and view the images after re-downloading them. I also checked the sha256sum of the files and they didn't change. So I'm pretty confident that concurrentUpload is not screwing up the content. Nice work :+1:

streamUploadFileImpl ::
    forall m env.
    ( WithS3 env m
    ) =>
    S3Bucket ->
    S3Key ->
    FilePath ->
    m ()
streamUploadFileImpl (S3Bucket bucket) (S3Key key) filePath = do
    awsInfo <- grab @AwsInfo
    let numThreads = Just 4
    void
        . liftIO
        . runResourceT
        $ concurrentUpload
            (awsEnv awsInfo)
            Nothing
            numThreads
            (FP filePath)
            (newCreateMultipartUpload (BucketName bucket) (ObjectKey key))
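
For anyone doing a similar round-trip check: besides comparing sha256sum output, a minimal in-Haskell byte-for-byte comparison can work too. This is just a sketch using only the bytestring package; the helper name is made up here, not part of the library:

```haskell
import qualified Data.ByteString.Lazy as BL

-- Hypothetical helper (not part of amazonka-s3-streaming): compare the
-- original file with its re-downloaded copy byte for byte. Lazy
-- ByteStrings keep memory use bounded even for large archives.
filesMatch :: FilePath -> FilePath -> IO Bool
filesMatch original downloaded =
    (==) <$> BL.readFile original <*> BL.readFile downloaded
```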
axman6 commented 2 years ago

Fantastic, thanks for confirming. We had some fun yesterday trying to figure out just how fast things could be sent, but didn’t end up with anything really worth sharing.

axman6 commented 1 year ago

@jhrcek Are you happy to close this issue? I think everything is sorted now?

jhrcek commented 1 year ago

Sure, let's close it. I'm fine as long as you're ok with releasing the lib after amazonka 2.0.0 is released on hackage :smiley:

jhrcek commented 1 year ago

And thanks for all your work on this!

jhrcek commented 1 year ago

Hello @axman6 friendly reminder: could you please release new version with 2.0.0 support to hackage? :pray:

axman6 commented 1 year ago

Thanks @jhrcek yeah I had meant to get to that a few weeks ago, but #life got in the way. I've just uploaded a release candidate, but I probably need to sort out the dependency bounds, and I don't really know how to do this properly (@endgame halp?).

endgame commented 1 year ago

Try amazonka >=2.0 && <2.1 or amazonka ^>=2.0. amazonka changed too much to try supporting 1.6.1 and 2.0 in the same source release, unless you really like CPP.

Use the same bounds for amazonka-s3 and also for amazonka-core if you depend on it. (You shouldn't need to depend on amazonka-core directly. If you do, I'd like to know why — there's a chance I'm not re-exporting something that should be.)
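
In the .cabal file's build-depends, the advice above could look like the following (a sketch; the surrounding stanza and any other dependencies are omitted):

```cabal
  build-depends:
      amazonka    ^>=2.0
    , amazonka-s3 ^>=2.0
```

The caret bound `^>=2.0` is shorthand for `>=2.0 && <2.1`, i.e. the PVP-compatible major series.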

jhrcek commented 1 year ago

As a general principle, before publishing a new version to Hackage, I'd also recommend running cabal update and then cabal outdated to check that the upper bounds don't unnecessarily exclude the latest versions of some dependencies.

For this package:

$ cabal outdated
Outdated dependencies:
bytestring >=0.10.8.0 && <0.12 (latest: 0.12.0.2)

though I didn't check how much work it would be to actually support bytestring <0.13.

axman6 commented 1 year ago

Ok, I have managed to update the bounds to maintain building back to GHC 8.10.7 - @jhrcek Can you try the latest candidate from https://hackage.haskell.org/package/amazonka-s3-streaming-2.0.0.0/candidate? If it's good to go, I'll publish it, and I guess I should announce it somewhere...

jhrcek commented 1 year ago

Thanks for uploading the candidate! I tried to switch to it, but couldn't figure out how to depend on a candidate package from Hackage in a stack project.

But I tested it by using the current master of your repo in stack.yaml:

extra-deps:
    - url: https://github.com/axman6/amazonka-s3-streaming/archive/12f8f6cf31f1fe995608393805ecd6ff2b7d8498.tar.gz

I have only one quibble, a too-restrictive lower bound for resourcet (see this PR: https://github.com/axman6/amazonka-s3-streaming/pull/40). With that change our project builds and everything works. Could you please include that change in the release?

axman6 commented 1 year ago

Done, and a new candidate uploaded. If you have time, can you do a quick smoke test with your stack build? If that's all good, I'll publish. It's just a .0 release, so people should expect brokenness 🙃

axman6 commented 1 year ago

(Decided to reopen this until we actually have a release)

jhrcek commented 1 year ago

Thanks! Works for me. Feel free to promote the candidate and close this issue :bow:

axman6 commented 1 year ago

I'll publish this, but I think the docs could probably do with some love too... I'm also not super happy with the return type for streamUpload, just because it makes the docs so long. I can live with it for now, but maybe in a 2.1 release I'll add a data type representing the possible results.
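
A hypothetical shape for such a result type (names invented here, not part of the released API; a real version would instantiate the type parameters with the amazonka-s3 response and exception types):

```haskell
-- Hypothetical sketch, not the released API: a dedicated result type
-- for streamUpload instead of a nested Either. Parameterised over the
-- response and error types so this compiles without amazonka in scope.
data UploadResult abortRs err completeRs
    = UploadAborted abortRs err   -- upload failed and was aborted
    | UploadCompleted completeRs  -- multipart upload finished
```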

axman6 commented 1 year ago

2.0.0.0 published!

axman6 commented 1 year ago

(Pinging @endgame, @jhrcek , @domenkozar, @koterpillar, @mpickering, @roberth as you've all helped get this here, and I know some of you use this for actual real systems [scary] - thanks again to you all, it's been a long time coming).