esmero / strawberryfield

A Field of strawberries
GNU Lesser General Public License v3.0

Code around the 5 GB S3 Put limit currently present in the S3FS (and any other module) copy/put implementation #286

Closed. DiegoPino closed this issue 8 months ago

DiegoPino commented 9 months ago

What?

This is new to me, and it is urgent. The S3FS s3:// wrapper mostly inherits the main AWS S3 stream wrapper implementation, which has an issue (a bug, in my opinion) affecting PutObject operations and copy operations between buckets: https://github.com/aws/aws-sdk-php/issues/2207. It was partially fixed for the rename method in https://github.com/aws/aws-sdk-php/pull/763

The S3FS team filed this as "let's get back to it" IF someone had the need for it (3 years ago), leaving only a tiny documentation entry (really tiny, and it does not say that "copy" is also affected).

This has larger repercussions for users ingesting AMI sets whose files are larger than 5 GB and were already uploaded, but to a temporary location. It also affects users who have local files larger than 5 GB.

There are 3 possible solutions for this:

Any ideas? This is important @alliomeria

dmer commented 9 months ago

We've encountered this problem on the California project with some video files over 5 GB. I'm happy to have an explanation of what the problem is! Too bad it doesn't look like the S3FS team is going to fix it.

We'll be happy to test any solutions that come out of this - I asked Pat and he's expressed a mild preference for getting s3fs to accept a PR. I don't think this is high enough on the issues/priorities for that project for us to get to spend any dev time on it, but I'll keep an eye out for updates.

DiegoPino commented 9 months ago

Ok. I have a solution. Code coming soon

DiegoPino commented 9 months ago

Small update. The solution works flawlessly! Next up: some config settings to let multipart kick in sooner if you want it to, plus more testing and some docs. Adios 5 GB limit.
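For readers following along, here is a rough, language-neutral sketch (in Python, not the module's actual PHP code) of the decision described above: use a single put/copy request below a configurable threshold and switch to multipart above it. The function name `plan_transfer`, its parameters, and the 64 MiB default part size are hypothetical and chosen only for illustration. The limits used are the ones Amazon S3 documents (5 GB per single put or copy request, parts between 5 MB and 5 GB, at most 10,000 parts), treated here as binary units, which is an assumption.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

# Documented Amazon S3 limits, interpreted as binary units (assumption).
SINGLE_REQUEST_MAX = 5 * GIB  # largest object one put or copy request accepts
MIN_PART_SIZE = 5 * MIB       # smallest multipart part (the last part may be smaller)
MAX_PART_SIZE = 5 * GIB       # largest multipart part
MAX_PARTS = 10_000            # maximum number of parts per multipart upload


def plan_transfer(size_bytes, multipart_threshold=SINGLE_REQUEST_MAX,
                  part_size=64 * MIB):
    """Hypothetical helper: decide how an object of size_bytes should be sent.

    Returns ("single", 1, size_bytes) or ("multipart", part_count, part_size).
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part_size is outside the allowed part size range")

    # A lower threshold lets multipart kick in sooner, but the threshold
    # can never exceed what a single request is able to carry.
    threshold = min(multipart_threshold, SINGLE_REQUEST_MAX)
    if size_bytes <= threshold:
        return ("single", 1, size_bytes)

    # Grow the part size when needed so the part count stays within 10,000.
    part_size = max(part_size, math.ceil(size_bytes / MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for a multipart upload")
    return ("multipart", math.ceil(size_bytes / part_size), part_size)


if __name__ == "__main__":
    for size in (1 * GIB, 5 * GIB, 6 * GIB):
        print(size, plan_transfer(size))
    # Lowered threshold: multipart is used well before the 5 GB limit.
    print(plan_transfer(1 * GIB, multipart_threshold=100 * MIB))
```

With the defaults, a 6 GiB file is split into 96 parts of 64 MiB, while a 5 GiB file still goes through as a single request. Lowering `multipart_threshold` only changes when the switch happens, not the part limits themselves.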

DiegoPino commented 8 months ago

Solved via https://github.com/esmero/strawberryfield/commit/6a5cde3e3e7f4a180d53d14b56e7a3e5d88ccac5 and https://github.com/esmero/strawberryfield/commit/6e3824cc9ad547692eb134a0764d6ebc4c35b896