terricain / aioboto3

Wrapper to use boto3 resources with the aiobotocore async backend
Apache License 2.0

upload_fileobj is always multipart #333

Closed plotlogic-andrew closed 1 month ago

plotlogic-andrew commented 2 months ago

Description

upload_fileobj always does multipart uploads. This results in:

What I Did

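    import asyncio

    import aioboto3
    import boto3

    file = 'test.txt'
    # Create an empty file, far smaller than any multipart threshold.
    with open(file, "wb") as fp:
        pass

    bucket, key = ...

    # Synchronous upload with boto3, for comparison.
    s3_client = boto3.client('s3')
    s3_client.upload_file(file, bucket, key + '/test-sync.txt')

    async def main():
        # The same upload through aioboto3.
        async with aioboto3.Session().client('s3') as s3_client:
            await s3_client.upload_file(file, bucket, key + '/test-async.txt')

    asyncio.run(main())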

The upload_fileobj code makes reference to multipart_chunksize but doesn't use it.

terricain commented 2 months ago

Yeah, I've not looked at the S3 transfer code in years, but I'm sure it should not multipart if the file is small enough. Will look at fixing it at some point.

plotlogic-andrew commented 2 months ago

I'd call it a feature request more than a bug :-).

I'm happy to contribute, but after a quick look the code for upload_fileobj coming from botocore seemed opaque at best to me. I'll look again when I have more time.

And thanks for putting this together! I've always been bothered that boto3 defaults to 10 threads to keep an upload pipe full.

terricain commented 1 month ago

s3.upload_fileobj now respects Config.multipart_threshold and will issue a single s3.put_object call if the file is below the threshold.
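For anyone reading later, here is a rough sketch of the kind of threshold check described above. It is not aioboto3's actual implementation: `FakeS3Client` and `upload_fileobj_sketch` are hypothetical names made up for illustration, and the fake client only records which S3 operations would have been issued, so the example runs without AWS credentials. The 8 MB defaults are assumed to mirror boto3's TransferConfig defaults.

```python
import asyncio
import io

MB = 1024 * 1024


class FakeS3Client:
    """Hypothetical in-memory stand-in that records which S3 calls are made."""

    def __init__(self):
        self.calls = []

    async def put_object(self, Bucket, Key, Body):
        self.calls.append("put_object")

    async def create_multipart_upload(self, Bucket, Key):
        self.calls.append("create_multipart_upload")
        return {"UploadId": "fake-upload-id"}

    async def upload_part(self, Bucket, Key, UploadId, PartNumber, Body):
        self.calls.append("upload_part")
        return {"ETag": f"etag-{PartNumber}"}

    async def complete_multipart_upload(self, Bucket, Key, UploadId, MultipartUpload):
        self.calls.append("complete_multipart_upload")


async def upload_fileobj_sketch(client, fileobj, bucket, key,
                                threshold=8 * MB, chunksize=8 * MB):
    """Use one put_object below the threshold, a multipart upload otherwise."""
    # Read up to `threshold` bytes; a short read means the stream is small.
    first = fileobj.read(threshold)
    if len(first) < threshold:
        await client.put_object(Bucket=bucket, Key=key, Body=first)
        return

    # At or above the threshold: fall back to a multipart upload.
    upload = await client.create_multipart_upload(Bucket=bucket, Key=key)
    parts = []
    part_number = 1
    chunk = first
    while chunk:
        resp = await client.upload_part(
            Bucket=bucket, Key=key, UploadId=upload["UploadId"],
            PartNumber=part_number, Body=chunk,
        )
        parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        part_number += 1
        chunk = fileobj.read(chunksize)
    await client.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload["UploadId"],
        MultipartUpload={"Parts": parts},
    )


if __name__ == "__main__":
    client = FakeS3Client()
    # An empty stream, like the empty test.txt in the report above.
    asyncio.run(upload_fileobj_sketch(client, io.BytesIO(b""), "bucket", "key"))
    print(client.calls)
```

With the real library, the threshold would presumably be set through the Config argument, using `boto3.s3.transfer.TransferConfig(multipart_threshold=...)`.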