terricain / aioboto3

Wrapper to use boto3 resources with the aiobotocore async backend
Apache License 2.0

Improve upload_fileobj performance #299

Closed (JohnHBrock closed this 1 year ago)

JohnHBrock commented 1 year ago

Changing the file reader buffer from bytes to bytearray significantly reduces CPU usage. bytes is immutable, so each append allocates a new object and copies everything accumulated so far. This is the classic string-building problem: repeatedly appending to an immutable sequence costs O(n^2) time in the total length.

From the Python docs:

if concatenating bytes objects, you can similarly use bytes.join() or io.BytesIO, or you can do in-place concatenation with a bytearray object. bytearray objects are mutable and have an efficient overallocation mechanism

I tried io.BytesIO too, but bytearray has slightly better performance in my testing.
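
For readers who want to see the difference, here is a minimal sketch of the three accumulation strategies mentioned above. It is not the aioboto3 code from this PR: `read_chunks` and the `build_with_*` helpers are hypothetical names, and the chunk size and count are arbitrary assumptions. Timings will vary by machine and Python version, so none are shown.

```python
import io
import timeit


def read_chunks(count, chunk_size=1024):
    """Hypothetical stand-in for chunks read from a file object."""
    chunk = b"x" * chunk_size
    for _ in range(count):
        yield chunk


def build_with_bytes(chunks):
    buf = b""
    for chunk in chunks:
        # bytes is immutable: this builds a new object and copies
        # everything accumulated so far on every iteration.
        buf += chunk
    return buf


def build_with_bytearray(chunks):
    buf = bytearray()
    for chunk in chunks:
        # bytearray is mutable: this extends the buffer in place.
        buf += chunk
    return bytes(buf)


def build_with_bytesio(chunks):
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()


if __name__ == "__main__":
    # Illustrative workload only; adjust the chunk count to taste.
    for build in (build_with_bytes, build_with_bytearray, build_with_bytesio):
        seconds = timeit.timeit(lambda: build(read_chunks(20_000)), number=1)
        print(f"{build.__name__}: {seconds:.3f}s")
```

All three functions return the same bytes; only the cost of getting there differs, and the gap grows with the amount of data accumulated.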

terricain commented 1 year ago

Looks good, makes total sense. I'm away at KubeCon at the moment, but assuming the tests pass, I'll look at getting this merged when I'm back.

JohnHBrock commented 1 year ago

Hi, just a reminder to review this when you get a chance. Thanks!

terricain commented 1 year ago

Sorry for the delay. Yep, looks good.