awslabs / aws-c-s3

C99 library implementation for communicating with the S3 service, designed for maximizing throughput on high bandwidth EC2 instances.
Apache License 2.0

Mem limiter affecting get throughput in corner cases #425

Open DmitriyMusatkin opened 4 months ago

DmitriyMusatkin commented 4 months ago

Describe the bug

The mem limiter provides a push-back mechanism on the scheduler when memory usage is getting close to the limit.

With gets there is a chicken-and-egg problem: we don't know the size of the object before doing a get, and we want to avoid making an additional request to figure out that size first (the additional round trips tank get perf). So the CRT optimistically does a ranged get of one part size to fetch the first part and figure out the overall size.
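To make the optimistic first request concrete, here is a minimal sketch in terms of plain HTTP range semantics. The helper names `format_first_range` and `parse_content_range_total` are hypothetical and are not part of the aws-c-s3 API; the sketch only assumes that the first request asks for `bytes=0-<part_size - 1>` and that the response carries a `Content-Range` header of the form `bytes <start>-<end>/<total>`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not an aws-c-s3 function): writes the Range header
 * value for the first part, e.g. "bytes=0-1073741823" for a 1 GiB part.
 * Returns the snprintf result, or -1 if part_size is 0. */
static int format_first_range(char *buf, size_t buf_len, uint64_t part_size) {
    assert(buf != NULL);
    if (part_size == 0) {
        return -1;
    }
    return snprintf(buf, buf_len, "bytes=0-%" PRIu64, part_size - 1);
}

/* Hypothetical helper: extracts the total object size from a Content-Range
 * value such as "bytes 0-1048575/1048576". Returns 0 on success and -1 if
 * the value does not match (for example when the total is "*"). */
static int parse_content_range_total(const char *value, uint64_t *out_total) {
    uint64_t start = 0;
    uint64_t end = 0;
    uint64_t total = 0;
    assert(value != NULL && out_total != NULL);
    if (sscanf(value, "bytes %" SCNu64 "-%" SCNu64 "/%" SCNu64, &start, &end, &total) != 3) {
        return -1;
    }
    *out_total = total;
    return 0;
}
```

With a 1 GiB part size and a 1 MiB object, the request would ask for up to 1 GiB, but the `Content-Range` total reveals that only 1 MiB exists. That information arrives only after the request has already been scheduled.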

This approach works fine in most cases, but it unnecessarily slows down gets when the part size is huge and the gets themselves are small. For example, the part size is 1 GB and the files being retrieved are 1 MB. The mem limiter in that case can only schedule 4 gets in parallel (assuming a 4 GB mem limit), since it accounts for the worst case of getting back a 1 GB part. But in practice we should be able to schedule a lot more gets in parallel, because they are all small.
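The arithmetic behind that example can be sketched as follows. This is an illustration only, not the mem limiter's actual accounting: `max_parallel_gets` is a hypothetical helper that divides the limit by the per-get reservation, and the units are assumed to be binary (GiB/MiB).

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)
#define GIB (1024ULL * MIB)

/* Hypothetical helper (not an aws-c-s3 function): how many gets fit under
 * the limit if each one reserves reserve_per_get_bytes up front. */
static uint64_t max_parallel_gets(uint64_t mem_limit_bytes, uint64_t reserve_per_get_bytes) {
    assert(reserve_per_get_bytes > 0);
    return mem_limit_bytes / reserve_per_get_bytes;
}

/* Worst-case accounting: every get reserves a full part. */
static uint64_t parallel_gets_worst_case(uint64_t mem_limit_bytes, uint64_t part_size_bytes) {
    return max_parallel_gets(mem_limit_bytes, part_size_bytes);
}

/* What would fit if the real object size were known in advance. */
static uint64_t parallel_gets_if_size_known(uint64_t mem_limit_bytes, uint64_t object_size_bytes) {
    return max_parallel_gets(mem_limit_bytes, object_size_bytes);
}
```

Under these assumptions, `parallel_gets_worst_case(4 * GIB, 1 * GIB)` gives 4, while `parallel_gets_if_size_known(4 * GIB, 1 * MIB)` gives 4096, which is the gap described above.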

Refer to https://github.com/aws/aws-sdk-cpp/issues/2922 for an example of this in the wild.

Expected Behavior

something better?

Current Behavior

Downloads slow to a crawl on lots of small gets if the part size is huge.

Reproduction Steps

Set the part size to 1 GB and observe downloads of 10k 256 KB files.

Possible Solution

No response

Additional Information/Context

No response

aws-c-s3 version used

latest

Compiler and version used

every compiler

Operating System and version

every os