aws / aws-sdk-cpp

AWS SDK for C++
Apache License 2.0

SIGABRT in Aws::S3Crt::Model::GetObjectRequest when partSize is configured to 5GB #3125

Closed rohansuri closed 10 hours ago

rohansuri commented 2 weeks ago

Describe the bug

I want to turn off S3CrtClient's default behaviour of doing multi-range GETs and multi-part PUTs, so I set Aws::S3Crt::ClientConfiguration.partSize to 5GB. That way a multi-part PUT should happen only for objects larger than 5GB (5GB being the size limit of a single PUT call); for objects smaller than 5GB I want only a single PUT/GET.

However, my application is crashing. As per the stack trace, the client is automatically doing a ranged GET of 5GB and trying to allocate a buffer of that size, which fails an assertion because the buffer pool limit is only 2GB, resulting in SIGABRT.

Here's the stack trace:

Fatal error condition occurred in /Users/rohan/cb/aws-sdk-cpp/crt/aws-crt-cpp/crt/aws-c-s3/source/s3_buffer_pool.c:258: size <= buffer_pool->mem_limit
Exiting Application
################################################################################
Stack trace:
################################################################################
1  libaws-c-common.1.0.0.dylib 0x000000010189dcec aws_backtrace_print + 200
2  libaws-c-common.1.0.0.dylib 0x0000000101860318 aws_fatal_assert + 104
3  libaws-c-s3.1.0.0.dylib     0x00000001011de210 aws_s3_buffer_pool_reserve + 204
4  libaws-c-s3.1.0.0.dylib     0x00000001011d59f8 s_s3_auto_ranged_get_update + 1360
5  libaws-c-s3.1.0.0.dylib     0x00000001011f4ba4 aws_s3_meta_request_update + 220
6  libaws-c-s3.1.0.0.dylib     0x00000001011e660c aws_s3_client_update_meta_requests_threaded + 348
7  libaws-c-s3.1.0.0.dylib     0x00000001011e9b08 s_s3_client_process_work_default + 1312
8  libaws-c-s3.1.0.0.dylib     0x00000001011ea5d0 s_s3_client_process_work_task + 324
9  libaws-c-common.1.0.0.dylib 0x00000001018a6760 aws_task_run + 308
10 libaws-c-common.1.0.0.dylib 0x00000001018a6e7c s_run_all + 512
11 libaws-c-common.1.0.0.dylib 0x00000001018a7ac8 aws_task_scheduler_run_all + 88
12 libaws-c-io.1.0.0.dylib     0x00000001017dd434 aws_event_loop_thread + 1676
13 libaws-c-common.1.0.0.dylib 0x000000010189ef6c thread_fn + 532
14 libsystem_pthread.dylib     0x0000000197c66f94 _pthread_start + 136
15 libsystem_pthread.dylib     0x0000000197c61d34 thread_start + 8

Using lldb to print the meta_request->part_size:

(lldb) f 5
frame #5: 0x0000000100da99f8 libaws-c-s3.0unstable.dylib`s_s3_auto_ranged_get_update(meta_request=0x0000000142b671f0, flags=2, out_request=0x000000016ff05ae8) at s3_auto_ranged_get.c:301:38
   298          even if expect to receive less data. Pool will
   299          reserve the whole part size for it anyways, so no
   300          reason getting a smaller chunk. */
-> 301      ticket = aws_s3_buffer_pool_reserve(
   302          meta_request->client->buffer_pool, (size_t)meta_request->part_size);
   303
   304      if (ticket == NULL) {
(lldb) p meta_request->part_size
(const size_t) 5368709120

The buffer pool limit being:

(lldb) f 4
frame #4: 0x0000000100db2210 libaws-c-s3.0unstable.dylib`aws_s3_buffer_pool_reserve(buffer_pool=0x00006000009e4a90, size=5368709120) at s3_buffer_pool.c:258:5
   255  }
   256
   257      AWS_FATAL_ASSERT(size != 0);
-> 258      AWS_FATAL_ASSERT(size <= buffer_pool->mem_limit);
   259
   260      struct aws_s3_buffer_pool_ticket *ticket = NULL;
   261      aws_mutex_lock(&buffer_pool->mutex);
(lldb) p buffer_pool->mem_limit
(size_t) 2013265920

Expected Behavior

No crash should happen.

Current Behavior

Application crashes.

Reproduction Steps

#include <aws/core/Aws.h>
#include <aws/s3-crt/S3CrtClient.h>
#include <aws/s3-crt/model/GetObjectRequest.h>
#include <memory>

Aws::S3Crt::ClientConfiguration config;
config.partSize = 5ULL * 1024 * 1024 * 1024; // 5GB
auto client = std::make_unique<Aws::S3Crt::S3CrtClient>(config);
Aws::S3Crt::Model::GetObjectRequest request;
request.SetBucket(bucketName);
request.SetKey(key);
auto outcome = client->GetObject(request); // CRASH

Possible Solution

No response

Additional Information/Context

I also tried setting config.downloadMemoryUsageWindow = 100 * 1024 * 1024; but it has no effect and I still see the crash.

What I am looking for is for the client not to do ranged GETs for objects smaller than 5GB, and similarly not to do multi-part PUTs for objects smaller than 5GB.

AWS CPP SDK version used

1.11.411

Compiler and Version used

Apple clang version 15.0.0 (clang-1500.3.9.4)

Operating System and version

macOS 14.4.1

rohansuri commented 2 weeks ago

As an aside, is it possible to turn off the auto_ranged_get behaviour in S3CrtClient and do a single GET?

DmitriyMusatkin commented 2 weeks ago

What you are looking for is multipart_upload_threshold, but unfortunately it has not been exposed on the C++ SDK side yet. We can use this ticket to track exposing it in the CRT client config.

CRT uses part size as a hint for the buffer-pooling allocator, which is used to avoid allocating memory over and over again for buffers. By setting part size to 5GB you are blowing past the default buffer pool budget of 2GB, and it looks like we are missing a sanity check somewhere to error out in this case. You can increase the buffer budget by setting memory_limit_in_bytes, but that will use up a lot of memory if you are setting part sizes to 5GB. In general, setting part size is not a recommended approach to control whether splitting happens or not.
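For illustration, a minimal sketch of that mitigation, assuming the CRT-level memory_limit_in_bytes setting is surfaced on the S3Crt configuration (referred to later in this thread as memoryLimitBytes; the exact field name may differ by SDK version):

Aws::S3Crt::ClientConfiguration config;
config.partSize = 5ULL * 1024 * 1024 * 1024;         // 5GB parts, the setting that trips the assert
config.memoryLimitBytes = 6ULL * 1024 * 1024 * 1024; // assumed field name; pool budget must exceed one part
Aws::S3Crt::S3CrtClient client(config);

As noted above, with 5GB parts this budget keeps several gigabytes resident, so it is a workaround rather than a recommendation.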

On a side note, why do you want to disable MPU behavior? Using regular low level s3 client in this case might be an alternative depending on what you want to achieve.

rohansuri commented 2 weeks ago

Thank you @DmitriyMusatkin for looking into this.

On a side note, why do you want to disable MPU behavior? Using regular low level s3 client in this case might be an alternative depending on what you want to achieve.

I want to save on the number of PUT/GET API calls, as every call has a cost associated with it. I know S3CrtClient is built to maximise throughput by doing MPU, but I'm ok to sacrifice that. I was thinking of still preferring S3CrtClient over S3Client because:

Thank you for creating a ticket to expose the multipart_upload_threshold config. Once exposed, setting that config alone should be enough to avoid S3CrtClient automatically doing MPU and ranged GETs, is that right?

Also, just to confirm: once I've set the multipart_upload_threshold to 5GB, for a PUT request I hope the client will not need to bring the entire request body into memory (consuming 5GB) in order to make the request? Rather, it will happen in a streaming fashion based on whatever buffer size is chosen by the client?

As an additional request: it seems the low-level s3_client in aws-c-s3 also supports specifying the partSize on a per-request basis? Would it be possible to expose that as well through the upper layers? The use case could be that for certain kinds of objects in an S3 bucket one cares about throughput, but for others one doesn't.

rohansuri commented 1 week ago

@sbiscigl I do have this issue as well. Appreciate your help!

sbiscigl commented 1 week ago

Thank you for creating a ticket to expose the multipart_upload_threshold config. Once exposed, setting that config alone should be enough to avoid S3CrtClient automatically doing MPU and ranged GETs, is that right?

MPUs yes; ranged GETs no. That configuration is exposed now and will be tagged later. Researching the other questions.
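For reference, a rough sketch of what using it might look like once tagged, assuming the option surfaces on the S3Crt configuration under a name like multipartUploadThreshold (a hypothetical spelling; check the tagged release for the final name):

Aws::S3Crt::ClientConfiguration config;
config.multipartUploadThreshold = 5ULL * 1024 * 1024 * 1024; // hypothetical field name: objects below 5GB upload as a single PUT
Aws::S3Crt::S3CrtClient client(config);
// Ranged GETs are not affected by this threshold, per the comment above.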

DmitriyMusatkin commented 1 week ago

I don't think that use case is something that is supported very well in CRT right now. CRT is built around the idea of having a bunch of data read into buffers and worked on in parallel. All data needs to be in the buffer before it can be sent, and CRT does not currently stream data from the source directly to S3 (there are various reasons for this, and it is something that can be improved). So in the case of a single-chunk upload, CRT will end up having to load the whole chunk into the buffer and then dispatch it to a single connection that sends it to S3. So unfortunately, with a 5GB threshold it will attempt to load all 5GB into memory. It's not the primary use case CRT was designed for, but something I think we should improve. I don't think there are any settings you can set to mitigate this in a graceful way. We probably should move this issue to aws-c-s3, as there is not much the cpp sdk can do here.

rohansuri commented 1 week ago

Thank you both for your detailed replies.

CRT is built around the idea of having a bunch of data read into buffers and worked on in parallel. All data needs to be in the buffer before it can be sent, and CRT does not currently stream data from the source directly to S3.

Understood. So it seems that even after multipart_upload_threshold is exposed, memoryLimitBytes or downloadMemoryUsageWindow won't have any effect, and we will still need a buffer of size 5GB in the worst case.

@DmitriyMusatkin I'd just like to confirm the same with S3Client (and not S3CrtClient), as I plan to switch to using it:

  1. it will never do MPU or range gets by itself
  2. it supports doing large PUTs/GETs (let's say of size 5GB) without having to bring all of the data into memory?
  3. is ensured to always issue a single PUT/GET on S3 no matter the size of the object?
  4. is there a ClientConfiguration option available to set the buffer size to aid in this streaming?

Additionally, a suggestion: document these differences between S3CrtClient and S3Client, in terms of their buffering requirements, in the README.

DmitriyMusatkin commented 1 week ago

Yes, multipart_upload_threshold will allow you to force CRT to do only one request, but it will still need to allocate memory for the whole request. The windowing settings affect how many requests CRT runs in parallel so as not to overwhelm the consumer, but they will not allow you to force a download to complete in one API call.

  1. The base low-level S3 client does not break up requests under the covers; it does one HTTP request per API call.
  2. You can implement your own stream buf and stream the data as it is received to wherever it needs to go, but I don't think there is anything out of the box for that (@sbiscigl correct me if I'm wrong here) — see the sketch after this list.
  3. Yes.
  4. I don't think that's exposed. curl will have its own default on top of the OS buffer size here.
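A minimal sketch of the streaming approach from point 2 with the base S3Client: stream a GET straight to a file via a response stream factory, and stream a PUT from a file via SetBody, so a 5GB body never has to sit in memory. Bucket, key, and file names are placeholders.

#include <aws/core/Aws.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/GetObjectRequest.h>
#include <aws/s3/model/PutObjectRequest.h>

void transferWithoutBuffering(const Aws::S3::S3Client& s3,
                              const Aws::String& bucket, const Aws::String& key) {
    // GET: write the body directly to a file instead of the default in-memory stream.
    Aws::S3::Model::GetObjectRequest get;
    get.SetBucket(bucket);
    get.SetKey(key);
    get.SetResponseStreamFactory([]() {
        return Aws::New<Aws::FStream>("download", "object.bin",
                                      std::ios_base::out | std::ios_base::binary);
    });
    auto getOutcome = s3.GetObject(get); // one GET request, body streamed to object.bin

    // PUT: stream the body from a file; the SDK reads from the stream as it sends.
    Aws::S3::Model::PutObjectRequest put;
    put.SetBucket(bucket);
    put.SetKey(key);
    put.SetBody(Aws::MakeShared<Aws::FStream>("upload", "object.bin",
                                              std::ios_base::in | std::ios_base::binary));
    auto putOutcome = s3.PutObject(put); // one PUT request, no MPU
}

A custom streambuf can be plugged in the same way, since both hooks ultimately take an Aws::IOStream.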

jmklix commented 5 days ago

I noticed you also opened another issue about using the non-CRT S3 client. We are still working on fixing that, but I wanted to ask if you had any other questions related to S3Crt?

rohansuri commented 5 days ago

@jmklix I'm satisfied with the details @DmitriyMusatkin shared and have decided to use the S3Client.

Although I had one more question on S3CrtClient: if I have a 5GB object and partSize is the default of 8MB, how many concurrent in-memory upload/download buffers (or chunks) will I have? I'm trying to determine the peak memory consumption of S3Crt.

DmitriyMusatkin commented 5 days ago

CRT pools buffers and has an overall limit on memory consumption that is determined from the target throughput and is configurable using the memory-limit-in-bytes setting. For any target throughput < 25 Gbps, the memory limit is set to 2GB, and it maxes out at 8GB for anything over 75 Gbps.
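As a rough sketch of what that means for the 5GB object / 8MB part question above (upper bounds only; the pool reserves whole parts, and the limit observed in the lldb output above was slightly below 2GB):

#include <cstdint>
#include <cstdio>

int main() {
    const std::uint64_t memLimit   = 2ULL * 1024 * 1024 * 1024;  // default pool budget for < 25 Gbps
    const std::uint64_t partSize   = 8ULL * 1024 * 1024;         // default 8MB part size
    const std::uint64_t objectSize = 5ULL * 1024 * 1024 * 1024;  // the 5GB object in question

    const std::uint64_t partsPerObject   = (objectSize + partSize - 1) / partSize; // 640 parts total
    const std::uint64_t maxBufferedParts = memLimit / partSize;                    // at most 256 buffered at once

    std::printf("parts: %llu, max concurrently buffered: %llu\n",
                static_cast<unsigned long long>(partsPerObject),
                static_cast<unsigned long long>(maxBufferedParts));
    return 0;
}

So peak buffer memory is bounded by the pool budget (about 2GB at the default throughput target) rather than by the object size.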