aws / aws-cli

Universal Command Line Interface for Amazon Web Services

[CRT] - Allow setting max bandwidth limit for S3 transfers #8974

Open chipitsine opened 2 weeks ago

chipitsine commented 2 weeks ago

Describe the bug

I'm trying to limit S3 bandwidth for the CRT runtime described in https://aws.amazon.com/blogs/storage/improving-amazon-s3-throughput-for-the-aws-cli-and-boto3-with-the-aws-common-runtime/

I used the following command (from the article):

aws configure set s3.target_bandwidth 50MB/s

Despite that, I see much higher bandwidth (actually limited by the hard disk drive).


aws cli version: aws-cli/2.17.61 Python/3.12.6 Windows/2022Server exec-env/EC2 exe/AMD64

In classic (non-CRT) mode, bandwidth limiting works like a charm.

Regression Issue

Expected Behavior

Bandwidth is limited in CRT mode.

Current Behavior

Bandwidth is not limited, even when configured as described in the article.

Reproduction Steps

Install the AWS CLI, then run:

aws configure set s3.preferred_transfer_client crt
aws configure set s3.target_bandwidth 50MB/s

Copy files to S3 and check the transfer speed.

Possible Solution

No response

Additional Information/Context

No response

CLI version used

aws-cli/2.17.61 Python/3.12.6 Windows/2022Server exec-env/EC2 exe/AMD64

Environment details (OS name and version, etc.)

Windows Server 2022

tim-finnigan commented 2 weeks ago

Thanks for reaching out. I could reproduce the behavior you described. This issue was discussed further internally, and the takeaway was that the target_bandwidth configuration for CRT is a hint about how many resources it should allocate, not an active threshold that the CRT tries to maintain. In practice, it is possible to exceed the target.

The CRT doesn't currently have a way to force bandwidth to stay under a certain limit (unlike the max_bandwidth config for the classic transfer client). The documentation for target_bandwidth also notes that it "controls the target bandwidth that the transfer client will try to reach for S3 uploads and downloads." It is not guaranteed to be precise and cannot force the bandwidth to stay under a max limit.

That said, the CLI team has acknowledged the request to support some kind of bandwidth throttling or an upper bound on concurrency/throughput so that other processes can still run, so I will convert this issue to a feature request for tracking. The current documentation could also be improved for clarity. If you have any other questions or feedback, please let us know. In the meantime, you can use the classic transfer client, or consider other options such as using DataSync and setting a bandwidth limit for an S3 transfer.

chipitsine commented 2 weeks ago

I tried running with the "--debug" option, but I did not find any evidence of it trying to reach the target bandwidth. How can I see whether it is trying or not?

chipitsine commented 2 weeks ago

I tried to find a reference to "target_bandwidth" in https://github.com/awslabs/aws-c-s3

I wanted to have a look at the code.

tim-finnigan commented 2 weeks ago

You can find target_bandwidth in the AWS CLI codebase. When you look at the logs with --debug, the setting should be reflected in a line such as:

s3transfer.crt - DEBUG - Using CRT throughput target in gbps: 0.4194304

For questions regarding the CRT code specifically, feel free to open an issue or Discussion in that repository.
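
As a sanity check on that debug line, the logged figure is consistent with reading 50MB/s as 50 * 1024 * 1024 bytes per second and expressing it in gigabits (10^9 bits) per second. The sketch below reproduces that arithmetic and pulls the value out of a debug line. The helper names are hypothetical and the unit interpretation is an assumption that happens to match the logged number; this is not the AWS CLI's own conversion code.

```python
import re

# Assumption: binary multipliers (1 MB = 1024 * 1024 bytes). This is chosen
# because it reproduces the figure in the debug line, not taken from the CLI.
_UNITS = {"KB/s": 1024, "MB/s": 1024 ** 2, "GB/s": 1024 ** 3}


def rate_to_gbps(rate: str) -> float:
    """Convert a rate such as '50MB/s' to gigabits per second (10**9 bits)."""
    match = re.fullmatch(r"(\d+)(KB/s|MB/s|GB/s)", rate.strip())
    if match is None:
        raise ValueError(f"unrecognised rate: {rate!r}")
    bytes_per_second = int(match.group(1)) * _UNITS[match.group(2)]
    return bytes_per_second * 8 / 1_000_000_000


def logged_target_gbps(debug_line: str):
    """Return the number from a 'Using CRT throughput target in gbps' line,
    or None if the line does not contain one."""
    match = re.search(
        r"Using CRT throughput target in gbps: (\d+(?:\.\d+)?)", debug_line
    )
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    line = "s3transfer.crt - DEBUG - Using CRT throughput target in gbps: 0.4194304"
    print(rate_to_gbps("50MB/s"), logged_target_gbps(line))
```

If the two printed values agree, the configured target_bandwidth was picked up by the CRT client. As noted earlier in the thread, this only confirms the target that was passed along, not that throughput is capped at it.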