aws / aws-sdk-java-v2

The official AWS SDK for Java - Version 2
Apache License 2.0

CachedSupplier should cache exceptions briefly to limit load on downstream systems #5690

Open dbottini opened 3 months ago

dbottini commented 3 months ago

Describe the feature

CachedSupplier is used by various other AWS components to cache the results of expensive AWS operations, such as fetching STS credentials. It should be able to provide some amount of "negative caching" on exceptions, to limit how much load can be pressed against systems like STS. Otherwise, under high load, there will effectively be an unending torrent of requests, as the Lock queues up requests to be attempted immediately after the previous request has failed.

Use Case

In our situation (a large enterprise AWS account), it wraps STS:AssumeRole. On the happy path with 2xx responses it works great, but we've found that it is over-eager to hammer APIs on exceptions. In our case, the error was a 403 because a role was misconfigured. The typical way to use STS:AssumeRole is through the StsAssumeRoleCredentialsProvider, inserted into a given AWS client (in our case Kinesis). When there is no entry, or no non-stale entry, every request gets passed through to the CachedSupplier. The existing lock is most effective when there are a few slow requests and the total time spent waiting on them, including backoffs, is less than five seconds. If the cumulative time per request, times the number of requests, exceeds five seconds, the lock basically becomes a no-op and allows unfettered hammering of the APIs. As the cached supplier's concurrency controls start to fail under high, continuous request rates, common AWS failure strategies like retry+backoff+jitter also start to fail, because each parallel request lacks context about how many other expensive requests are failing elsewhere. We could solve this with our own circuit breakers around any client that incorporates StsAssumeRoleCredentialsProvider, but I believe it would be more effective to stem the requests at a more basic level.

Proposed Solution

I have created a branch with a proposed solution that will briefly cache exceptions when there is no non-stale value: https://github.com/aws/aws-sdk-java-v2/compare/master...dbottini:aws-sdk-java-v2:dbottini/cached-supplier-caches-exceptions
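To illustrate the general idea (this is a minimal sketch, not the code on the branch and not the SDK's CachedSupplier), a wrapper could remember the most recent failure for a short time and rethrow it without calling the expensive operation again. The class name NegativeCachingSupplier, the failureTtl parameter, and the injectable nanoClock are all hypothetical names chosen for this example. The sketch only covers the failure path and assumes successful values are cached elsewhere.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical wrapper, not part of the AWS SDK: remembers the last failure
 * for a short TTL so callers fail fast instead of re-invoking the delegate.
 */
final class NegativeCachingSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final long failureTtlNanos;
    private final LongSupplier nanoClock; // injectable so tests can control time

    private RuntimeException cachedFailure; // null when nothing is cached
    private long failureExpiresAtNanos;

    NegativeCachingSupplier(Supplier<T> delegate, Duration failureTtl, LongSupplier nanoClock) {
        this.delegate = delegate;
        this.failureTtlNanos = failureTtl.toNanos();
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized T get() {
        long now = nanoClock.getAsLong();
        // While the cached failure is still fresh, rethrow it without calling the delegate.
        // Note: the same exception instance (and stack trace) is shared by all callers.
        if (cachedFailure != null && now - failureExpiresAtNanos < 0) {
            throw cachedFailure;
        }
        cachedFailure = null;
        try {
            return delegate.get();
        } catch (RuntimeException e) {
            // Remember the failure briefly so queued callers do not each hit the backend.
            cachedFailure = e;
            failureExpiresAtNanos = nanoClock.getAsLong() + failureTtlNanos;
            throw e;
        }
    }
}
```

With a wrapper like this around the fetch, a misconfigured role would produce roughly one downstream call per failureTtl window rather than one per waiting caller. The right TTL is a judgment call: long enough to shed load, short enough that recovery is noticed quickly.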

Other Information

No response

Acknowledgements

AWS Java SDK version used

2.25.7

JDK version used

Corretto-21.0.3.9.1 (build 21.0.3+9-LTS)

Operating System and version

macOS Sonoma 14.5 (23F79)

bhoradc commented 3 months ago

Hi @dbottini,

Thanks for submitting the feature request. We'll discuss this further with the Java SDK team and get back to you.

Regards, Chaitanya

bhoradc commented 2 months ago

Hi @dbottini,

Based on the discussion with the Java SDK team, it seems that the proposed negative caching feature for the CachedSupplier is considered a cross-cutting concern that should be addressed at a higher level, rather than specific to the Java SDK.

Moving the feature request to a shared repository (aws-sdk) for further discussion across different AWS SDK teams makes sense, as it would allow for a more consistent and coordinated approach to handling this issue across multiple language SDKs.

Thanks again for submitting this feature request.

Regards, Chaitanya

amberkushwaha commented 2 months ago

self announced projects were paste drop or click to add files into the limit load in downstream systems and project for that.

    def get_presigned_url(bucket, object_key)
      url = bucket.object(object_key).presigned_url(:put, expires_in: 300, server_side_encryption: "AES256", tagging: "key1=value1")
      puts "Created presigned URL: #{url}."
      URI(url)
    rescue Aws::Errors::ServiceError => e
      puts "Couldn't create presigned URL for #{bucket.name}:#{object_key}. Here's why: #{e.message}"
    end

Markdown is supported in this file for paste, drop, and click to add files into it. Contributing guidelines are also different in many ways.

Remember the contributions for the file and coding concept as to security guidelines, and the policy should also be changed in many ways.

debora-ito commented 6 days ago

Transferring back to the Java SDK 2.x repo.