Open samdengler opened 1 year ago
Hi @samdengler, thanks for reaching out. I did some experiments with exposing a warmUp (naming subject to change, of course) API on the SDK client that preloads all SDK classes and, for JSON-based services, some Jackson classes. Customers can also provide an optional list of prime functions to invoke in case they want to warm up the connection pool. Would this work for your use case? Feedback is welcome!
The code looks like below:
DynamoDbClient client = DynamoDbClient.create();
WarmUpConfiguration configuration = WarmUpConfiguration.builder()
        .initializeClasses(true)
        .preloadClasses(true)
        .primeFunctions(client::listTables)
        .build();
client.warmUp(configuration);
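For anyone wondering what the class-loading half of such a warm-up could amount to, here is a minimal, stdlib-only sketch. It is an assumption about the mechanism, not the SDK's implementation: ClassPreloader is a hypothetical helper, and mapping the preloadClasses / initializeClasses flags onto the initialize argument of Class.forName is a guess at what the two options might mean.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the AWS SDK) that eagerly loads classes by name. */
final class ClassPreloader {
    private ClassPreloader() {}

    /**
     * Loads each named class. When initialize is true, static initializers run as well;
     * when false, the class is only loaded. Returns the names that could not be loaded.
     */
    static List<String> preload(List<String> classNames, boolean initialize) {
        ClassLoader loader = ClassPreloader.class.getClassLoader();
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                Class.forName(name, initialize, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                // Best effort: a class missing from the classpath should not break startup.
                missing.add(name);
            }
        }
        return missing;
    }
}
```

A caller would pass the fully qualified names it expects to need on the first request and could log whatever comes back as missing. Note that this only covers class loading; warming a connection pool still needs a real call such as the primeFunctions entry above.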
Hi @zoewangg, thanks for the quick reply. What would the most conventional/default experience be? Something like:
DynamoDbClient client = DynamoDbClient.create();
WarmUpConfiguration configuration = WarmUpConfiguration.builder().build();
client.warmUp(configuration);
Also, any thoughts on the approach that X-Ray uses? https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-java-awssdkclients.html
This would be an awesome feature to have!
We need this!
@zoewangg I wonder if there's a different approach where one can set a global SDK config, so that all newly created clients are primed by default?
I do note that you've broken the warm-up into different pieces here, e.g. initializeClasses, etc. A global setting might be more about the initializeClasses and preloadClasses parts than the primeFunctions part, which seems to be client-specific.
I feel like the global option might be beneficial in cases where the user is using a large number of AWS SDK clients.
This should also be expanded to other AWS Services as well.
What we found is that even if the clients are set up as singletons, the first call (without any priming) still takes a long time. So having this auto-priming feature would be really helpful, especially for cold-start scenarios.
Just to add to this as well since my use case is fairly similar, but rather than Lambda we're running on EKS.
In my scenario it's a pretty small Spring Boot app with a low memory footprint (~30-40 MB heap used, 130 MB default heap size), so the Kubernetes pods only request / limit the resources to 0.25 CPU and 256 MB RAM. One thing we've noticed is that several SDK libraries are really slow on the first invocation, but become very fast subsequently.
Some examples of this are:
- AwsCredentialsProvider#resolveCredentials, where the first (and active) provider, WebIdentityTokenFileCredentialsProvider, takes up to 3 seconds to get credentials.
- SqsAsyncClient#sendBatch can take 1-3 seconds synchronously, with an extra 3-6 seconds after that happening asynchronously. The next 1-2 requests take 30-100 ms, while subsequent requests take only about 2-3 ms.
- DynamoDbTable#getItem takes about 300 ms, and the first call to DynamoDbTable#query takes another 300 ms. Subsequent calls typically take 5-20 ms, with outliers going up to 80 ms. I did some investigating and found that just constructing objects like Update or PutItemEnhancedRequest, or running DynamoDbTable#query without actually fetching any results (so no network call), each takes about 100 ms.
I managed to cut these out by essentially "priming" these SDKs: resolving credentials, doing a describe endpoints call against DynamoDB, generating all of my models and converting them into the relevant Put / PutItemEnhancedRequest / SdkIterable<Page<T>> objects, calling SqsAsyncClient#sendMessage against a queue that doesn't exist, etc. This saves about 4-6 seconds on the first request, but it's a bit inconvenient to maintain this logic myself. It would be nice to have this as an optional part of the SDK client, so everything can be eagerly loaded while my container is starting.
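The hand-rolled priming described above can be kept in one place with a small helper. The following is only a sketch under stated assumptions: StartupPrimer is a hypothetical class (not an SDK API), it uses only the JDK, and the actual SDK calls would be supplied by the application as tasks.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of the AWS SDK): runs best-effort warm-up tasks once at startup. */
final class StartupPrimer {
    // Insertion-ordered so tasks run in the order they were registered.
    private final Map<String, Callable<?>> tasks = new LinkedHashMap<>();

    /** Registers a named warm-up task; returns this so calls can be chained. */
    StartupPrimer register(String name, Callable<?> task) {
        tasks.put(name, task);
        return this;
    }

    /** Runs every task, ignoring failures, and returns the elapsed milliseconds per task. */
    Map<String, Long> runAll() {
        Map<String, Long> elapsedMillis = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<?>> entry : tasks.entrySet()) {
            long start = System.nanoTime();
            try {
                entry.getValue().call();
            } catch (Exception e) {
                // Some priming calls are expected to fail (for example, sending to a queue
                // that does not exist); the code path has still been exercised.
            }
            elapsedMillis.put(entry.getKey(), (System.nanoTime() - start) / 1_000_000);
        }
        return elapsedMillis;
    }
}
```

In an application, each task would wrap one of the steps listed above (resolving credentials, building a PutItemEnhancedRequest, and so on), and runAll() would be called once while the container starts. The returned timings make it easy to see which steps are worth keeping.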
This would be amazing!
Describe the feature
At CapitalOne, we've observed notable performance benefits when priming AWS SDK clients in conjunction with Lambda SnapStart snapshotting. However, the practice of artificially constructing AWS SDK clients and invoking classes during the Lambda initialization phase is nonintuitive for engineers to discover and implement. If there were an option to auto-prime AWS SDK clients, we could more easily ensure that engineers follow a best practice by default and get better performance from the Lambda Java runtime when using the AWS SDK for Java 2.0.
Use Case
I'm frustrated when I need to include artificial code, like DynamoDbAsyncClient.describeEndpoints, during Lambda function initialization, just to construct and invoke AWS SDK client classes so that they are included in the Firecracker snapshot for a performance benefit when the Lambda handler is invoked.
Proposed Solution
A JAVA_TOOLS configuration for AWS SDK for Java 2.0 clients would be an easy mechanism for our engineers to manage auto-priming across any compute that uses the SDK; however, our focus is on Lambda SnapStart.
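To illustrate how such a switch might behave, here is a minimal sketch. It assumes "JAVA_TOOLS" refers to the JVM's JAVA_TOOL_OPTIONS environment variable, which can carry -D system properties into the process. The property name example.sdk.autoPrime and the AutoPrimeSwitch class are hypothetical; the SDK defines no such setting.

```java
/** Hypothetical opt-in switch; neither the class nor the property exists in the AWS SDK. */
final class AutoPrimeSwitch {
    // Hypothetical property name, e.g. set via JAVA_TOOL_OPTIONS="-Dexample.sdk.autoPrime=true".
    static final String PROPERTY = "example.sdk.autoPrime";

    private AutoPrimeSwitch() {}

    /** True only when the property is set to "true" (case-insensitive). */
    static boolean enabled() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }

    /** Runs the given priming action if the switch is on; returns whether it ran. */
    static boolean primeIfEnabled(Runnable primer) {
        if (!enabled()) {
            return false;
        }
        primer.run();
        return true;
    }
}
```

With a switch like this, the priming logic stays in the code, and whether it runs is decided per deployment, without a code change.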
Other Information
No response
AWS Java SDK version used
2.19.26
JDK version used
openjdk 11.0.17 2022-10-18 LTS
Operating System and version
Java 11 Lambda runtime