Open ptirador opened 3 years ago
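The AWS Java SDK 2.x includes a pluggable HTTP layer that allows customers to switch to different HTTP implementations. Three HTTP clients are supported out-of-the-box: the Apache HTTP client (apache-client), the Netty HTTP client (netty-nio-client), and the client based on the JDK built-in HttpURLConnection (url-connection-client).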
With the default configuration, Apache HTTP client and Netty HTTP client are used for synchronous clients and asynchronous clients respectively. They are powerful HTTP clients with more features. However, they come at the cost of higher instantiation time.
On the other hand, the JDK built-in HttpURLConnection library is more lightweight and has a faster instantiation time.
Hence, it's recommended to use UrlConnectionHttpClient when configuring the SDK client. Note that it only supports synchronous API calls. If you'd like to see support for asynchronous SDK clients with the JDK 11 built-in HTTP client, please upvote this GitHub issue.
The SDK by default includes the Apache HTTP client and Netty HTTP client dependencies. If startup time is important to your application and you do not need both implementations, it's recommended to exclude the unused SDK HTTP dependencies to minimize the deployment package size. Below is a sample Maven POM file for an application that only uses url-connection-client and excludes netty-nio-client and apache-client.
<dependencies>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>s3</artifactId>
        <exclusions>
            <exclusion>
                <groupId>software.amazon.awssdk</groupId>
                <artifactId>netty-nio-client</artifactId>
            </exclusion>
            <exclusion>
                <groupId>software.amazon.awssdk</groupId>
                <artifactId>apache-client</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>url-connection-client</artifactId>
    </dependency>
</dependencies>
As the JDK built-in HttpURLConnection client is more lightweight, its configuration is simpler. Compared to the Apache HTTP client, for example, it exposes fewer configuration options.
FYI @carlspring @steve-todorov
Hi @ptirador ,
Thanks for your investigation!
What do you mean by "deployment package"?
In my opinion, we need to have support for both synchronous and asynchronous requests. If we need the Apache + Netty dependencies for this, then so be it. There are many other things that you can't do with HttpURLConnection, like setting up connection pools and so on (if I recall correctly).
How much of a difference is there in terms of instantiation time?
And the other question -- are we using async requests for anything right now? What use cases would we have for this?
My only concern is that, at the moment, we claim to support JDK 11 (which is, of course, indeed the case), and whatever we decide must not break our JDK 11 support.
Which one is your advice and personal preference?
Thanks @ptirador for raising this issue and making the initial research!
How did you come to the conclusion that using the built-in HttpURLConnection client is faster?
Did you do a JMH benchmark which backs this statement with data?
Honestly, if I had to pick one of the three options above - I'd go with netty-nio-client and async connections as the default option. In my experience, using netty and a proper async implementation would result in much better throughput and overall performance than using a blocking / sync approach. Also, if you're already using Cassandra or something similar, chances are high that you are already using netty.
If you are up for the task - we can create a JMH benchmark which tests the different implementations so we can make a decision based on the data.
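Until a proper benchmark exists, a crude timing helper can give a first impression of construction cost. The sketch below uses only the JDK; the class name InstantiationTimer and the placeholder factories are illustrative assumptions, and the real HTTP client builders would have to be passed in as the factory to compare them. It measures warm, averaged construction, so it is not a replacement for a dedicated harness, which would also handle warm-up and single-shot (cold) measurements.

```java
import java.util.function.Supplier;

/** Crude wall-clock timing of object construction; a sketch, not a real benchmark. */
final class InstantiationTimer {

    /**
     * Calls the factory `iterations` times and returns the average
     * construction time per call in nanoseconds.
     */
    static <T> double averageNanos(Supplier<T> factory, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long sink = 0; // accumulate something from each result so the calls stay observable
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += System.identityHashCode(factory.get());
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) { // practically never true; only keeps `sink` in use
            System.out.println(sink);
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        // Placeholder factories (assumption): replace with the HTTP client builders under test.
        System.out.printf("StringBuilder: %.0f ns%n",
                averageNanos(() -> new StringBuilder(), 10_000));
        System.out.printf("Object:        %.0f ns%n",
                averageNanos(() -> new Object(), 10_000));
    }
}
```

The numbers from a helper like this are only indicative; they vary with JIT state and class loading, which is exactly what a benchmark harness is meant to control.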
Hi @carlspring @steve-todorov,
The conclusions that I wrote are based on this article, which talks about these instantiation times but without providing any benchmark example. We can create this JMH benchmark to test them.
In my opinion, I would also go with Netty and async connections, especially because of the overall performance boost that it provides. Also, a few months ago we switched the NIO implementation to use AsynchronousFileChannel instead of FileChannel, so I think it could be the best way to go.
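For readers unfamiliar with the API mentioned here, this is roughly what a non-blocking read with the JDK's AsynchronousFileChannel looks like. It is a minimal sketch using only the JDK (Java 11+ for Files.writeString); the class name AsyncFileRead and the method readStart are made up for illustration and are not taken from the project's NIO implementation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

final class AsyncFileRead {

    /** Starts a read of up to `maxBytes` from the beginning of the file and returns immediately. */
    static CompletableFuture<String> readStart(Path path, int maxBytes) throws IOException {
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        CompletableFuture<String> result = new CompletableFuture<>();
        ByteBuffer buffer = ByteBuffer.allocate(maxBytes);

        // The handler runs on a channel-group thread once the read has finished.
        channel.read(buffer, 0L, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                closeQuietly(channel);
                buffer.flip(); // switch the buffer from writing to reading
                result.complete(StandardCharsets.UTF_8.decode(buffer).toString());
            }

            @Override
            public void failed(Throwable error, Void attachment) {
                closeQuietly(channel);
                result.completeExceptionally(error);
            }
        });
        return result;
    }

    private static void closeQuietly(AsynchronousFileChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // nothing useful to do in a sketch
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("async-read", ".txt");
        Files.writeString(tmp, "hello");
        System.out.println(readStart(tmp, 16).get());
        Files.delete(tmp);
    }
}
```

The caller gets a CompletableFuture back right away, which is the same programming model an asynchronous HTTP client would expose, so the two would compose without blocking threads in between.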
Hi @ptirador ,
I believe you and @steve-todorov are right -- we should use Netty, since indeed we did switch to AsynchronousFileChannel, as you've just reminded me.
How much of an effort will this task be?
Task Description
The S3Factory class manages the build of a new Amazon S3 instance, which currently uses the Apache HTTP client. As specified in this Pull Request discussion, this locks customers in to the ApacheHttpClient, which adds a dependency they may not want. We need to provide an option for other SdkHttpClient implementations. The UrlConnectionHttpClient is a fairly popular choice in Java-based Lambda functions as it has a faster startup time, and so less impact on cold starts.
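One possible shape for this change is to let the factory receive its HTTP client instead of constructing one itself. The sketch below shows the idea with plain JDK types only: HttpTransport, StorageClient and PluggableFactory are hypothetical stand-ins (for SdkHttpClient, the S3 client and S3Factory respectively) and do not reflect the actual S3Factory code. In the real SDK, the client builders accept an SdkHttpClient through httpClient(...), which is where the supplied implementation would be handed over.

```java
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical stand-in for an HTTP client abstraction such as SdkHttpClient. */
interface HttpTransport {
    String name();
}

/** Hypothetical stand-in for the client object the factory builds. */
final class StorageClient {
    private final HttpTransport transport;

    StorageClient(HttpTransport transport) {
        this.transport = Objects.requireNonNull(transport, "transport");
    }

    String transportName() {
        return transport.name();
    }
}

/** Factory that is given its HTTP transport instead of hard-coding one. */
final class PluggableFactory {
    // Assumed default: mirrors the current behaviour described above (Apache).
    private static final Supplier<HttpTransport> DEFAULT_TRANSPORT = () -> () -> "apache";

    private final Supplier<HttpTransport> transportSupplier;

    PluggableFactory() {
        this(DEFAULT_TRANSPORT);
    }

    PluggableFactory(Supplier<HttpTransport> transportSupplier) {
        this.transportSupplier = Objects.requireNonNull(transportSupplier, "transportSupplier");
    }

    StorageClient build() {
        // A new transport is requested for each client that is built.
        return new StorageClient(transportSupplier.get());
    }

    public static void main(String[] args) {
        System.out.println(new PluggableFactory().build().transportName());
        System.out.println(
                new PluggableFactory(() -> () -> "url-connection").build().transportName());
    }
}
```

Keeping a default preserves existing behaviour for current users, while callers who want a lighter client, or who exclude the Apache dependency, can supply their own.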
Tasks
The following tasks will need to be carried out:
SdkHttpClient implementations
Task Relationships
This task:
Useful Links
Help