reactiverse / aws-sdk

Using vertx-client for AWS SDK v2
Apache License 2.0

VertxNioAsyncHttpClient vs AwsCrtAsyncHttpClient #64

Closed phillycoder closed 1 year ago

phillycoder commented 1 year ago

I have a verticle querying DynamoDB. I tested it using both AwsCrtAsyncHttpClient and VertxNioAsyncHttpClient.

AwsCrtAsyncHttpClient seems about 60% faster than VertxNioAsyncHttpClient (12.4K vs 7.7K rps).

  1. DynamoDbAsyncClient using AwsCrtAsyncHttpClient (inside my Verticle start method)

    DynamoDbAsyncClient dbAsyncClient =
            DynamoDbAsyncClient.builder()
                    .httpClient(AwsCrtAsyncHttpClient.builder().build())
                    .build();

    Throughput = 12.4K rps

  2. DynamoDbAsyncClient using VertxNioAsyncHttpClient, via VertxSdkClient (inside my Verticle start method)

    DynamoDbAsyncClient dbAsyncClient =
            VertxSdkClient.withVertx(DynamoDbAsyncClient.builder(), context)
                    .build();

    Throughput = 7.7K rps

I'm not sure whether I am doing something wrong. Is this expected? What is the primary advantage of using VertxSdkClient over building the DynamoDbAsyncClient with the AWS SDK directly?

phillycoder commented 1 year ago

I also tested the same app with the default Netty client, using the AWS SDK directly, and got a throughput of around 8.2K rps:

        DynamoDbAsyncClient dbAsyncClient =
                DynamoDbAsyncClient.builder()
                        .build();
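
For reference, the relative differences between the three throughput figures reported in this thread can be worked out with a few lines of arithmetic. This is only a sketch: the class and method names are made up for illustration, and the inputs are the rounded rps values quoted above.

```java
public class ThroughputComparison {

    /** How much faster `candidate` is than `baseline`, in percent. */
    static double percentFaster(double candidate, double baseline) {
        return (candidate / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double crt = 12_400;   // AwsCrtAsyncHttpClient
        double vertx = 7_700;  // VertxSdkClient (VertxNioAsyncHttpClient)
        double netty = 8_200;  // default Netty client

        System.out.printf("CRT vs Vert.x:   %.0f%%%n", percentFaster(crt, vertx));
        System.out.printf("CRT vs Netty:    %.0f%%%n", percentFaster(crt, netty));
        System.out.printf("Netty vs Vert.x: %.0f%%%n", percentFaster(netty, vertx));
    }
}
```

On these figures, the Vert.x and default Netty clients are within a few percent of each other, and the CRT client is the outlier.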

aesteve commented 1 year ago

Hello @phillycoder.

Perf issue

Unfortunately, I won't have much time in the upcoming weeks or months to investigate such a performance issue. This takes quite a lot of time and effort that I won't have at my disposal. If you ever feel like performing this investigation and don't know where to start, I recommend the approach and tools mentioned in this presentation.

I'm not sure which of the improvements mentioned in the AWS blog post makes such a big difference in performance. Maybe connection reuse and the DNS-aware capabilities do. I can't really tell.

A few open questions (again, if you want to investigate this further):

They mention faster startup time and DNS resolution in their article, if I'm not mistaken. How long did your perf test run? Long enough that the startup time doesn't account for much? Does running the test longer change the results?
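
To illustrate why the test duration matters, here is a minimal sketch of how start-up samples can drag down an average. The per-second request counts are hypothetical, not measurements from this issue, and `SteadyStateThroughput` is a made-up helper name.

```java
import java.util.Arrays;

public class SteadyStateThroughput {

    /** Mean requests per second over the samples that follow the warm-up window. */
    static double meanRps(long[] requestsPerSecond, int warmupSeconds) {
        if (warmupSeconds < 0 || warmupSeconds >= requestsPerSecond.length) {
            throw new IllegalArgumentException("warm-up must leave at least one sample");
        }
        return Arrays.stream(requestsPerSecond, warmupSeconds, requestsPerSecond.length)
                .average()
                .getAsDouble();
    }

    public static void main(String[] args) {
        // Hypothetical counts: two slow start-up seconds, then a steady rate.
        long[] samples = {2_000, 5_000, 8_000, 8_000, 8_000, 8_000};

        System.out.println(meanRps(samples, 0)); // average including start-up
        System.out.println(meanRps(samples, 2)); // average over the steady part only
    }
}
```

The shorter the run, the more weight the start-up seconds carry, which is why comparing a short and a long run of the same setup can be informative.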

Vertx AWS SDK benefits

There's a single one really:

CompletableFuture<?> callbacks are executed on the same Vert.x context as the one that made the request

Vert.x has the notion of Context (more info here).

The two main things to take away from the documentation are:

This means (in the case of a standard verticle) that the verticle code will always be executed with the exact same thread, so you don't have to worry about multi-threaded access to the verticle state and you can code your application as single threaded.

This class also allows arbitrary data to be put(java.lang.Object, java.lang.Object) and get(java.lang.Object) on the context so it can be shared easily amongst different handlers of, for example, a verticle instance.

Using this project, you get a context-aware Executor that gets passed as future completion executor to the AWS SDK: https://github.com/reactiverse/aws-sdk/blob/master/src/main/java/io/reactiverse/awssdk/VertxSdkClient.java#L23

Same goes with HTTP request / responses: https://github.com/reactiverse/aws-sdk/blob/master/src/main/java/io/reactiverse/awssdk/VertxNioAsyncHttpClient.java#L60

If you don't care about this context guarantee, then I guess any non-blocking HTTP client would do just fine, no need to use this project.
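
The context guarantee described above can be sketched with plain JDK classes. This is not the project's code: the two single-thread executors are stand-ins (assumptions) for a Vert.x event loop and for the HTTP client's I/O thread, and all names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContextExecutorSketch {

    /** Completes a future on a foreign thread and reports which thread ran the callback. */
    static String callbackThreadName() {
        // Stand-in for the Vert.x event loop that owns the verticle.
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "fake-event-loop"));
        // Stand-in for the thread on which the HTTP response arrives.
        ExecutorService ioThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "fake-io-thread"));
        try {
            // With Vert.x this would be something like:
            // task -> context.runOnContext(v -> task.run())
            Executor contextExecutor = eventLoop::execute;

            CompletableFuture<String> response = new CompletableFuture<>();
            // The callback is pinned to the context executor, whoever completes the future.
            CompletableFuture<String> seenOn = response.thenApplyAsync(
                    body -> Thread.currentThread().getName(), contextExecutor);

            // The response is completed from the I/O thread, not from the event loop.
            ioThread.execute(() -> response.complete("item"));
            return seenOn.join();
        } finally {
            eventLoop.shutdown();
            ioThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callbackThreadName()); // prints "fake-event-loop"
    }
}
```

Without the executor argument, the callback could run on whichever thread completes the future, which is exactly the situation the context guarantee is meant to avoid.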

An interesting experiment could be the following:

var builder = DynamoDbAsyncClient.builder();
var client = VertxSdkClient.withVertx(builder, context)
    .httpClient(AwsCrtAsyncHttpClient.builder().build())
    .build();

This way you would get the AwsCrtAsyncHttpClient (no context guarantee for the HTTP layer), but would keep the context guarantee of the future completion Executor.

I'm honestly not sure what kind of bugs this could introduce, but it might be worth trying.


Again, sorry I don't have time to investigate more, but I hope this helps.

phillycoder commented 1 year ago

@aesteve no worries regarding the lack of time, and thanks a lot for taking the time to reply.

To be honest, I am fairly new to Vert.x (and to reactive, non-blocking programming). Currently I am looking into rewriting an existing REST API (using Tomcat/Jersey/DynamoDB) with Vert.x/DynamoDB to see if the non-blocking approach gives any performance boost.

I only ran the performance test for 20s (after initially warming up the API for 20s). I deployed the API as a Docker container on an EC2 instance and used the k6 load-testing tool with 40 concurrent users.

Thanks for sharing the benchmark YouTube video; I will learn about the importance of sharing the context. For now I can close this issue, and if I find anything more regarding this I will come back to this thread.