patpatpat123 opened 1 year ago
I believe the underlying limitation comes from jetty http client.
I don't think so.
Jetty's HttpClient can easily do 10-100 thousand requests/s.
I would start by using it directly rather than using Flux.
Also, explicitly configure the thread pool and the number of selectors on ClientConnector, depending on the hardware spec your client is running on.
Once you achieve the desired number of requests/s you can reintroduce reactive and/or Flux and see how it goes.
I doubt it has anything to do with the Flux used to fire the requests, but you can also try replacing it with an Executor to keep it simple and transparent.
More suggestions: try with other ClientHttpConnector implementations, as well as directly with Jetty's HttpClient.
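For reference, a minimal sketch of what that configuration can look like (assuming the Jetty 10 ClientConnector API; the pool size and selector count below are illustrative placeholders to be sized for your hardware, not recommendations):
// Imports: org.eclipse.jetty.client.HttpClient,
// org.eclipse.jetty.client.dynamic.HttpClientTransportDynamic,
// org.eclipse.jetty.io.ClientConnector, org.eclipse.jetty.util.thread.QueuedThreadPool
public HttpClient configuredHttpClient() throws Exception {
    // Explicit thread pool: Jetty creates threads on demand, up to this max.
    QueuedThreadPool threadPool = new QueuedThreadPool(200);
    threadPool.setName("client");
    ClientConnector connector = new ClientConnector();
    connector.setExecutor(threadPool);
    connector.setSelectors(4); // explicit selector count, e.g. roughly one per 1-2 CPU cores
    HttpClient httpClient = new HttpClient(new HttpClientTransportDynamic(connector));
    httpClient.start();
    return httpClient;
}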
Thanks for your comments.
Just adding some more facts here:
I tried this (note the concurrency number):
// some very fast flux, this line is just an example
Flux<String> inputFlux = Flux.interval(Duration.ofMillis(1)).map(i -> "transform to request payload number " + i);
//send as many requests as fast as possible, note the small number 4
Flux<String> resultFlux = inputFlux.flatMap(oneInput -> webClient.post().bodyValue(oneInput).retrieve().bodyToMono(String.class), 4);
//doing something with the result
return resultFlux.map(oneResult -> doSomething(oneResult));
And could see in the logs threads like :
[HttpClient@336f49a1-48]
[HttpClient@336f49a1-50]
[HttpClient@336f49a1-47]
[HttpClient@336f49a1-49]
i.e., with the flatMap concurrency set to 4, it seems there are 4 threads from Jetty's HttpClient (please correct me if I am wrong).
Now, if I increase the concurrency to 8:
Flux<String> resultFlux = inputFlux.flatMap(oneInput -> webClient.post().bodyValue(oneInput).retrieve().bodyToMono(String.class), 8);
I do see 8 different [HttpClient@abc-N] threads.
Therefore, I think there is some kind of correlation between the flatMap concurrency number and the number of these [HttpClient@abc-N] threads.
However, as I scale up (16, 32, 64, 128 [...]), at some point I am no longer able to see a matching number of [HttpClient@abc-N] threads. As a concrete example, with flatMap set to 4096 I would have expected 4096-ish different threads, [HttpClient@abc-1] through [HttpClient@abc-4096].
However, not even a hundred could be observed.
May I ask if this is expected?
I'm not sure what that parameter does exactly, so I cannot comment.
If you're trying to perform some kind of load testing, I feel you are going about it the wrong way.
Have you tried HttpClient alone, to keep things simpler?
No problem at all, and again, thank you for all the responses.
I am not doing load testing; this is a real production-level business use case: "hit the third-party server as hard as possible".
I believe I have found further clues.
I am now proceeding by trial and error (mostly errors; the parameters are a bit overwhelming).
I added the following (note the custom QueuedThreadPool):
@Bean
public HttpClient getHttpClient(final MeterRegistry registry) {
    // Custom thread pool: at most 1000 threads, named "client-thread-N"
    final var threadPool = new QueuedThreadPool(1000);
    threadPool.setName("client-thread");
    final var clientConnector = new ClientConnector();
    clientConnector.setExecutor(threadPool);
    clientConnector.setReuseAddress(true);
    // Micrometer metrics for the client connections
    clientConnector.addEventListener(new JettyConnectionMetrics(registry));
    return new HttpClient(new HttpClientTransportDynamic(clientConnector));
}
Issue 1: with this QueuedThreadPool set to 1000, I would have expected to see [client-thread-1] to [client-thread-1000] doing the work of sending requests.
However, I am only seeing:
client-thread-106
client-thread-47
client-thread-68
client-thread-76
client-thread-77
client-thread-85
client-thread-92
client-thread-93
client-thread-96
client-thread-97
Is the thread pool's max-threads property not being picked up?
Also, may I ask which thread pool (QueuedThreadPool, ExecutorThreadPool, or another one) would be the best fit for this production use case (not load testing), which is to spawn as many threads as possible in order to send as many requests as possible, please?
I think you have wrong expectations.
this production use case (not load testing) which is to spawn as many threads as possible in order to send as many requests as possible
You are basically trying to max out 2 systems, so it is load testing. Using as many threads as possible is rarely the best solution.
I think HttpClient is using only the 10 threads you are seeing because it is able to cope with the load with just those threads.
You have likely filled up all your connections, and adding more threads won't help.
You need to carefully monitor CPU, network, JVM and application to understand what's going on.
Please read: https://github.com/jetty-project/jetty-load-generator/blob/2.1.x/README.md
Understood @sbordet
This is very unfortunate: using this reactive paradigm, we use very little resources.
kubectl -n=production top pod application-65847cb578-dqnb9
NAME CPU(cores) MEMORY(bytes)
application-65847cb578-dqnb9 39m 271Mi
We are using only a very small amount of CPU and memory, yet with Jetty we are sending some 16 requests per second, with the server confirming it receives 16-ish requests per second (while it should be able to handle 8000/s), and the incoming flux of data keeps piling up because Jetty is not able to send it fast enough.
May I ask if there is some documentation on the executor, the thread pool, and the selectors, please?
Look, it's not Jetty. There is something else wrong.
The documentation is here: https://www.eclipse.org/jetty/documentation/jetty-10/programming-guide/index.html
Look at the Jetty architecture section for details on threads and selectors: https://www.eclipse.org/jetty/documentation/jetty-10/programming-guide/index.html#pg-arch
Make sure you have a large maxConnectionsPerDestination and maxRequestsQueuedPerDestination.
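For example (illustrative values, assuming the Jetty 10 HttpClient API; with the third party taking about one second per response, sustaining ~4096 requests/s needs ~4096 concurrent connections):
// Jetty 10 defaults are 64 connections and 1024 queued requests per destination;
// 64 connections at ~1 s per response would cap throughput at ~64 requests/s.
httpClient.setMaxConnectionsPerDestination(4096);
httpClient.setMaxRequestsQueuedPerDestination(16384);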
Then make sure you actually send the requests without waiting for the responses.
As I said multiple times now, start with Jetty HttpClient alone, no reactive.
Make sure you can hit the desired numbers with it.
Then introduce reactive.
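A minimal non-reactive sketch along those lines (the endpoint URL and payloads are hypothetical; the Jetty 10 API is assumed): it fires all requests asynchronously without waiting for individual responses, then measures the achieved rate.
// Imports: org.eclipse.jetty.client.HttpClient,
// org.eclipse.jetty.client.util.StringRequestContent,
// java.util.concurrent.CountDownLatch, java.util.concurrent.atomic.AtomicLong
public void probeRawThroughput() throws Exception {
    HttpClient httpClient = new HttpClient();
    httpClient.setMaxConnectionsPerDestination(4096);      // see tuning note above
    httpClient.setMaxRequestsQueuedPerDestination(16384);  // avoid queue-full rejections
    httpClient.start();

    int total = 10_000;
    CountDownLatch done = new CountDownLatch(total);
    AtomicLong failures = new AtomicLong();
    long start = System.nanoTime();

    for (int i = 0; i < total; i++) {
        httpClient.POST("http://third-party.example/api")   // hypothetical endpoint
            .body(new StringRequestContent("payload " + i))
            .send(result -> {                               // completes asynchronously; the loop never blocks
                if (result.isFailed())
                    failures.incrementAndGet();
                done.countDown();
            });
    }

    done.await();
    long seconds = Math.max(1, (System.nanoTime() - start) / 1_000_000_000L);
    System.out.println(total + " requests, " + failures.get() + " failed, ~" + (total / seconds) + " req/s");
    httpClient.stop();
}
If this alone reaches the desired rate, the bottleneck is not Jetty; if it does not, tune connections and selectors before reintroducing WebClient and Flux.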
What I am trying to achieve:
Send as many HTTP requests as possible, in parallel, to a very reliable third-party service from an aggressive Flux.
Background:
The third party service is very reliable and can sustain a very high number of requests. So far, I am the only client of this third party service. I am invited to hit the third party server as hard as possible. The third party service, which I have no control over, does not offer any bulk/list API. I can only send requests one by one. On their side, each request takes a constant one second to process.
This third-party API has keep-alive enabled, but does not support gRPC, HTTP/2, or sockets. There is no rate limit at any point of the end-to-end flow.
What did I try:
Here is the configuration of my HTTP client, as well as the logic to send the HTTP requests (hopefully as many requests, as fast as possible):
the client:
the flux:
Using this, I asked the third-party service and they gave me a number N: my achieved requests per second.
First observation: the flatMap concurrency here is 4096. Since the third party takes one second to process each request, I would have expected a rate N of roughly 4096 requests per second (throughput ≈ concurrency / latency).
However, I am nowhere close. The third party service told me I am at 16ish requests per second.
Issue:
I believe the underlying limitation comes from the Jetty HTTP client. I swapped the WebClient (which uses Jetty) for a dummy operation, and could see much higher throughput.
I believe the issue here is that the scheduling policy of the jetty-reactive-httpclient library is limiting the throughput. Which parameters (number of concurrent connections, possibly I/O threads, keep-alive) should I use in order to "unleash" the Jetty reactive HTTP client?