Closed martinvisser closed 2 days ago
we would like to use virtual threads for the interaction with the hubs.
Why, just out of curiosity?
Please read the documentation about the virtual thread support: https://jetty.org/docs/jetty/12/programming-guide/arch/threads.html#thread-pool-virtual-threads and also: https://jetty.org/docs/jetty/12/operations-guide/modules/standard.html#threadpool-all-virtual
Long story short, the call to select() (or, in your case above, KQueue.poll()) pins the carrier, so you probably run out of carriers.
Setting -Djdk.virtualThreadScheduler.parallelism=2 probably exacerbates the problem (I'm not sure I understand the difference between "Jetty" and "app" in your report).
I suggest you don't use virtual threads (there should be no need, as the Jetty HttpClient is fully asynchronous).
If you really want to use virtual threads, please detail exactly where -Djdk.virtualThreadScheduler.parallelism=2 is set and what is affected by it, since you have a proxy that has both a client and a server.
Also, I recommend not setting the client executor to newVirtualThreadPerTaskExecutor(), but looking into Jetty thread pools and their support for virtual threads.
Why, just out of curiosity?
Our app has quite a high throughput and a lot of I/O with the hubs, so we do not want one request to block another. We came from a fully reactive app, but recent performance degradation made us decide to go for Java 21 with virtual threads, and no more Monos and Fluxes.
I'm not sure I understand the difference between "Jetty" and "app" in your report
"Jetty" is the http client, the one that created the ForkJoinPool workers that get pinned. "app" is the Spring Boot app, which only created two additional ForkJoinPool worker threads, both shut down after idling.
I'll definitely have a look at those docs, but do they apply to the (standalone) http client as well? We are running the default server in Spring Boot, so that would be Tomcat. Just tried with Jetty as server, where this same issue occurs when setting parallelism. I don't see this freezing when using default parallelism with either server, so perhaps my way of reproducing the issue we faced is not the correct way. Our original issue was that the app froze after about 60s when idle. Our app was deployed on Cloud Foundry where we set no parallelism, and has 8 cores (not sure if that's relevant or not, but I created the demo app on a 16 core machine). I thought I found a neat way of reproducing it, but maybe this actually causes the freeze in the demo app.
I suggest you don't use virtual threads (there should be no need, as the Jetty HttpClient is fully asynchronous).
But asynchronous is not necessarily non-blocking. Of course, a virtual thread could still be "blocked", but they don't require new platform threads if so, which does happen under high load.
Btw, the only reason I added the parallelism is to reduce the waiting time to show the issue, we do not do this in production. However, without setting this I do not see this freeze. I thought I found a neat way of reproducing it, but maybe this actually causes it in the demo app.
So maybe my way of reproducing is wrong, but I'm surprised that configuring the executor on the HttpClient can freeze the Spring Boot application from handling any requests. Perhaps an interesting side note: this freeze also occurred when I configured Netty as the http client and explicitly configured it to run on the virtual thread executor. I'll see if I can create more heap and/or thread dumps to get more information.
Our app has quite a high throughput and a lot of I/O with the hubs, so we do not want one request to block another. We came from a fully reactive app, but recent performance degradation made us decide to go for Java 21 with virtual threads, and no more Monos and Fluxes.
Interesting, thanks for sharing.
I'll try the demo repo tomorrow.
If you can capture the JVM thread dump, including the virtual threads, that would be great.
I changed the executor to use the QueuedThreadPool, but with a virtual thread factory. This seems to work well:
I tried your repo, but it seems to only have half of the application (the proxy).
Of course the virtual thread factory works, it pools the virtual threads it creates 😄
I believe your problem is the system property limiting the carriers, combined with the select() operation pinning the carriers.
If you configure Jetty's HttpClient with only 1 selector, you should be good.
If you can make your repository a self-contained reproducer that we can try (with instructions), I'll gladly try to reproduce, but so far it seems like just a misconfiguration.
Why do you want to restrict the carriers to only 2?
I don't want to restrict it, it made me reproduce the issue we noticed when we deployed our app to Cloud Foundry. But as said, maybe this actually creates an issue in my demo.
Ok, so the basic knowledge here is to know that select() pins carriers.
By default, new HttpClient() or new HttpClient(new HttpClientTransportOverHTTP()) allocates cores/2 selectors.
However, new HttpClient(new HttpClientTransportOverHTTP(new ClientConnector())) allocates only 1 selector.
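For example, forcing a single selector can be done explicitly via the transport; a minimal sketch (import paths assume Jetty 12):

```java
import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.transport.HttpClientTransportOverHTTP;

public class OneSelectorClient
{
    public static void main(String[] args) throws Exception
    {
        // Default: new HttpClient() allocates cores/2 selectors.
        // Passing an explicit selector count forces just one, so at
        // most one carrier can be pinned by this client's select().
        HttpClient httpClient = new HttpClient(new HttpClientTransportOverHTTP(1));
        httpClient.start();
    }
}
```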
We definitely need to align the number of selectors.
Having said that, the number of selectors and the number of carriers must be carefully chosen.
I'm not sure why you have a problem deploying to Cloud Foundry, unless you explicitly set the number of selectors, and they are more than the number of cores, or you explicitly set the number of carriers, and they are less than the selectors.
I would recommend using just 1 selector, unless you have a large number of concurrent connections (say, larger than 5k), and that should work as long as you have more than 1 core.
If you have only 1 core, then you must configure jdk.virtualThreadScheduler.parallelism=p with p > 1.
On Cloud Foundry we have 8 cores, and have not configured anything for selectors. Per client we will potentially have hundreds of concurrent connections, but certainly not thousands of concurrent connections.
@martinvisser do you have more than 1 HttpClient in the same JVM?
I ask because having two of them would result in 1 having 4 selectors by default, and the other having another 4, and this would pin all the carriers.
This could explain why you have issues deploying on Cloud Foundry: maybe you deploy 2 different applications, but both use Jetty's HttpClient.
If you have 100s of HTTP/1.1 connections, I suggest you explicitly configure the number of selectors to 1 in each HttpClient instance.
We have 9 different RestClients, each with its own HttpClient. If each of those pins even 1 thread, we would soon run out of carrier threads. There's no way around the pinning part?
There's no way around the pinning part?
No, unfortunately it is how virtual threads work.
Can you reuse the same HttpClient instance across the RestClients?
Theoretically yes, but each has its own max pool size, timeouts, etc. Some have ssl, some don't.
I am afraid that if you need different configurations for each HttpClient, you have to manually set jdk.virtualThreadScheduler.parallelism=N, with N carefully chosen.
Note that the "max pool size" should be irrelevant in your examples (you chose to be unlimited by using newVirtualThreadPerTaskExecutor() directly) -- unless you refer to some other pool.
The TLS configuration needs to be separated only if you use client certificates, and they must be isolated from each other.
If you don't use client certificates, the TLS configuration can be the same, and if you perform clear-text http requests (not https), it won't be used.
Perhaps you can consolidate the timeouts into a single value?
If all of that is possible, then perhaps you can use a single HttpClient with 1 selector and all will be good.
The backends (hubs as I called them) are so different that consolidating them all into one HttpClient is not suitable for our needs. Anyway, thanks for the insights. I think it does explain the issue we faced. Maybe Jetty Http client (with virtual threads that is) will be an option for us in the future, but for now it's unfortunately not.
Maybe Jetty Http client (with virtual threads that is) will be an option for us in the future, but for now it's unfortunately not.
Wait, this is true only for the "all virtual threads" solution you are trying.
If you use a Jetty QueuedThreadPool with a virtual thread Executor you won't have any of these problems.
The selectors will run in a platform thread, and your client application will be invoked in a virtual thread.
Best of both worlds.
You should totally be able to use Jetty's HttpClient in your scenario, with the right configuration.
var qtp = new QueuedThreadPool();
// Configure qtp.
var vtp = new VirtualThreadPool();
// Configure vtp.
qtp.setVirtualThreadExecutor(vtp);
// One selector only.
var httpClient = new HttpClient(new HttpClientTransportOverHTTP(1));
httpClient.setExecutor(qtp);
httpClient.start();
Okay, I'll give that an attempt (tomorrow)
Out of interest, have you tried just blocking code (no monos and fluxes) with native threads?
How parallel does your app need to be? You can often have many 10s of thousands of native threads without a problem.
@gregw Our app (GraphQL) consumes information from multiple hubs in parallel. This information is combined to build the response. Due to the I/O we have a lot of waiting, and having platform threads doing that is pretty wasteful.
@martinvisser Jetty's HttpClient would wait for I/O threadlessly, so it would be totally equivalent to a virtual thread in that sense.
Not sure whether wrappers of Jetty's HttpClient would block, but Jetty's HttpClient does not.
Assuming the wrappers provide a similar asynchronous non-blocking API, there should be no problems using platform threads.
We have multiple customers that deployed a proxy-like solution in this way, with great scalability, exactly because Jetty's HttpClient waits for I/O threadlessly.
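For reference, the asynchronous usage looks roughly like this (a sketch with a hypothetical URL; Jetty invokes the completion listener when the exchange finishes, and no thread blocks waiting for I/O):

```java
import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.Result;

public class AsyncGet
{
    public static void main(String[] args) throws Exception
    {
        HttpClient httpClient = new HttpClient();
        httpClient.start();

        // The send(listener) variant returns immediately; the listener
        // is called back on completion, success or failure.
        httpClient.newRequest("http://localhost:8080/hub")
            .send((Result result) ->
            {
                if (result.isSucceeded())
                    System.out.println("status: " + result.getResponse().getStatus());
                else
                    result.getFailure().printStackTrace();
            });
    }
}
```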
Did you manage to give QueuedThreadPool + VirtualThreadPool (as shown above) a go?
Does it work for you?
@sbordet I tried this and yes, it works, but not really as I expected. I expected to see, and indeed saw, one scheduler/selector per http client, but what I didn't expect to see was the QueuedThreadPool platform threads growing with the load. What I expected was that there would be no need for those to grow, due to the virtualTaskExecutor handling all threads.
But perhaps my assumption is wrong.
@martinvisser we need details regarding your last comment.
How much did they grow?
The scheduler and the selectors are two different components that are not related to each other.
Can you take an HttpClient.dump() and report it here?
Enabling JMX would be the simpler way to take an HttpClient.dump(); see https://jetty.org/docs/jetty/12/programming-guide/arch/jmx.html
Alternatively, you can use a special request path (temporarily) that produces the dump() string as the response.
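Such a temporary dump endpoint could look like this (a hypothetical Spring MVC sketch; the path is arbitrary, and httpClient is assumed to be the application's Jetty HttpClient bean):

```java
import org.eclipse.jetty.client.HttpClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
class ClientDumpController
{
    private final HttpClient httpClient;

    ClientDumpController(HttpClient httpClient)
    {
        this.httpClient = httpClient;
    }

    // Temporary diagnostic endpoint returning the Jetty client dump.
    @GetMapping(value = "/debug/client-dump", produces = "text/plain")
    String dump() throws Exception
    {
        return httpClient.dump();
    }
}
```

Remember to remove (or protect) such an endpoint before deploying to production.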
We are trying to be as friendly as possible towards virtual threads, so we're keen to collaborate with different use cases, because it seems that virtual threads bring to the table one unexpected surprise after another.
Per backend it starts out with 8 or so (not many, at least) threads. At some point in our test we reach a breaking point, which is kind of the goal of the test, but when that happens there is a steep increase in QueuedThreadPool threads.
This picture includes the increase for all 9 http clients, so the steepness is a bit unfair: it is multiplied by a factor of 9.
The problem with this is that the application basically goes out of memory and Cloud Foundry kills the app. This means the container is gone and I can no longer generate a dump. I also have no option to increase memory due to constraints.
@martinvisser unfortunately that graph is not enough, as we don't know what is actually creating those threads -- could be some other component, as I understand this is the thread report for the whole JVM, not specifically for Jetty components.
Please set QueuedThreadPool.setDetailedDump(true), and take both a Server.dump() and an HttpClient.dump(); then we will know what the threads are doing and likely why they are increasing.
Getting a dump during that cascade failure would be good, just from the point of view that we like to fail gracefully; if that is Jetty's fault, then we'd like to see what can be done to stop it. Can you put a maxThreads limit on the QTP that is below the out-of-memory limit, so that you hit a plateau (probably non-functional) long enough to get a dump? Alternatively, how is Cloud Foundry killing the app? If it is done cleanly, you can set a dump on stop.
Jetty version(s)
12.0.12

Jetty Environment
n/a

Java version/vendor (use: java -version)

OS type/version
macOS 14.6.1 & Ubuntu 22.04
Description
We have a Spring Boot 3.3 application that uses GraphQL. That service is backed by several "hubs", each configured via a RestClient. RestClient is just an interface and can be used with several types of http client. The (near) default is Apache Http Client, but we noticed virtual thread pinning, so we are looking for alternatives. Spring has support, in order (see org.springframework.boot.web.client.ClientHttpRequestFactories), for Apache, Jetty, OkHttp 3 (deprecated), and Simple, so it made sense to look into Jetty. By using the defaults (only using JettyClientHttpRequestFactory()) the application works fine and keeps working fine, but the Jetty client uses a thread pool (using platform threads?), even though we would like to use virtual threads for the interaction with the hubs. By working fine I mean: the app returns a 200 OK on calling http://localhost:8080/actuator/health.

To use virtual threads, we configure the executor on the HttpClient. When starting the application after this, nothing special happens, but when making one call to the RestClient a bunch of ForkJoinPool worker threads (no. 1-6 in the screenshot) are created. After keeping the application idle, the workers (no. 7 and 8 in the screenshot) created by the application close, but the ones created by Jetty stay, and the app gets "stuck". By stuck I mean: no response is returned on calling the actuator. The different ForkJoinPools show different stack traces:
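Judging from the later discussion of newVirtualThreadPerTaskExecutor(), the executor configuration described above was presumably along these lines (a sketch):

```java
import java.util.concurrent.Executors;

import org.eclipse.jetty.client.HttpClient;

public class VirtualExecutorClient
{
    public static void main(String[] args) throws Exception
    {
        HttpClient httpClient = new HttpClient();
        // One unpooled virtual thread per task: the configuration that,
        // combined with selectors pinning carriers and a low
        // jdk.virtualThreadScheduler.parallelism, freezes the app.
        httpClient.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
        httpClient.start();
    }
}
```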
How to reproduce? I created a demo repo to show this behavior. The readme contains steps on how to run the app.
In short: after the app is started, the RestClient bean is loaded and the ForkJoinPool worker threads are created. Depending on the number of cores this starts a bunch for Jetty and 2 for the app (due to -Djdk.virtualThreadScheduler.parallelism=2). After keeping the app idle for about 60s (2 * the idle timeout of 30s) those "app" worker threads close down; the others stay active. After those 60s the app is completely unresponsive.

This breaks:
This works, but doesn't use virtual threads for the http client:
This is of course when spring.threads.virtual.enabled is set to true to make the whole application run on virtual threads.