Open jrudolph opened 2 years ago
I'm not completely sure that something should be done here.
It seems consistent to avoid running user code inside of the connection stream and steer away from using `subFusingMaterializer` in these cases, in the same way as we also run the handler passed to `bind` in a `Future` automatically to avoid stream contention. The extra cost compared to the overall cost of stream materialization in these cases might just be ok.
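To illustrate the pattern referenced above (running the user-supplied handler in a `Future` so user code executes on a dispatcher instead of synchronously inside the connection stream), here is a minimal plain-Scala sketch; `handleOffStream` is a hypothetical helper name, not akka-http API:

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical helper: wrapping the user handler in a Future means user code
// runs on the given execution context (dispatcher), not inline in the stream
// that invoked it, so a slow handler cannot stall connection processing.
def handleOffStream[A, B](handler: A => B)(request: A)(implicit ec: ExecutionContext): Future[B] =
  Future(handler(request))
```

The trade-off is exactly the one discussed here: the `Future` dispatch costs something per invocation, but keeps user code off the stream's thread.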
On the other hand, changing this behavior will add extra work that cannot be avoided by the user any more (unless we introduce another flag).
If we leave it as is, I think it would be good to at least mention this in the docs, since from the user API it is not obvious that what look like separate invocations are not actually parallel.
`H2ServerProcessingBenchmark` seems to show a big difference (though the magnitude seems weird to me...):

```
jmh:run -wi 5 -w 3 -i 4 -r 5 -f 1 -p requestbody=empty -p responsetype=closedelimited H2Server
```

With `subFusingMaterializer`:

```
[info] H2ServerProcessingBenchmark.benchRequestProcessing  1  empty  closedelimited  thrpt  4  68145.489 ± 954.890  ops/s
```

With `materializer`:

```
[info] Benchmark  (minStrictEntitySize)  (requestbody)  (responsetype)  Mode  Cnt  Score  Error  Units
[info] H2ServerProcessingBenchmark.benchRequestProcessing  1  empty  closedelimited  thrpt  4  50164.835 ± 6151.670  ops/s
```
Looking at the flamegraphs it looks legit. Materialization is expensive but for small stream graphs, creating and tearing down the actors is the most expensive part of materialization (we knew this before).
The benchmark does not include the network stack, so, of course, the impact is scaled a lot when taking that into account as well.
In summary, not super nice to change it without a way to configure it, since we spent so much time optimizing many code paths...
In Helidon Níma, every substream is run on a dedicated virtual thread.
As observed in https://github.com/lightbend/kalix-jvm-sdk/issues/1078, HTTP/2 substream Sinks materialized by the HTTP/2 infrastructure are run with the `subFusingMaterializer`, so that they run within the same stream infrastructure as the main HTTP/2 connection: https://github.com/akka/akka-http/blob/2d1b8727d74d2332de294da3d0cfeba40e12bdcb/akka-http-core/src/main/scala/akka/http/impl/engine/http2/Http2StreamHandling.scala#L644
This can 1) lead to starvation if one of the substreams does CPU-intensive work (or even sleeps) inside the stream, and 2) limit parallelization between concurrent substreams.
A well-behaved streaming application does not run CPU-intensive (or blocking) payloads directly on the stream, so in many cases this will not become a problem.
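To make the "well-behaved" pattern concrete: blocking or CPU-heavy payloads should be pushed onto a dedicated pool (in Akka Streams terms, typically via `mapAsync` with a `Future` running on a blocking dispatcher) rather than executed inline in a stage. A minimal plain-Scala sketch of the offloading idea; the pool and its size are illustrative assumptions, not akka-http defaults:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

// Illustrative dedicated pool for blocking/CPU-heavy work; in an Akka
// application this role is usually played by a configured blocking dispatcher.
val blockingEc: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

// Run the expensive payload off the caller's (i.e. the stream's) execution
// context, so the fused connection stream and sibling substreams keep running.
def offload[T](work: => T): Future[T] = Future(work)(blockingEc)
```

With this shape, a stream stage only completes a `Future` instead of burning CPU on the stream's own thread, which is what avoids the starvation described above.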
There might be scenarios, however, where most traffic to a server might arrive on a single connection (e.g. behind a load balancer) with many expected concurrent streams. In that case, running all the substreams together in the same stream together with the connection infrastructure might be too much.
Note that this applies mostly to streaming requests/responses. Requests that can be handed out with Strict entities (i.e. when collecting the full entity data, as enabled by `min-collect-strict-entity-size`, was successful) and strict responses are not affected by this issue.
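For reference, the setting mentioned above is driven by configuration. A hedged sketch of what enabling it might look like; the exact config path and default value should be checked against the akka-http reference docs, and the threshold below is purely illustrative:

```
# Assumed config path; collect up to this many bytes of entity data so small
# entities arrive as Strict and skip per-substream materialization.
akka.http.server.http2 {
  min-collect-strict-entity-size = 4096
}
```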