quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.87k stars 2.71k forks source link

Weird results for virtual threads load test experiment + netty-loom-adaptor configuration #30598

Closed ozeil closed 10 months ago

ozeil commented 1 year ago

Hello everyone,

I've been experimenting with Quarkus' virtual thread integration and ran some benchmarks. I got some results which I can't make sense of which is why I'm asking for advice here.

I used wrk2 to fire requests to three endpoints running respectively on the event loop, a worker thread and a virtual thread with @RunOnVirtual Thread. The endpoints send a query to a Postgres DB and returns the result as JSON with Jackson similar to what @anavarr did here and here. The reactive event loop version uses the reactive Postgres client and so does the virtual thread version to avoid pinning carrier threads. The blocking version uses direct JDBC access through an javax.sql.DataSource injected like done here.

Implementation of the repository used to access the DB:

public class PetRepository {

    @Inject
    PgPool reactiveClient;

    @Inject
    DataSource jdbcDataSource;

    private static final String SELECT_ALL_PETS_QUERY = "SELECT * FROM pets";

    public List<Pet> findAllAwait() {
        List<Pet> pets = new ArrayList<>();
        RowSet<Row> rowSet = reactiveClient.preparedQuery(SELECT_ALL_PETS_QUERY).executeAndAwait();
        for (Row row : rowSet) {
            pets.add(from(row));
        }
        return pets;
    }

    public List<Pet> findAllBlocking() {
        List<Pet> pets = new ArrayList<>();
        try (Connection connection = jdbcDataSource.getConnection()) {
            try (PreparedStatement statement = connection.prepareStatement(SELECT_ALL_PETS_QUERY)) {
                var result = statement.executeQuery();
                while (result.next()) {
                    pets.add(new Pet(result.getLong("id"), result.getString("name")));
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return pets;
    }

    public Uni<List<Pet>> findAllReactive() {
        return reactiveClient.preparedQuery(SELECT_ALL_PETS_QUERY).execute().map(rows -> {
            List<Pet> ret = new ArrayList<>(rows.size());
            for (Row row : rows) {
                ret.add(Pet.from(row));
            }
            return ret;
        });
    }
}

The database and the Quarkus application both run in Docker containers.

Now after running a few different loads with wrk2 I'm observing some weird results even when repeating the same command multiple times to warm up. Example wrk2 command: wrk -t 10 -c 1000 -d30s -R 5000 --latency http://localhost:8080/pets/blocking

For some reason the blocking version achieves the best performance, which I don't have an explanation for. At first, I thought this was due to the reactive connection pool size which defaults to fewer connections than the JDBC pool (20). However, after setting the reactive pool size also to 20 I'm still getting the same results. Everything runs on the same machine for now though so my results are most likely skewed. I constrained the container CPU and memory usage to try to make the application the bottleneck and not the database or the loader. Still, does someone see anything wrong with my code or the configuration which could explain this? The full project is located here: https://github.com/ozeil/quarkus-loom

Another issue I ran into is when I wanted to use quarkus-netty-loom-adaptor to increase the virtual thread performance. I'm not sure if I followed the instructions from @anavarr corectly. Have a look at my pom:

<?xml version="1.0"?>
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.oz.loomtest</groupId>
  <artifactId>quarkus-loom</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <properties>
    <compiler-plugin.version>3.10.1</compiler-plugin.version>
    <maven.compiler.source>19</maven.compiler.source>
    <maven.compiler.target>19</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <quarkus.platform.artifact-id>quarkus-bom</quarkus.platform.artifact-id>
    <quarkus.platform.group-id>io.quarkus.platform</quarkus.platform.group-id>
    <quarkus.platform.version>2.15.2.Final</quarkus.platform.version>
    <skipITs>true</skipITs>
    <surefire-plugin.version>3.0.0-M7</surefire-plugin.version>
  </properties>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>${quarkus.platform.group-id}</groupId>
        <artifactId>${quarkus.platform.artifact-id}</artifactId>
        <version>${quarkus.platform.version}</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-resteasy-reactive</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-resteasy-reactive-jackson</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-reactive-pg-client</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-netty-loom-adaptor</artifactId>
      <version>2.15.3.Final</version>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-arc</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-agroal</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-jdbc-postgresql</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-container-image-jib</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-junit5</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.rest-assured</groupId>
      <artifactId>rest-assured</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>${quarkus.platform.group-id}</groupId>
        <artifactId>quarkus-maven-plugin</artifactId>
        <version>${quarkus.platform.version}</version>
        <configuration>
          <source>${maven.compiler.source}</source>
          <target>${maven.compiler.target}</target>
          <!-- are these actually passed to javac??? -->
          <compilerArgs>
            <arg>--enable-preview</arg>
            <arg>-Dnet.bytebuddy.experimental</arg>
          </compilerArgs>
          <jvmArgs>--enable-preview -Djdk.tracePinnedThreads=full --add-opens java.base/java.lang=ALL-UNNAMED</jvmArgs>
        </configuration>
        <extensions>true</extensions>
        <executions>
          <execution>
            <goals>
              <goal>build</goal>
              <goal>generate-code</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

I'm not sure if the quarkus-maven-plugin passes the compiler arguments correctly and I'm not sure how to verify this. I see maven outputting [INFO] [io.quarkus.netty.loom.adaptor.NettyLoomAdaptorProcessor] Adapting Netty for Loom.... However, this is also the case when I comment out the compilerArgs. I also tried using maven-compiler-plugin and passing the two compiler arguments there directly like this:

<plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.8.1</version>
      <configuration>
        <compilerArgs>
          <arg>--enable-preview</arg>
          <arg>-Dnet.bytebuddy.experimental</arg>
        </compilerArgs>
      </configuration>
</plugin>

However, maven then fails with a fatal error because it doesn't recognize the option -Dnet.bytebuddy.experimental. By the way, is it at all necessary to pass --enable-preview to the compiler? In the Quarkus virtual thread guide this is only mentioned in the Concerning dev mode section.

Thank you!

EDIT:

Now reran on a proper setup with dedicated hosts for Quarkus app, loader and database. The hosts are located in the same data centre so the latency between all three is quite small. I'm still getting the best throughput and latencies for the blocking endpoint. Especially latency numbes are way worse with the reactive version. I also noticed much higher CPU usage on the database when the reactive version is ran.

quarkus-bot[bot] commented 1 year ago

/cc @geoand (jib)

geoand commented 1 year ago

cc @cescoffier @anavaro @franz1981

iqnev commented 1 year ago

Hi @ozeil I tried to build your project https://github.com/ozeil/quarkus-loom actually you use quay.io/quarkus/ubi-quarkus-mandrel-builder-image:22.3-java17 I am not sure but we don't have Mandrel with java 19 support yet image

I get the following build error: ./mvnw package -Pnative -Dquarkus.native.container-build=true

Fatal error: com.oracle.graal.pointsto.util.AnalysisError$ParsingError: Error encountered while parsing io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(int, int) 
Parsing context:
   at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java)
   at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
   at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
   at org.jboss.resteasy.reactive.server.vertx.ResteasyReactiveOutputStream.write(ResteasyReactiveOutputStream.java:217)
   at org.jboss.logmanager.handlers.TcpOutputStream.write(TcpOutputStream.java:185)
   at org.jboss.logmanager.handlers.TcpOutputStream.write(TcpOutputStream.java:176)
   at java.util.zip.GZIPOutputStream.writeHeader(GZIPOutputStream.java:191)
   at java.util.zip.GZIPOutputStream.<init>(GZIPOutputStream.java:97)
   at java.util.zip.GZIPOutputStream.<init>(GZIPOutputStream.java:134)

        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.util.AnalysisError.parsingError(AnalysisError.java:153)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlow.createFlowsGraph(MethodTypeFlow.java:104)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlow.ensureFlowsGraphCreated(MethodTypeFlow.java:83)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlow.getOrCreateMethodFlowsGraph(MethodTypeFlow.java:65)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.typestate.DefaultVirtualInvokeTypeFlow.onObservedUpdate(DefaultVirtualInvokeTypeFlow.java:109)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.TypeFlow.update(TypeFlow.java:562)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.PointsToAnalysis$1.run(PointsToAnalysis.java:488)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.util.CompletionExecutor.executeCommand(CompletionExecutor.java:193)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.util.CompletionExecutor.lambda$executeService$0(CompletionExecutor.java:177)
        at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1395)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
Caused by: org.graalvm.compiler.java.BytecodeParser$BytecodeParserError: com.oracle.graal.pointsto.constraints.UnresolvedElementException: Discovered unresolved method during parsing: io.netty.util.internal.PlatformDependent.newUnsafeDirectByteBuf(io.netty.buffer.ByteBufAllocator, int, int). This error is reported at image build time because class io.netty.buffer.PooledByteBufAllocator is registered for linking at image build time by command line
        at parsing io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.throwParserError(BytecodeParser.java:2518)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.throwParserError(SharedGraphBuilderPhase.java:110)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.iterateBytecodesForBlock(BytecodeParser.java:3393)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.handleBytecodeBlock(BytecodeParser.java:3345)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.processBlock(BytecodeParser.java:3190)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.build(BytecodeParser.java:1138)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.buildRootMethod(BytecodeParser.java:1030)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.GraphBuilderPhase$Instance.run(GraphBuilderPhase.java:97)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase.run(SharedGraphBuilderPhase.java:84)
        at jdk.internal.vm.compiler/org.graalvm.compiler.phases.Phase.run(Phase.java:49)
        at jdk.internal.vm.compiler/org.graalvm.compiler.phases.BasePhase.apply(BasePhase.java:446)
        at jdk.internal.vm.compiler/org.graalvm.compiler.phases.Phase.apply(Phase.java:42)
        at jdk.internal.vm.compiler/org.graalvm.compiler.phases.Phase.apply(Phase.java:38)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.AnalysisParsedGraph.parseBytecode(AnalysisParsedGraph.java:135)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.meta.AnalysisMethod.ensureGraphParsed(AnalysisMethod.java:685)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlowBuilder.parse(MethodTypeFlowBuilder.java:171)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlowBuilder.apply(MethodTypeFlowBuilder.java:349)
        at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.flow.MethodTypeFlow.createFlowsGraph(MethodTypeFlow.java:93)
        ... 13 more
Caused by: com.oracle.graal.pointsto.constraints.UnresolvedElementException: Discovered unresolved method during parsing: io.netty.util.internal.PlatformDependent.newUnsafeDirectByteBuf(io.netty.buffer.ByteBufAllocator, int, int). This error is reported at image build time because class io.netty.buffer.PooledByteBufAllocator is registered for linking at image build time by command line
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.reportUnresolvedElement(SharedGraphBuilderPhase.java:333)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.handleUnresolvedMethod(SharedGraphBuilderPhase.java:323)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.phases.SharedGraphBuilderPhase$SharedBytecodeParser.handleUnresolvedInvoke(SharedGraphBuilderPhase.java:279)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.genInvokeStatic(BytecodeParser.java:1676)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.genInvokeStatic(BytecodeParser.java:1651)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.processBytecode(BytecodeParser.java:5288)
        at jdk.internal.vm.compiler/org.graalvm.compiler.java.BytecodeParser.iterateBytecodesForBlock(BytecodeParser.java:3385)
        ... 28 more
ozeil commented 1 year ago

Thank you, but maybe I should habe mentioned that I never tried to build a native executable with this project.

vszakd commented 11 months ago

@ozeil did you manage to discover more?

franz1981 commented 11 months ago

@vszakd related the performance behaviour, is not something unexpected, although profiling won't be trivial but helpful to verify what we already know, see https://github.com/async-profiler/async-profiler/issues/779

And https://github.com/FasterXML/jackson-core/issues/919 which still bite us, but is moving (@mariofusco is working on it)

Moreover, at the time of this issue, I have filled https://github.com/quarkusio/quarkus/issues/33102 which has been solved.

vszakd commented 10 months ago

Thanks, I guess there is still some road ahead for real-world VT exploitability in Quarkus, but the progress is exciting.

cescoffier commented 10 months ago

Closing - out of date

You do not need the adaptor anymore.