alibaba / fastjson2

🚄 FASTJSON2 is a Java JSON library with excellent performance.
Apache License 2.0

[FEATURE] ThreadLocal handling with Virtual Threads (JDK21) #2319

Open zekronium opened 3 months ago

zekronium commented 3 months ago

Please describe your requirement or suggested improvement

The project relies heavily on ThreadLocal objects, which can cause explosive memory growth with virtual threads. ThreadLocal caching brings little benefit with virtual threads, since it's encouraged to create many short-lived threads, and each thread's ThreadLocal values are discarded when the thread ends. Also, millions of virtual threads can exist at once, each with its own ThreadLocal copies, which can become a problem too.
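To illustrate the concern, here is a minimal sketch (not fastjson2 code; the class and method names are hypothetical). It counts how many buffers a ThreadLocal cache allocates when every task runs on its own short-lived thread. It uses platform threads so it runs on older JDKs; on JDK 21+ one could pass `Thread.ofVirtual().factory()` instead, and the count should be the same.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of fastjson2.
final class ThreadLocalPerThreadDemo {
    /**
     * Starts one short-lived thread per task; each thread touches a
     * ThreadLocal-cached buffer once. Returns how many buffers were allocated.
     */
    static int countBufferAllocations(int tasks, ThreadFactory factory) {
        AtomicInteger allocations = new AtomicInteger();
        ThreadLocal<byte[]> buffer = ThreadLocal.withInitial(() -> {
            allocations.incrementAndGet();   // one allocation per thread that calls get()
            return new byte[8192];
        });

        Thread[] threads = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            threads[i] = factory.newThread(() -> buffer.get()[0] = 1);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return allocations.get();
    }

    public static void main(String[] args) {
        // Assumption: on JDK 21+, Thread.ofVirtual().factory() can replace Thread::new.
        System.out.println(countBufferAllocations(100, Thread::new)); // prints 100
    }
}
```

Because no thread is ever reused, the cache never gets a hit: 100 tasks mean 100 allocations, and nothing is recycled.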

Please describe your proposed implementation

Use a different mechanism for recycling such objects. Jackson has already added a virtual-thread-friendly recycler.
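One possible shape for such a mechanism is a small shared pool that is not tied to thread identity. The sketch below is only an illustration under that assumption; it is neither Jackson's recycler nor anything that exists in fastjson2, and `BoundedPool` is a hypothetical name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.function.Supplier;

// Hypothetical bounded object pool shared by all threads (platform or virtual).
final class BoundedPool<T> {
    private final ArrayBlockingQueue<T> idle;
    private final Supplier<T> factory;

    BoundedPool(int capacity, Supplier<T> factory) {
        this.idle = new ArrayBlockingQueue<>(capacity);
        this.factory = factory;
    }

    /** Returns a pooled instance if one is idle, otherwise creates a new one. */
    T acquire() {
        T pooled = idle.poll();              // non-blocking; null when the pool is empty
        return pooled != null ? pooled : factory.get();
    }

    /** Hands an instance back; it is silently dropped if the pool is already full. */
    void release(T instance) {
        idle.offer(instance);                // non-blocking; returns false when full
    }

    /** Number of instances currently held for reuse. */
    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        BoundedPool<byte[]> pool = new BoundedPool<>(2, () -> new byte[8192]);
        byte[] first = pool.acquire();
        pool.release(first);
        System.out.println(pool.acquire() == first); // prints true
    }
}
```

Retained memory is bounded by the pool capacity rather than by the number of threads, at the cost of some contention on the shared queue.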

wenshao commented 3 months ago

fastjson2 does not rely heavily on ThreadLocal. If you find that it does, please mention it specifically. The performance of fastjson2 under JDK 21 is also very good, and there is no problem.

juliojgd commented 1 month ago

@wenshao Please reopen, as it seems it DOES rely on ThreadLocals:

https://github.com/alibaba/fastjson2/blob/bd8583e923df259cd2905dc9a90022da813cbd2d/core/src/main/java/com/alibaba/fastjson2/JSONFactory.java#L379-L383

And with high concurrency we observed a clear degradation in performance (worse than Jackson in those cases).

wenshao commented 1 month ago

These ThreadLocals will only be used when a JavaBean is serialized or deserialized for the first time and will not affect performance.

https://github.com/alibaba/fastjson2/blob/main/docs/benchmark/benchmark_2.0.50.md

The benchmark for each release is run with 16 threads, and the performance is much better than Jackson. If you find any bad scenarios, please help provide them.

juliojgd commented 1 month ago

We were testing with JMH (with the customizations needed to use FastJson2), and our figures at low concurrency showed that FastJson2 performs better in those cases. An example:

JMH test configuration:

return optBuilder
        .forks(1)
        .threads(3)
        .warmupIterations(1)
        .measurementIterations(3)
        .measurementTime(TimeValue.seconds(2))
        .mode(Mode.Throughput)
        .addProfiler(GCProfiler.class);

[image: JMH results with 3 threads]

BUT, when high concurrency is used (the only change is setting the thread count to 100):

return optBuilder
        .forks(1)
        .threads(100)
        .warmupIterations(1)
        .measurementIterations(3)
        .measurementTime(TimeValue.seconds(2))
        .mode(Mode.Throughput)
        .addProfiler(GCProfiler.class);

[image: JMH results with 100 threads]

Clearly, FastJson2's performance degrades to below Jackson's. The degradation is worst with small payloads.

EDIT: The test was run with platform (classic) threads, but the same (if not worse) performance degradation under high concurrency is also seen when using Java 21 virtual threads.

EDIT 2: I removed the reference to Spring Boot as we managed to create a JMH test without using SB, same results.

Regards,

wenshao commented 1 month ago

I used 100 threads to run benchmark EishayWriteUTF8Bytes, and fastjson2 also has better performance than jackson.

https://github.com/alibaba/fastjson2/blob/main/benchmark/src/main/java/com/alibaba/fastjson2/benchmark/eishay/EishayWriteUTF8Bytes.java

    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder()
                .include(EishayWriteUTF8Bytes.class.getName())
                .exclude(EishayWriteUTF8BytesTree.class.getName())
                .mode(Mode.Throughput)
                .timeUnit(TimeUnit.MILLISECONDS)
                .warmupIterations(3)
                .forks(1)
                .threads(100)
                .build();
        new Runner(options).run();
    }

juliojgd commented 1 month ago

Hello again @wenshao, it seems your test exercises writing (JSON.toJSONBytes(...), i.e. serialization), while our test exercises parsing (JSON.parseObject(...), i.e. deserialization). I will try to run both cases and report back the results.

wenshao commented 1 month ago

I ran the deserialization performance test with 100 threads, and the performance of fastjson2 is still better than jackson. Can you show your test code?

Code

https://github.com/alibaba/fastjson2/blob/main/benchmark/src/main/java/com/alibaba/fastjson2/benchmark/eishay/EishayParseUTF8Bytes.java

    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder()
                .include(EishayParseUTF8Bytes.class.getName())
                .exclude(EishayParseUTF8BytesPretty.class.getName())
                .mode(Mode.Throughput)
                .timeUnit(TimeUnit.MILLISECONDS)
                .warmupIterations(3)
                .forks(1)
                .threads(100)
                .build();
        new Runner(options).run();
    }

Result

Benchmark                        Mode  Cnt      Score     Error   Units
EishayParseUTF8Bytes.fastjson2  thrpt    5  16319.231 ±  93.431  ops/ms
EishayParseUTF8Bytes.jackson    thrpt    5   5865.542 ± 107.066  ops/ms

juliojgd commented 1 month ago

Hello @wenshao, I've added writing tests and re-run our existing tests with 1, 16 and 100 threads, both platform and virtual, and the results are better with FastJson2, so you were right. FastJson2 is faster in every test case.

We are wondering why the first runs returned such different results.

Sorry for the confusion.

zekronium commented 1 month ago

It still seems like low-hanging fruit that is worth optimizing. Jackson has implemented a way to handle this with virtual threads. It is possible to construct a scenario where deserializing many different objects on short-lived virtual threads causes explosive memory growth.