Open rbygrave opened 2 years ago
The following is a JMH benchmark writing to json comparing:
PropertyStrTenTest.toJson_jsonb_jackson thrpt 1714.593 ops/ms
PropertyStrTenTest.toJson_jsonb_jakarta thrpt 992.326 ops/ms
PropertyStrTenTest.toJson_objectMapper thrpt 1602.596 ops/ms
I believe this benchmark is showing me that there are 2 performance techniques that jackson core makes available that are not available via json-p api. These techniques are used by ObjectMapper and also by avaje-jsonb (my library) when it uses jackson core.
The first technique is this issue, that the binding library (jackson objectmapper and avaje-jsonb) can provide a Jackson SerialisedString as the key instead of a plain String.
Adding this could help move the through put of PropertyStrTenTest.toJson_jsonb_jakarta
(parrson) towards PropertyStrTenTest.toJson_jsonb_jackson
(jackson-core). The comparison with PropertyStrTenTest.toJson_objectMapper
helps confirm where this sits relative to Jackson ObjectMapper.
The results with gc profiling:
PropertyStrTenTest.toJson_jsonb_jackson thrpt 1714.593 ops/ms
PropertyStrTenTest.toJson_jsonb_jackson:·gc.alloc.rate thrpt 1071.445 MB/sec
PropertyStrTenTest.toJson_jsonb_jackson:·gc.alloc.rate.norm thrpt 688.054 B/op
PropertyStrTenTest.toJson_jsonb_jackson:·gc.churn.G1_Eden_Space thrpt 1071.163 MB/sec
PropertyStrTenTest.toJson_jsonb_jackson:·gc.churn.G1_Eden_Space.norm thrpt 687.873 B/op
PropertyStrTenTest.toJson_jsonb_jackson:·gc.churn.G1_Survivor_Space thrpt 0.002 MB/sec
PropertyStrTenTest.toJson_jsonb_jackson:·gc.churn.G1_Survivor_Space.norm thrpt 0.001 B/op
PropertyStrTenTest.toJson_jsonb_jackson:·gc.count thrpt 38.000 counts
PropertyStrTenTest.toJson_jsonb_jackson:·gc.time thrpt 19.000 ms
PropertyStrTenTest.toJson_jsonb_jakarta thrpt 992.326 ops/ms
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.alloc.rate thrpt 1391.629 MB/sec
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.alloc.rate.norm thrpt 1544.121 B/op
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.churn.G1_Eden_Space thrpt 1409.426 MB/sec
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.churn.G1_Eden_Space.norm thrpt 1563.869 B/op
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.churn.G1_Survivor_Space thrpt 0.004 MB/sec
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.churn.G1_Survivor_Space.norm thrpt 0.004 B/op
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.count thrpt 50.000 counts
PropertyStrTenTest.toJson_jsonb_jakarta:·gc.time thrpt 34.000 ms
PropertyStrTenTest.toJson_objectMapper thrpt 1602.596 ops/ms
PropertyStrTenTest.toJson_objectMapper:·gc.alloc.rate thrpt 1013.111 MB/sec
PropertyStrTenTest.toJson_objectMapper:·gc.alloc.rate.norm thrpt 696.055 B/op
PropertyStrTenTest.toJson_objectMapper:·gc.churn.G1_Eden_Space thrpt 1014.789 MB/sec
PropertyStrTenTest.toJson_objectMapper:·gc.churn.G1_Eden_Space.norm thrpt 697.208 B/op
PropertyStrTenTest.toJson_objectMapper:·gc.churn.G1_Survivor_Space thrpt 0.002 MB/sec
PropertyStrTenTest.toJson_objectMapper:·gc.churn.G1_Survivor_Space.norm thrpt 0.001 B/op
PropertyStrTenTest.toJson_objectMapper:·gc.count thrpt 36.000 counts
PropertyStrTenTest.toJson_objectMapper:·gc.time thrpt 18.000 ms
Given the 2 models:
package org.example.jmh.model;
import io.avaje.jsonb.Json;
@Json
public record NarrowNamesRecord(
String a,
String b,
String c,
String d,
String e
) {
}
package org.example.jmh.model;
import io.avaje.jsonb.Json;
@Json
public record WideNamesRecord(
String firstNameProperty1,
String lastNameProperty2,
String anotherSimilarProperty3,
String moreOrLessProperty4,
String lastButNotLeastProperty5
) {
}
The following benchmark compares the relative throughput when comparing models with relatively narrow keys vs wider keys. We expect the benefit from the use of SerializedString to be more pronounced when the keys are relatively wider as the generator is able to skip more encoding of keys.
Benchmark Mode Cnt Score Error Units
RecordBasicTest.toJson_narrowNames_jsonb_jackson thrpt 2327.835 ops/ms
RecordBasicTest.toJson_narrowNames_jsonb_jakarta thrpt 1425.564 ops/ms
RecordBasicTest.toJson_narrowNames_objectMapper thrpt 2268.823 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jackson thrpt 3585.764 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta thrpt 1749.537 ops/ms
RecordBasicTest.toJson_wideNames_objectMapper thrpt 3069.284 ops/ms
Note the test data for this is:
testDataWideNames = new WideNamesRecord("1", "2", "3", "4", "5");
testDataNarrowNames = new NarrowNamesRecord("property1property1property1property1", "property2property2property2property2", "property3property3property3property3", "property4property4property4property4", "property5property5property5property5");
With a local prototype I can see a 17% through put improvement with the "prepared keys" in the wide key case. We see that in the 1800 to 2112 values below.
RecordBasicTest.toJson_wideNames_jsonb_jackson thrpt 3694.105 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_normalKey thrpt 1800.453 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_preparedKey thrpt 2112.591 ops/ms
RecordBasicTest.toJson_wideNames_objectMapper thrpt 3352.952 ops/ms
With the narrow key case we expect very little difference, we want to make sure we don't get degraded performance and we see that:
RecordBasicTest.toJson_narrowNames_jsonb_jackson thrpt 2420.908 ops/ms
RecordBasicTest.toJson_narrowNames_jsonb_jakarta_normalKey thrpt 1443.192 ops/ms
RecordBasicTest.toJson_narrowNames_jsonb_jakarta_preparedKey thrpt 1447.498 ops/ms
RecordBasicTest.toJson_narrowNames_objectMapper thrpt 2263.234 ops/ms
This is still short of what avaje-jsonb is getting via using jackson-core, and short of jackson objectmapper.
With a local prototype, getting closer to the performance of jackson core. Note this has some additional performance adjustments to parsson internals as well (like removing Scope.IN_FIELD to reduce GC pressure).
Benchmark Mode Cnt Score Error Units
RecordBasicTest.toJson_wideNames_jsonb_jackson thrpt 3568.656 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_normalKey thrpt 2334.342 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_preparedKey thrpt 2884.539 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_preparedKey2 thrpt 3274.152 ops/ms
RecordBasicTest.toJson_wideNames_objectMapper thrpt 3551.022 ops/ms
hm, I'm a bit confused as where does this belong to. Based on the prefix [api] I would think this belongs here but as you are referring to parsson, that would belong to the implementation. Does this need some change in jsonp-api? If not, I would move this back to parsson project...
Does this need some change in jsonp-api?
Yes it does and yes I think it is fine that the issue is here. Ultimately as I see it we will need to change parsson / the reference implementation as well but it's all good having this here.
I have made this a bit confusing. This is pretty much a performance orientated feature. As such I am looking to prove the concept somewhat before I push a PR for the api change (hence the bench marking etc). So yes, I'm making implementation changes at the moment to parsson so that I can:
What we probably could do at some point is create a related issue in parsson but I am happy to progress my work a little more first.
Ok, I think I've done enough initial bench marking to push for this API change.
With respect to this API change we are looking at comparing the _jakartanormalKey and _jakartapreparedKey throughput. We see better throughput can be achieved by using keys that are "prepared" (already escaped and encoded).
These benchmarks include a baseline "jsonb_jackson" which is avaje-jsonb using Jackson core 12.13.0, and a baseline "objectMapper" which is Jackson ObjectMapper.
Note: Using a modified version of parsson for the jakarta implementation. Changes include using ThreadLocal based buffer recycling
4 threads,
Benchmark Mode Cnt Score Error Units
RecordBasicTest.toJson_wideNames_jsonb_jackson thrpt 2 12847.776 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_normalKey thrpt 2 12063.079 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_preparedKey thrpt 2 18269.082 ops/ms
RecordBasicTest.toJson_wideNames_objectMapper thrpt 2 11494.361 ops/ms
2 threads
Benchmark Mode Cnt Score Error Units
RecordBasicTest.toJson_wideNames_jsonb_jackson thrpt 4 6683.975 ± 3575.546 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_normalKey thrpt 4 6882.481 ± 827.704 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_preparedKey thrpt 4 10105.276 ± 1312.702 ops/ms
RecordBasicTest.toJson_wideNames_objectMapper thrpt 4 6577.431 ± 784.614 ops/ms
1 thread
Benchmark Mode Cnt Score Error Units
RecordBasicTest.toJson_wideNames_jsonb_jackson thrpt 4 3608.207 ± 525.817 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_normalKey thrpt 4 3471.075 ± 396.187 ops/ms
RecordBasicTest.toJson_wideNames_jsonb_jakarta_preparedKey thrpt 4 5248.490 ± 682.141 ops/ms
RecordBasicTest.toJson_wideNames_objectMapper thrpt 4 3261.331 ± 272.124 ops/ms
Related PR: https://github.com/eclipse-ee4j/jsonp/issues/344 Related PR in Parsson - https://github.com/eclipse-ee4j/parsson/pull/47
Summary
Create a JsonGenerator.Key
Use a JsonGenerator.Key
I'd like to discuss adding the ability for JsonGenerator to support "pre-encoded" keys as a performance optimisation along the lines of Jacksons SerializedString .
That is, Jackson JsonGenerator has writeFieldName(SerializableString key) ... and I'd love to see that added to the jakarta json api.
SerializedString is the implementation of SerializableString which has pre-encoded the String value into bytes.
This allows the underlying generators to effectively do array copy of the already encoded bytes. https://github.com/FasterXML/jackson-core/blob/2.14/src/main/java/com/fasterxml/jackson/core/json/UTF8JsonGenerator.java#L282
Background:
I have been doing JMH performance benchmarks around this area and with jackson core, the use of SerializableString for the "field names" aka "keys" is significant - how significant depends on the relative width of the keys vs values. For relatively wide keys and narrow values it is really significant.
More background, I'm working on avaje-jsonb which uses APT source code generation to do json binding. avaje-jsonb currently abstracts over jackson code and now jakarta json (so I'm doing direct comparison between parsson and jackson-core in this manor). Currently this is an important performance feature that I see with the jackson core adapter.