Closed gsmet closed 3 weeks ago
/cc @EricWittmann (openapi), @Ladicek (fault-tolerance,smallrye), @MikeEdgar (openapi), @brunobat (tracing), @cescoffier (mutiny,reactive-messaging), @ebullient (metrics), @jmartisk (graphql,health,metrics,smallrye), @jponge (mutiny), @ozangunalp (reactive-messaging), @phillip-kruger (graphql,openapi,smallrye), @radcortez (smallrye,tracing), @sberyozkin (jwt), @xstefank (health)
Hmmm, the bot shouldn't have mentioned so many people, sorry about that. We probably need to tweak this area/smallrye
label.
A quick workaround would be to initialize one static JsonProvider
somewhere (or at least to have only one for the client and one for the server if we want to support multiple deployments and tweaking the JsonProvider
but that means having the means to propagate it wherever it's needed) in the code and ban the use of Json.
.
Now, no idea if JSON-B uses Json
so that might just be a band aid.
Regarding the WildFly/EAP case and potential multiple providers: I generally gave up on supporting multiple deployments of SmallRye GraphQL in a single WF instance. It generally works, but there are lots of potential problems where the apps are not fully isolated. In the past, I made some efforts to make it work, but it was unwieldy, so I kinda gave up. This is in line with the official support policies of the EAP XP product (see https://access.redhat.com/articles/2026253) where multiple deployments are explicitly unsupported. So I guess we could start with the quick fix of caching the JSON provider and then look into bringing in support for Jackson...
@jmartisk if we are following this path, I wonder if we should make it static in Health too: https://github.com/smallrye/smallrye-health/blob/main/implementation/src/main/java/io/smallrye/health/SmallRyeHealthReporter.java#L125 .
AFAICS SmallRyeHealthReporter
is instantiated twice in Quarkus due to our CDI infra so we are initializing two JsonProvider
.
Probably worth discussing it with @xstefank and @radcortez .
Possibly, but given that it's an app-scoped CDI bean, it doesn't seem necessary (the problem would be obtaining the provider on each request), and having static fields in CDI beans is... weird
So I kinda agree but it's still one ServiceLoader call we could avoid. I will discuss it with Martin Kouba hopefully soon.
I was thinking about providing Jackson API in SR Health, but we need to be aligned with JSON-P for spec. But I personally don't see what's so wrong with a static field for this since it's really just utility for marshaling. @mkouba mentioning you so you chime in.
AFAICS
SmallRyeHealthReporter
is instantiated twice in Quarkus due to our CDI infra so we are initializing twoJsonProvider
.
It's instantiated twice because it's an @ApplicationScoped
bean so one instance is the client proxy which is a subclass of SmallRyeHealthReporter
. And the same applies to all those collections and maps, such as SmallRyeHealthReporter#livenessUnis
. We could initialize the state in the @PostConstruct
callback. For JsonProvider
I think it should be ok to use a static final field instead. However, we need to make sure the constant is initialized at the right time, i.e. when the correct provider is available.
Yes, the problem is that in the SR world, we have to support JSON-B / JSON-P because the MP platform requires it.
We could provide our own version of the JSON-B jar, with a different implementation. JBoss did that before, and Apache still does. It shouldn't require changes to the method signatures, so it should pass those. I'm not sure if there is a TCK test that checks if you get a different version of the provider on each call, but I find that unlikely.
So I'm not sure JSON-B itself is affected by this issue. I would expect them to have avoided it given it's well known on their side.
We need to be extremely cautious about using Json
in our code.
Also, while we know you need to support JSON-P/JSON-B, what we would like to see is an SPI to be able to push Jackson there and avoid having two JSON serialization implementation around in most cases.
Now, what I don't know is if it's feasible or if you have JSON-P/-B publicly exposed. We could probably be OK with JSON-P but it would really be nice if we could avoid having JSON-B in the classpath when using Jackson.
Now is it a priority? Is it too complex. I don't know :).
Also, while we know you need to support JSON-P/JSON-B, what we would like to see is an SPI to be able to push Jackson there and avoid having two JSON serialization implementation around in most cases.
Agree. Unfortunately, that also means that we would need to create yet another JSON Provider entry point so you could delegate to JSON-P or Jackson. It's not difficult, but we have to remember to use that one instead, and most likely, users would still call the original APIs.
The best option, at least for Quarkus, is to provide our own API jar with performance improvements since users need to reference the extension that they want to use directly, and we can easily control which API dependency to add. In traditional Jakarta Apps servers, it's trickier because users add the Jakarta API dependency from the official project.
Unfortunately, that also means that we would need to create yet another JSON Provider entry point so you could delegate to JSON-P or Jackson.
But are the users of SmallRye GraphQL/Health concerned by the JSON provider? That's why I was asking if it was leaking in the APIs?
I don't believe we have leaks in SR itself, but the Jakarta JSON APIs may be included transitively by the MP APIs. We need to check each one individually.
I wouldn't be too worried about having the jars around. There's a difference between having a jar around and initializing the library. We could improve on the latter and see how we can handle the former a bit later.
Anyway, for now, I would focus on making sure we don't instantiate too many providers as it's an easy win. The rest should be prioritized at an higher level probably.
I wouldn't be too worried about having the jars around.
I was thinking about the user calling that API with terrible performance.
Ah yes, that's their responsibility, really. I posted a message in the original issue there explaining how bad it really was with numbers. Because there's bad and there's really bad :).
we might want to have an SPI to push Jackson there. I discussed it a few times with @jmartisk already but it wasn't considered top priority
I am personally in favor of this being done sooner rather than later and willing to help make it happen.
I am personally in favor of this being done sooner rather than later and willing to help make it happen.
That's the spirit! You're certainly welcome to do it :)
Maybe this is one of the reasons why YASSON is so slow. It would be wrong to blame JSON-B for the bad performance of YASSON. JSON-B itself doesn't rely on the static methods in jakarta.json.Json
; you can pass an instance of a jakarta.json.spi.JsonProvider
into your jakarta.json.bind.JsonbBuilder
; otherwise it looks it up... once!
I'd like to hint at the QSON project by @patriot1burke and this discussion for this quarkus.io blog post by @mariofusco. They make Jackson even faster by moving the reflection part into the build process... which is really the Quarkus mindset. A big applause for that! QSON seems to not be very active ATM, but it could just require a little push (I love the name!)
I've started to experiment with a similar idea, but as an annotation processor and based on JSON-B APIs. (I haven't published it, yet, because it depends on a SNAPSHOT version of an annotation processing lib I develop in parallel, so it doesn't build OOTB.)
So, we actually always have three parts:
If I'm not mistaken, I think that way back when JSON-B was still young, there even was an initiative to make Jackson support the JSON-B APIs. But I haven't heard anything about that ever since. I have no idea about how much effort that would be; the Jackson code I've yet seen looks.... hmmm... highly optimized.
I would really love it, if application developers could stop worrying about such things. They should just pick an API and always get the best performance.
@t1 thanks for the input!
I would like to clarify a couple more reasons why myself (and I assume @gsmet) as much in favor of using Jackson everywhere possible:
My previous comment may have sounded too negative; actually I have nothing against Jackson! If I understand the article correctly, Jackson already allows to nicely separate the scanning from the actual (de)serializing. It would be nice, if it would also separate its API from the implementation. Then we could relatively simply add support for the JSON-B APIs for everything MP... or as an alternative, if a Quarkus developer chooses so. Under the hood, it could always be Jackson.
I think we can close this one now.
The low-level static method hosted in the Jakarta JSON-P
Json
object are very inefficient: they are resolving aJsonProvider.provider()
for each call which in turn calls theServiceLoader
(among other things...) and allocate a newJsonProvider
.See:
Meaning that when you are calling
Json.createValue("string")
to allocate a simple String value, you are actually making a call to the service loader and instantiating a newJsonProvider
.I stumbled upon that independently this week-end and after experimenting with it and writing a quick patch, I was like "it's crazy nobody complained about it" so I went to the JSON-P website and stumbled upon:
Meaning that I don't expect it to be fixed any time soon.
To give an idea of how things go, I did a simple JMH benchmark.
Json.createValue("test")
i.e. what people (including us) use:Using a static provider with
JSON_PROVIDER.createValue("test")
:That's 7200+ times faster. The
createValue()
methods are the ones for which it's the most inefficient as they should be extremely lightweight as called quite often.While we are mostly using Jackson in Quarkus itself and recommending Jackson, we have several components using JSON-P and JSON-B in the SmallRye world, for instance SmallRye GraphQL and SmallRye Health. And we have people using JSON-P/JSON-B out there because they want to stay in the MicroProfile or Jakarta world.
I know SmallRye GraphQL is affected by the inefficiencies as I have seen some direct calls to the
Json
methods in some runtime code (for instance here: https://github.com/smallrye/smallrye-graphql/blob/main/client/implementation/src/main/java/io/smallrye/graphql/client/impl/RequestImpl.java#L75). From what I can see SmallRye Health is not affected as it creates a uniqueJsonProvider
.Note that I haven't done any benchmarks in SmallRye GraphQL, I have just verified that we were instantiating too many
JsonProvider
s.As for the SmallRye components, I think we should discuss this subject in the context of Quarkus as the solution we want to put in place for Quarkus might be different from the ones we want to use in the general case (for instance in WildFly).
Json
shortcut under the hood, in which case it might make things a bit harder to handle.