openzipkin / brave

Java distributed tracing implementation compatible with Zipkin backend services.
Apache License 2.0

VirtualThread pinning with RealSpan and zipkin2.reporter #1426

Closed keydon closed 5 months ago

keydon commented 6 months ago

Describe the Bug

I am using Spring Boot 3 with tracing through Brave & Zipkin. Everything was fine until I tried out virtual threads with JDK 21.

Steps to Reproduce

Execute a Spring Boot 3 app with Brave/Zipkin tracing configured, and set the JVM flag -Djdk.tracePinnedThreads=full to detect problematic pinning.
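
For reference, a minimal reproduction sketch under assumptions (Brave's Tracing API with zipkin-reporter-brave and the OkHttp sender on the classpath; the class name PinningRepro is made up). Pinning is only reported when a virtual thread actually blocks while holding a monitor, so it may take load/contention to appear:

    import brave.Tracing;
    import zipkin2.reporter.brave.AsyncZipkinSpanHandler;
    import zipkin2.reporter.okhttp3.OkHttpSender;

    public class PinningRepro {
      public static void main(String[] args) throws InterruptedException {
        OkHttpSender sender = OkHttpSender.create("http://localhost:9411/api/v2/spans");
        AsyncZipkinSpanHandler zipkin = AsyncZipkinSpanHandler.create(sender);
        Tracing tracing = Tracing.newBuilder().addSpanHandler(zipkin).build();

        // finish() enters a synchronized block in brave; the ReentrantLock in
        // zipkin2.reporter.ByteBoundedQueue.offer can then park the virtual
        // thread while it is pinned to its carrier.
        Thread t = Thread.ofVirtual().start(
            () -> tracing.tracer().nextSpan().name("test").start().finish());
        t.join();

        tracing.close();
        zipkin.close();
        sender.close();
      }
    }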

After a while, you get the following stack trace:

Thread[#239,ForkJoinPool-1-worker-17,5,CarrierThreads]
    java.base/java.lang.VirtualThread$VThreadContinuation.onPinned(Unknown Source)
    java.base/jdk.internal.vm.Continuation.onPinned0(Unknown Source)
    java.base/java.lang.VirtualThread.park(Unknown Source)
    java.base/java.lang.System$2.parkVirtualThread(Unknown Source)
    java.base/jdk.internal.misc.VirtualThreads.park(Unknown Source)
    java.base/java.util.concurrent.locks.LockSupport.park(Unknown Source)
    java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
    java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
    java.base/java.util.concurrent.locks.ReentrantLock$Sync.lock(Unknown Source)
    java.base/java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
    zipkin2.reporter.ByteBoundedQueue.offer(ByteBoundedQueue.java:51)
    zipkin2.reporter.AsyncReporter$BoundedAsyncReporter.report(AsyncReporter.java:259)
    zipkin2.reporter.brave.ConvertingSpanReporter.report(ConvertingSpanReporter.java:44)
    zipkin2.reporter.brave.ConvertingSpanReporter.report(ConvertingSpanReporter.java:29)
    zipkin2.reporter.brave.ZipkinSpanHandler.end(ZipkinSpanHandler.java:148)
    brave.internal.handler.NoopAwareSpanHandler$CompositeSpanHandler.end(NoopAwareSpanHandler.java:107)
    brave.internal.handler.NoopAwareSpanHandler.end(NoopAwareSpanHandler.java:57)
    brave.internal.recorder.PendingSpans.finish(PendingSpans.java:122)
    brave.RealSpan.finish(RealSpan.java:143) <== monitors:1
    brave.RealSpan.finish(RealSpan.java:138)
    io.micrometer.tracing.brave.bridge.BraveSpan.end(BraveSpan.java:117)
    io.micrometer.tracing.handler.TracingObservationHandler.endSpan(TracingObservationHandler.java:216)
    io.micrometer.tracing.handler.PropagatingReceiverTracingObservationHandler.onStop(PropagatingReceiverTracingObservationHandler.java:101)
    io.micrometer.tracing.handler.PropagatingReceiverTracingObservationHandler.onStop(PropagatingReceiverTracingObservationHandler.java:35)
    io.micrometer.observation.ObservationHandler$FirstMatchingCompositeObservationHandler.onStop(ObservationHandler.java:197)
    io.micrometer.observation.SimpleObservation.lambda$notifyOnObservationStopped$0(SimpleObservation.java:268)
    java.base/java.util.ArrayDeque$DescendingIterator.forEachRemaining(Unknown Source)
    io.micrometer.observation.SimpleObservation.notifyOnObservationStopped(SimpleObservation.java:268)
    io.micrometer.observation.SimpleObservation.stop(SimpleObservation.java:188)
    org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:123)
    org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
    org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:167)
    org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90)
    org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482)
    org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115)
    org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93)
    org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
    org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:735)
    org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:344)
    org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:391)
    org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
    org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:896)
    org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1744)
    org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
    java.base/java.lang.VirtualThread.run(Unknown Source)

The finish() method of RealSpan is marked as the culprit. I am no expert in multithreading and locking, but it seems that the synchronized keyword is the issue here and should be replaced with a virtual-thread-friendlier way of locking, such as ReentrantLock.

Expected Behaviour

No thread pinning should occur.

codefromthecrypt commented 6 months ago

So, you are suggesting that the synchronized keyword in libraries is basically a problem in general now? I would be surprised if most code worked, if that were the case. Do you have any documentation that can help discuss this? We also need to consider old versions of Java (Brave supports Java 6).

reta commented 6 months ago

@codefromthecrypt just to chime in, the official recommendation from Oracle [2], echoed by the Spring team [1], is to avoid synchronized in virtual-thread flows and switch to ReentrantLock, which (to our luck) has been supported since JDK 1.5. I hope this helps.

PS: ready to help with the change

[1] https://spring.io/blog/2022/10/11/embracing-virtual-threads#mitigating-limitations
[2] https://docs.oracle.com/en/java/javase/21/core/virtual-threads.html#GUID-704A716D-0662-4BC7-8C7F-66EE74B1EDAD
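
For illustration, a generic sketch of that transformation (hypothetical Recorder class, not Brave's actual code): a virtual thread parked on a ReentrantLock can unmount from its carrier, whereas one parked inside a synchronized block stays pinned.

    import java.util.concurrent.locks.ReentrantLock;

    class Recorder {
      private final ReentrantLock lock = new ReentrantLock(); // available since JDK 1.5

      // before: public synchronized void finish(long timestamp) { ... }
      public void finish(long timestamp) {
        lock.lock();
        try {
          // critical section unchanged; a virtual thread blocked on lock() above
          // unmounts from its carrier instead of pinning it
        } finally {
          lock.unlock();
        }
      }
    }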

codefromthecrypt commented 6 months ago

@reta ok, thanks for the details and the offer to help (saves me from asking you ;)). There are a few places here and in the reporter that use synchronized. If you don't mind, check the benchmarks before and after (where they exist and are important).

reta commented 6 months ago

Sure, will do, thanks @codefromthecrypt

codefromthecrypt commented 6 months ago

Here's a TL;DR from the research I've done (and p.s., I'm done ;) ).

This is a nice overview of the situation: https://todd.ginsberg.com/post/java/virtual-thread-pinning/. Especially helpful is this quote:

An application that exhibits virtual thread pinning isn’t bad or incorrect, but it could be suboptimal. 

So, yes, there can be pinning. Someone can decide this is critical and choose to write another handler. IMHO, "suboptimal" is subjective, and this is likely not the most suboptimal code in a typical Spring Boot stack. Given that context, we should be careful about how we proceed and not introduce a cure worse than the disease.

An approach that precisely breaks up the critical section is probably best. An approach that just tries to get rid of synchronized everywhere is probably not. More notes below.


Now, there is a lot of suboptimal code in a Spring Boot stack, especially before Micrometer 1.3, because for some reason when things moved from Sleuth to the new system a couple of things were missed.

One is that there is no more automatic use of Brave's samplers, like the HTTP one (obviously you don't want to report 100% of zero-value things). This hasn't been fixed, and I don't really expect it to be.

The other is that under the SpanHandler.end function, the span was unnecessarily copied into another format (zipkin core). This caused more overhead inside that synchronized block. It could have been addressed for years, but wasn't until I did it recently. The fix should be out in 1.3, which you can try in our example now: https://github.com/openzipkin/brave-example/tree/master/webflux6-micrometer

So, what's left? Well, my suggestion isn't to overreact; instead, look at the issue holistically. Folks this narrowly interested in performance may be unable to use most of the Spring Boot/Micrometer setup at all. For example, if you measure the frontend examples vs. Armeria directly, there are millisecond-grade differences. Yes, this is about concurrency, but whatever is done should be measured, and I bet other problems will rank way ahead of this one, such as stacks hundreds of frames deep due to reactive code.

Instead of overreacting, I suggest a very narrow approach. If the goal is to reduce pressure on the lock, there is an option that actually causes more overhead but likely gets rid of this message. I say an "option" because, for many, I think this is a cure worse than the disease.

So right now there is a lock on finish; let's simplify it to this:

      synchronized (mutableSpan) {
          // can take time in AsyncReporter due to sizing to avoid OOM on backlog
          handler.end(mutableSpan);
      }

So, if we assume this lock being held somewhere in the microsecond range is a problem, the problem is inside AsyncZipkinSpanHandler, here:

https://github.com/openzipkin/zipkin-reporter-java/blob/master/brave/src/main/java/zipkin2/reporter/brave/AsyncZipkinSpanHandler.java#L212

which does some work to prevent OOM, specifically encoder.sizeInBytes(next): https://github.com/openzipkin/zipkin-reporter-java/blob/master/core/src/main/java/zipkin2/reporter/internal/AsyncReporter.java#L219-L233

So, suppose we really have measured this and think it is a problem. One thing not yet done is moving the OOM handling onto another thread: basically, re-use the AsyncReporter thread to execute report(S next) instead of doing it on the caller's thread.

This will cause problems, of course. One is that to do this safely, you must now copy/clone the mutable span. Also, the backpressure/blocking that exists now is what prevents OOM: by moving the sizing to another thread, you can receive more spans into that sizer thread than could fit in memory under your rules. In other words, you could make an OOM more likely. To adjust for this, you may need to reduce your maxSize to account for the larger number of in-flight (not yet sized) spans. The size heuristic isn't perfect anyway; just know there is a side effect.
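
A rough sketch of that idea (hypothetical HandoffReporter type, not the shipped zipkin-reporter design): the caller hands off a copy of the span without sizing or locking, and the reporter thread does the byte accounting.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    class HandoffReporter<S> {
      // Unbounded handoff: this is exactly the weakened OOM protection noted above.
      private final BlockingQueue<S> handoff = new LinkedBlockingQueue<>();

      /** Called from request threads: no sizing, no contended lock, so no pinning. */
      void report(S copyOfSpan) {
        handoff.offer(copyOfSpan); // callers must pass a copy; the original span is mutable
      }

      /** Runs on the reporter thread: sizing and the byte-bounded enqueue move here. */
      void drainLoop() throws InterruptedException {
        while (true) {
          S next = handoff.take();
          // encoder.sizeInBytes(next) and the byte-bounded offer would run here,
          // off the caller's critical section
        }
      }
    }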

Anyway, for someone really concerned about this log message, or who hits performance problems traced to this critical section, maybe it is a good tradeoff.

Regardless, zipkin-reporter has not had much love, and, as with this topic, few people actually working on tracing contribute. The async design there is very old and could be refactored in light of this issue and the advances of the last five-plus years.

All that said, some configuration to reduce the lock in zipkin-reporter, when the correct code is in use (AsyncZipkinSpanHandler), could reduce the issue to something viable for another few years without relying on staff that don't exist.

I'll leave the final decision to @reta and someone else to review his change, because I don't work on this project, in this programming language, or even in tracing. I hope the pointers help, though!

codefromthecrypt commented 6 months ago

Heading out on holiday; here's a suggestion to move past this that will be easy to configure: https://github.com/openzipkin/zipkin-reporter-java/issues/259

codefromthecrypt commented 5 months ago

https://github.com/openzipkin/zipkin-reporter-java/releases/tag/3.4.0 removes the blocking during report