Closed dzonekl closed 6 years ago
You need to configure the executor pools used by futures to be instrumented so that they propagate the span to new threads.
I believe the ‘thenCompose’ is called from the same background thread the span was propagated to but in a different completionstage where the new scope for the propagated span was already closed.
This problem could possibly be solved by using the ContextAwareCompletableFuture from my library in combination with the OpenTracing support jar.
See https://github.com/talsma-ict/context-propagation for more details. I believe there is also an issue filed (and mostly fixed) about the CompletionStage problem of CompletableFuture there.
Please ping me for further details.. (sorry, on my phone now)
Met vriendelijke groet, Sjoerd Talsma
Op 25 okt. 2018 om 22:46 heeft Yuri Shkuro notifications@github.com het volgende geschreven:
You need to configure the executor pools used by futures to be instrumented so that they propagate the span to new threads.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
Ah, here it is:
CompletionStage
propagation issue: https://github.com/talsma-ict/context-propagation/issues/51Thank You, @sjoerdtalsma @yurishkuro , An underlying library produces the futures, of which I have no immediate control. Out of curiosity will ContextAwareCompletableFuture end up in the opentracing code ? What I need is a practical solution, What I am trying now is to use a custom ScopeManager, which persists in the described case.
@dzonekl If you have no control over the CompletableFuture
you got, you could make sure that the CompletableFuture
you thenCompose
with, is produced by an ExecutorService
threadpool that is context aware (like @yurishkuro suggested)
Out of curiosity will ContextAwareCompletableFuture end up in the opentracing code
I currently have no plans to do so; they serve different purposes: OpenTracing provides an API, my context-propagation
is a library that propagates (all kinds of) contexts.
...you could make sure that the
CompletableFuture
youthenCompose
with, is produced by anExecutorService
threadpool that is context aware (like @yurishkuro suggested)
you mean ?
thenComposeAsync(
Function<? super T, ? extends CompletionStage<U>> fn,
Executor executor)
mmmh, my goal is to make this work without having to touch the calling code of this library, async flows are already hard to get cleanly readable. I guess my best bet is to actually instrument the library properly, or use a custom ScopeManager.
Thanks for your help.
In order to understand what is happening, I created a custom scope manager, which logs what is happening [1]. Prior to thread ault-executor-3
, being started, we see the scope being unset (toRestore == null), done by the TracingHandlerInterceptor#afterConcurrentHandlingStarted, later on the original traceid is restored. Why is the active scope closed when the handler starts an async request? What is the idea?
[1] New traceId, when doing Async request.
2018-10-28 09:45:58.615 INFO 47257 --- [nio-8080-exec-1] e.b.e.config.AxonConfig$FutureScope : New scope
2018-10-28 09:45:58.615 INFO 47257 --- [nio-8080-exec-1] b.e.c.AxonConfig$OpenApiApplicationScope : Setting scope: FutureScope{, wrapped=fba08ffe035cf5f3:fba08ffe035cf5f3:0:1 - POST, finishOnClose=false, toRestore=nothing to restore}
2018-10-28 09:45:58.617 INFO 47257 --- [nio-8080-exec-1] e.b.e.a.rest.AssetDefinitionController : Creating asset definition: class Token {
class Asset {
symbol: KLH
longName: x coin
precision: 8
minimumQuantity: 0.01
active: true
}
}
2018-10-28 09:45:58.617 INFO 47257 --- [nio-8080-exec-1] e.b.e.config.AxonConfig$FutureScope : New scope
2018-10-28 09:45:58.617 INFO 47257 --- [nio-8080-exec-1] b.e.c.AxonConfig$OpenApiApplicationScope : Setting scope: FutureScope{, wrapped=fba08ffe035cf5f3:cd7d093fb5ae1814:fba08ffe035cf5f3:1 - query, finishOnClose=true, toRestore=fba08ffe035cf5f3:fba08ffe035cf5f3:0:1 - createAssetDefinition}
2018-10-28 09:45:58.618 INFO 47257 --- [nio-8080-exec-1] i.j.internal.reporters.LoggingReporter : Span reported: fba08ffe035cf5f3:cd7d093fb5ae1814:fba08ffe035cf5f3:1 - query
2018-10-28 09:45:58.619 INFO 47257 --- [nio-8080-exec-1] e.b.e.config.AxonConfig$FutureScope : Restoring scope
2018-10-28 09:45:58.619 INFO 47257 --- [nio-8080-exec-1] b.e.c.AxonConfig$OpenApiApplicationScope : Setting scope: FutureScope{, wrapped=fba08ffe035cf5f3:fba08ffe035cf5f3:0:1 - createAssetDefinition, finishOnClose=false, toRestore=nothing to restore}
2018-10-28 09:45:58.619 INFO 47257 --- [nio-8080-exec-1] e.b.e.config.AxonConfig$FutureScope : Restoring scope
2018-10-28 09:45:58.619 INFO 47257 --- [nio-8080-exec-1] b.e.c.AxonConfig$OpenApiApplicationScope : Unsetting scope
2018-10-28 09:45:58.628 INFO 47257 --- [ault-executor-3] e.b.e.config.AxonConfig$FutureScope : New scope
2018-10-28 09:45:58.628 INFO 47257 --- [ault-executor-3] b.e.c.AxonConfig$OpenApiApplicationScope : Setting scope: FutureScope{, wrapped=9c3dd7f238513fdc:9c3dd7f238513fdc:0:1 - command, finishOnClose=true, toRestore=nothing to restore}
2018-10-28 09:45:58.629 INFO 47257 --- [ault-executor-3] i.j.internal.reporters.LoggingReporter : Span reported: 9c3dd7f238513fdc:9c3dd7f238513fdc:0:1 - command
2018-10-28 09:45:58.629 INFO 47257 --- [ault-executor-3] e.b.e.config.AxonConfig$FutureScope : Restoring scope
2018-10-28 09:45:58.629 INFO 47257 --- [ault-executor-3] b.e.c.AxonConfig$OpenApiApplicationScope : Unsetting scope
2018-10-28 09:45:58.661 INFO 47257 --- [nio-8080-exec-2] e.b.e.config.AxonConfig$FutureScope : New scope
2018-10-28 09:45:58.661 INFO 47257 --- [nio-8080-exec-2] b.e.c.AxonConfig$OpenApiApplicationScope : Setting scope: FutureScope{, wrapped=fba08ffe035cf5f3:fba08ffe035cf5f3:0:1 - createAssetDefinition, finishOnClose=false, toRestore=nothing to restore}
2018-10-28 09:45:58.662 INFO 47257 --- [nio-8080-exec-2] e.b.e.config.AxonConfig$FutureScope : Restoring scope
2018-10-28 09:45:58.662 INFO 47257 --- [nio-8080-exec-2] b.e.c.AxonConfig$OpenApiApplicationScope : Unsetting scope
2018-10-28 09:45:58.662 INFO 47257 --- [nio-8080-exec-2] e.b.e.config.AxonConfig$FutureScope : New scope
2018-10-28 09:45:58.662 INFO 47257 --- [nio-8080-exec-2] b.e.c.AxonConfig$OpenApiApplicationScope : Setting scope: FutureScope{, wrapped=fba08ffe035cf5f3:fba08ffe035cf5f3:0:1 - createAssetDefinition, finishOnClose=true, toRestore=nothing to restore}
2018-10-28 09:45:58.663 INFO 47257 --- [nio-8080-exec-2] i.j.internal.reporters.LoggingReporter : Span reported: fba08ffe035cf5f3:fba08ffe035cf5f3:0:1 - createAssetDefinition
2018-10-28 09:45:58.663 INFO 47257 --- [nio-8080-exec-2] e.b.e.config.AxonConfig$FutureScope : Restoring scope
2018-10-28 09:45:58.663 INFO 47257 --- [nio-8080-exec-2] b.e.c.AxonConfig$OpenApiApplicationScope : Unsetting scope
Hey guys, I digged further and stumbled upon opentracing, spring-cloud-core async support, which seems to deal with the same issue:
Hey guys, I digged further and stumbled upon opentracing, spring-cloud-core async support, which seems to deal with the same issue:
..deal with, or suffers from the same issue?
I doubt the callbacks in the ListenableFuture are running with the correct active span..
@sjoerdtalsma are you saying the Async, code is not working correctly?
@sjoerdtalsma are you saying the Async, code is not working correctly?
I'm by no means a Spring expert, but what I gather is that a ListenableFuture
is similar in concept to a CompletableFuture
(probably in Spring before Java 8 existed). In that sense the callbacks might provide similar behaviour as the CompletionStage
concept from the CompletableFuture
(the then... methods).
The span is propagated into the 'main' execution body of the original Future, but I don't know where the callback is called from. If it's from 'within' the main execution body (don't see how, but ok) all is well. Otherwise, the code in the ListenableFuture callback will most likely not be aware of the active span. You'd have to test that though.
This becomes very complex very quick and it's hard to reason about things without concrete test cases.
We haven't even touched upon undetermined codepaths like A.runAfterEither(B, Runnable)
--> which active span should be active in the runnable, the active span from A
or from B
?
@sjoerdtalsma I tried your context library [1], with logging on.
The executor is configured to create threads prefixed with trace-d
, from the log I can see the ContextAwareExecutorService in action, and the snapshots being reactivated. Unfortunately the 2nd thread created, has a different traceid reported, so it seems the propagation didn't work.
I pass on this executor, to regular CompletableFuture.then..Async(...)
methods.
Could it be the GlobalTracer needs to be configured for this to work, or am I not using the library properly? (I don't use your completablefuture , as these are produced by the library I call...).
[1]
2018-11-07 17:15:04.808 INFO 27441 --- [nio-8080-exec-1] e.b.e.a.rest.AssetDefinitionController : Creating asset definition: class Token { class Asset { symbol: MVB longName: x coin precision: 8 minimumQuantity: 0.01 maximumQuantity: 2 active: true } } 2018-11-07 17:15:04.818 TRACE 27441 --- [nio-8080-exec-1] n.t.context.ContextManagers : Active context of nl.talsmasoftware.context.opentracing.SpanManager@425bcbb4 added to new snapshot: nl.talsmasoftware.context.opentracing.SpanManager$ScopeContext@7c91894e. 2018-11-07 17:15:04.822 TRACE 27441 --- [nio-8080-exec-1] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.opentracing.SpanManager.getActiveContext: 693,490ns 2018-11-07 17:15:04.823 TRACE 27441 --- [nio-8080-exec-1] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.ContextManagers.createContextSnapshot: 7,080,102ns 2018-11-07 17:15:04.825 TRACE 27441 --- [ traced-1] n.t.context.ContextManagers : Context reactivated from snapshot by nl.talsmasoftware.context.opentracing.SpanManager@4fe3514b: nl.talsmasoftware.context.opentracing.SpanManager$ScopeContext@ae1c6d3. 2018-11-07 17:15:04.826 TRACE 27441 --- [ traced-1] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.opentracing.SpanManager.initializeNewContext: 994,711ns 2018-11-07 17:15:04.827 TRACE 27441 --- [ traced-1] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.ContextSnapshot.reactivate: 2,925,039ns 2018-11-07 17:15:05.834 INFO 27441 --- [ traced-1] i.j.internal.reporters.LoggingReporter : Span reported: 84a0ac2cb1d3159e:47245443fc1580a7:84a0ac2cb1d3159e:1 - query 2018-11-07 17:15:05.839 TRACE 27441 --- [ traced-1] n.t.context.delegation.RunnableAdapter : Call result ignored by RunnableAdapter: null 2018-11-07 17:15:05.863 TRACE 27441 --- [ault-executor-0] n.t.context.ContextManagers : Active context of nl.talsmasoftware.context.opentracing.SpanManager@79ed8aa7 added to new snapshot: nl.talsmasoftware.context.opentracing.SpanManager$ScopeContext@1a335116. 2018-11-07 17:15:05.863 TRACE 27441 --- [ault-executor-0] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.opentracing.SpanManager.getActiveContext: 159,219ns 2018-11-07 17:15:05.863 TRACE 27441 --- [ault-executor-0] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.ContextManagers.createContextSnapshot: 842,939ns 2018-11-07 17:15:05.864 TRACE 27441 --- [ traced-2] n.t.context.ContextManagers : Context reactivated from snapshot by nl.talsmasoftware.context.opentracing.SpanManager@6701fd74: nl.talsmasoftware.context.opentracing.SpanManager$ScopeContext@22037ff0. 2018-11-07 17:15:05.864 TRACE 27441 --- [ traced-2] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.opentracing.SpanManager.initializeNewContext: 111,767ns 2018-11-07 17:15:05.864 TRACE 27441 --- [ traced-2] nl.talsmasoftware.context.Timing : nl.talsmasoftware.context.ContextSnapshot.reactivate: 769,370ns 2018-11-07 17:15:06.117 INFO 27441 --- [ traced-2] i.j.internal.reporters.LoggingReporter : Span reported: 7d7aff76f16affbe:7d7aff76f16affbe:0:1 - command 2018-11-07 17:15:06.117 TRACE 27441 --- [ traced-2] n.t.context.delegation.RunnableAdapter : Call result ignored by RunnableAdapter: null 2018-11-07 17:15:06.223 INFO 27441 --- [nio-8080-exec-2] i.j.internal.reporters.LoggingReporter : Span reported: 84a0ac2cb1d3159e:84a0ac2cb1d3159e:0:1 - createAssetDefinition
@dzonekl Yes, my library uses the GlobalTracer
.
note: This repository is about opentracing and maybe folks who are not interested in my library may appreciate it if we move our discussion there?
@sjoerdtalsma I feel this question is more appropriate to be discussed in the opentracing spring cloud community git repo. As you might have noticed from the log, the async request is actually invoking Grpc to communicate over the wire, where Netty is used and a default executor. My suspicion is that, this is where the context is not propagated. So the question is really how can I instrument Grpc with Opentracing (I believe Grpc uses opensensus). Thanks for all your help sofar, I learned a few things :-)
You need to configure the executor pools used by futures to be instrumented so that they propagate the span to new threads.
May I know how to achieve that?
You need to configure the executor pools used by futures to be instrumented so that they propagate the span to new threads.
May I know how to achieve that?
See my link above
Anyone figured how to use
https://github.com/opentracing-contrib/java-concurrent
?
So we need to specify the threadpool everytime we're running thenCompose / thenApply? I'm using aws async client to spawn new thread using the traced threadpool. Example
ExecutorService tracedExecutorService = new TracedExecutorService(
Executors.newFixedThreadPool(50),
tracer
);
S3AsyncClient.builder()
.overrideConfiguration(builder -> builder.addExecutionInterceptor(new TracingExecutionInterceptor(this.tracer)))
.asyncConfiguration(builder -> builder.advancedOption(
SdkAdvancedAsyncClientOption.FUTURE_COMPLETION_EXECUTOR,
tracedExecutorService
)
)
...
Edit : even this does not work
When doing requests in async fashion using Completable futures, the active span or Scope is not available throughout the subsequent steps of the async. chain.
So:
I am sure this is a known limitation of the default ScopeManager, which is the ThreadLocalScopeManager, what are the alternatives?