Closed will-sargent-eero closed 2 years ago
I should mention that this is happening in the context of a Play application, which is an asynchronous webframework all the way down, so anything based on ThreadLocal
(which it looks like ThreadLocalSegmentContext
is based on) is not going to work...
As it happens I'm also using Play, so hopefully I can help you out :) although in fact this should apply to any async / threaded code. (Also, forgive the very verbose answer, but I thought it might be useful to have a reference since this issue has been raised a couple of times).
I don't think this issue is specific to the X-Ray libraries, since the fundamental problem is that we need to somehow keep track of the current Span
, and there are only really two ways of doing this: either explicitly, or implicitly [1]. The same would be true of any other tracing library.
For single threaded code inside a single method, this is very easy: when the Span
is activated you get back a corresponding Scope
instance, and you can just call Scope.close()
.
For single threaded code which calls out to another method, and where the Span
should be completed inside that other method, you're probably going to need to pass the Scope
instance down as a method parameter.
For multi-threaded code you'll have to re-activate the current Span
inside your other thread by doing something like (again, assuming Scala):
val tracer: Tracer = ...
val outerScope: Scope = tracer.startSpan("span-name").startActive(false)
val someFuture: Future[_] = ...
// We immediately close the "outer" scope but this *doesn't* finish
// the `Span` (since we used `startActive(false)`). This prevents
// our `Span` from leaking into other code if this current thread
// gets reused in e.g. a thread pool
outerScope.close()
// NB use `transform` here (rather than `onComplete`) since we want
// to enforce that the `Span` is closed *before* the `Future` returns
// This helps ensure that e.g start and stop times are realistic
someFuture transform { result =>
// This sets re-activates the `Span` on the current thread which
// might be different from the thread where it was created above
val innerScope = tracer.scopeManager.activate(outerScope.span, true)
// Maybe do something else with `result`
// Since we called `activate` with `finishOnClose = true`, this will
// automatically finish the span and send it to X-Ray
innerScope.close()
result
}
Explicit passing has some drawbacks: even for vanilla code you end up passing a lot of Scope
parameters around. This is why the AWS X-Ray library uses ThreadLocal
by default [1]: it saves you having to pass the Scope
around by assuming (implicitly) that you're always going to want to tie the current thread to a specific Span
. Other similar solutions exist in other frameworks (e.g. request-scoped beans in spring).
For threaded code, the only viable solution I've found for Java / Scala is using aspect-oriented programming to automatically handle copying Span
instances across threads: typically, you want to
grab the "current" (active) Span
at the point where some Runnable
or Future
instance gets created
before the body of that instance gets run, reactivate the Span
onto the current thread
once you're finished, close it off again to deactivate the Span
from this thread
For example, there's a Scala-specific tracing library called kamon which custom AOP library called kanela to do just this; their implementation of weaving scala Future
automatically can be seen here.
I've had success with using Eclipse aspectj to do much the same, although that code is currently proprietary so I'm afraid I can't share it.
There's also the opentracing-contrib/java-specialagent library, although I haven't used that but it seems do something similar (runtime weaving of bytecode).
[1] Assuming you're using Scala, I mean "implicitly" in the more general sense rather than the Scala implicit
sense
[2] There's an exception to this in lambda where it does some additional magic, but that's a complication for another day...
I'm using Lightbend Cinnamon as a backend: https://developer.lightbend.com/docs/telemetry/current/extensions/opentracing/opentracing.html
It's using instrumentation to pass the active span as a continuation and automatically instrumenting futures and actions using Byte Buddy, which has the advantage that the opentracing API doesn't need to be exposed. I think it should be possible to customize it, but first I need to get X-Ray to never use a threadlocal ever.
Ahh so it sounds like you have Cinnamon / OpenTracing integration set up (to instrument your Play / akka / Future code), but using X-Ray as the tracer implementation? I've not used Cinnamon myself, but glancing through the docs it looks like:
in most cases it does something very similar to what's described above: bytecode weaving to copy some or all of the span data around
however in at least some cases it's using a custom version of Future
and related methods to handle tracing
This library mimics the ThreadLocal
behaviour of the X-Ray library in order to make sure that the internal states of both are kept consistent. It works for me with the "vanilla" weaving approach described above, as long as Span
s are always correctly activated and deactivated through the corresponding ScopeManager
/ Scope
objects.
The problem here is that we don't know how Cinnamon goes about activating and deactivating Span
s - if it's not calling ScopeManager
then all bets are off. Having said that, I would hope it is using those methods, since that's the intended design of the API: maybe worth raising the issue with Lightbend support? Unfortunately Cinnamon is (I believe) closed source, so it's not easily inspectable.
first I need to get X-Ray to never use a threadlocal ever
I suspect what you'd need here is a custom SegmentContext
implementation (along with the corresponding SegmentContextResolver
) and wiring those into your AWSXRayRecorder
instance: that would allow you to "resolve" the current context (i.e. the current span) from Cinnamon rather than from a ThreadLocal
.
Almost there, but I need an entity from AWSXRaySpan
and I can't get at it because it's package private.
It would be possible to make this library return AWS-specific types (e.g. AWSXRayTracer.activeSpan()
would return an AWSXRaySpan
, from which you could get the underlying Entity
).
However, I wonder if this is a code smell? I've certainly changed my mind on this a couple of times, but my current thinking is this: surely the whole point of using something like OT is that you can switch out different implementations without having to change anything? If your code is explicitly relying on AWS-specific types, you lose that flexibility. At the very least switching to, say, Jaeger or Zipkin might break your build; or worse it might compile, but fail at runtime - or even just silently skip sections of code (if e.g. you're using reflection / isInstanceOf
/ pattern match).
I've found it useful to expose some of the AWS-specific operations through the OT interface, as per the documentation: for example, calling Span.setTag("error", true)
actually calls Entity.setError(true)
under the covers, without needing direct access to AWS-specific classes. This approach is IMHO better than accessing Entity
directly: it will always compile, and will always run safely even with non-AWS implementations (although for non-AWS obviously these just revert to being "normal" tags).
Can I ask what you need to do directly on the Entity
? Would it be sufficient to expose additional special tag names for your purposes?
Hi @will-sargent-eero,
Prompted by https://github.com/opentracing-contrib/java-xray-tracer/issues/10 I realise this repo probably needs some TLC - did you manage to resolve this issue in the end? If so, would you mind documenting how here, for the benefit of any other Lightbend / Cinnamon users?
Thanks
I could not resolve this issue.
On Mon, Nov 9, 2020 at 5:04 AM Ashley Mercer notifications@github.com wrote:
Hi @will-sargent-eero https://github.com/will-sargent-eero,
Prompted by #10 https://github.com/opentracing-contrib/java-xray-tracer/issues/10 I realise this repo probably needs some TLC - did you manage to resolve this issue in the end? If so, would you mind documenting how here, for the benefit of any other Lightbend / Cinnamon users?
Thanks
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/opentracing-contrib/java-xray-tracer/issues/6#issuecomment-723999222, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMUFROT7PIEZANGW3KM75U3SO7SEXANCNFSM4I5VQCTA .
Tried out the Opentracing API integration today -- it loaded successfully, but showed this on finishing the request: