GoogleCloudPlatform / cloud-trace-java

Apache License 2.0
48 stars 42 forks source link

StartSpan & EndSpan from within different Threads #85

Open sebastian-garn opened 7 years ago

sebastian-garn commented 7 years ago

Hi,

my plan is to integrate tracing into our existing backend. All the request handling is currently accomplished by a chain of CompletionStages. My intention now was to simply hook in the tracing by wrapping the existing chains like shown below (exemplary!):

  CompletableFuture.runAsync(() -> {
      // tracer.startSpan("abc");
  }, tracerPool)
  .thenCompose(x -> {
      return completionStage; // <-- stuff to measure
  })
  .thenApplyAsync(result -> {
      // tracer.endSpan(traceContextChild);
      return result;
  }, tracerPool)

What might happen over here is that startSpan and endSpan get executed in different threads. But since the information about the last context is stored within a thread local variable, stopping the span will fail with the following error:

Context was not attached when detaching
java.lang.Throwable
  at io.grpc.Context.detach(Context.java:353)
  at com.google.cloud.trace.GrpcSpanContextHandler$GrpcSpanContextHandle.detach(GrpcSpanContextHandler.java:64)
  at com.google.cloud.trace.SpanContextHandlerTracer.endSpan(SpanContextHandlerTracer.java:92)

Do I miss something?

danielcompton commented 7 years ago

This looks like it will be a blocker for us, as we're using Netty for web requests and the thread that initially handles the request may not be the one that completes it if there are async actions in-between.

jstjohn commented 6 years ago

FYI in another project I hacked together one solution to this issue that got me unblocked.

Basically the problem is that https://github.com/GoogleCloudPlatform/cloud-trace-java/blob/master/sdk/implementation/src/main/java/com/google/cloud/trace/GrpcSpanContextHandler.java uses a threadlocal way of simply storing and returning the current SpanContext across scopes. Literally all that method does is store/return/update a copy of the current spanContext. The stuff w/ detaching the parent in there seems to be related to their method of storage more than any functionality (ie I have things working properly/hierarchically without the concept of a parent in my implementation of that class).

If you just use scoping to know which SpanContext should be used as current when you call EndSpan etc, you can write a very simple implementation of SpanContextHandler that just stores the current SpanContext and returns it when asked. For the attach command you just update the SpanContext stored by that class, and return a new DetachedSpenContextHandler with that new SpanContext.

What I do then is I create a new SpanContextHandler (using my new "simple" version described above rather than the GRPC one) as well as SpanContextHandlerTracer. You can set the parent span if you have one (ie parsed from an Http Header), or use the SpanContextFactory.initialContext() (or whatever that was) and pass that into your SimpleSpanContextHandler and then when you call startSpan on the SpanContextHandlerTracer you get appropriate nesting.

Documentation etc sucks, it took me a long time and a lot of reading through code and understanding how the various wrappers work to figure this out. I don't know that this is the best solution but it worked for me in a threaded context (in akka-http/scala even).

Also I should note that I am still using GRPC as the trace sink as shown in the various examples provided in this repo, and having no problems yet. I am basically following this example https://github.com/GoogleCloudPlatform/cloud-trace-java/blob/master/samples/managed-grpc/src/main/java/com/google/cloud/trace/samples/grpc/managed/ManagedGrpc.java but returning the tracer and context returned from the tracer.startSpan from a start trace method, then doing my stuff, then calling the tracer.endSpan(context) when done (maybe from a different thread).

As a side note, we just released our scala library for tracing with akka-http. We have included a google stackdriver trace tracer as well as simple logging tracers to start. It's fairly minimal code and would be reasonably easy to get working with other systems likely. https://github.com/drivergroup/tracing

codefromthecrypt commented 6 years ago

Might consider census and/or its instrumentation agent? https://github.com/census-instrumentation/opencensus-java/tree/master/contrib/agent

On 22 Sep 2017 01:02, "John St. John" notifications@github.com wrote:

FYI in another project I have a solution to this issue.

Basically the problem is that https://github.com/ GoogleCloudPlatform/cloud-trace-java/blob/master/sdk/ implementation/src/main/java/com/google/cloud/trace/ GrpcSpanContextHandler.java uses a threadlocal way of simply storing and returning the current SpanContext across scopes. If you use scoping to know which spanContext should be used as current when you call EndSpan etc, you can write a very simple implementation of SpanContextHandler that just stores the current SpanContext and returns it when asked. For the attach command you just update the context stored by that class, and return a new DetachedSpenContextHandler with that SpanContext.

What I do then is I create a new SpanContextHandler (using my new "simple" version described above rather than the GRPC one) as well as SpanContextHandlerTracer. You can set the parent span and pass that into your SimpleSpanContextHandler and then when you call startSpan on the SpanContextHandlerTracer you get appropriate nesting.

Documentation etc sucks, it took me a long time and a lot of reading through code and understanding how the various wrappers work to figure this out. I don't know that this is the best solution but it worked for me in a threaded context (in akka-http/scala even).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/GoogleCloudPlatform/cloud-trace-java/issues/85#issuecomment-331219005, or mute the thread https://github.com/notifications/unsubscribe-auth/AAD610UZmqfJ68bCjvq8tQtF96om7HHMks5skpZVgaJpZM4NRyV1 .

nha commented 5 years ago

@adriancole Does opencensus java support async? I've been looking for a solution too https://stackoverflow.com/q/53238493/1327651

bogdandrutu commented 5 years ago

Hi @nha,

In OpenCensus we have a concept of context independent of the "Span" or "Tags". It represents a Map<key, value> that is propagated with the request (it is implemented as a thread-local so in sync calls automatically gets propagated). For callbacks/async calls just for propagation (we are using io.grpc.Context as the implementation of the context) use the wrap functions defined here https://github.com/grpc/grpc-java/blob/master/context/src/main/java/io/grpc/Context.java#L589. This will ensure just the context propagation, so entries in the context map will be propagated between different threads.

If you want to start a Span in one thread and end it in a different thread, use the withSpan methods from the tracer https://www.javadoc.io/doc/io.opencensus/opencensus-api/0.17.0 :

class MyClass {
  private static Tracer tracer = Tracing.getTracer();
  void handleRequest(Executor executor) {
    Span span = tracer.spanBuilder("MyRunnableSpan").startSpan();
    // do some work before scheduling the async
    executor.execute(Context.wrap(tracer.withSpan(span, new Runnable() {
      @Override      
      public void run() {
        try {
          sendResult();
        } finally {
          span.end();
        }
      }
    })));
  }
}

A bit more information about this here https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Span.md#span-creation

ruchirsachdeva commented 5 years ago

@bogdandrutu I am using java 8 's CompletionStage thenComposeAsync (Function<? super T, ? extends CompletionStage> fn); How can I trace the the passed function using Context.wrap withSpan such that the parent span is maintained across different threads ?

bogdandrutu commented 5 years ago

@ruchirsachdeva can you open an issue in the OpenCensus Java repo?

codefromthecrypt commented 5 years ago

side comment for your future issue in census

future tracking is trickier than it looks. I think @sjoerdtalsma probably has nightmares from doing this, I'd bet :P https://github.com/talsma-ict/context-propagation

On Thu, Jul 11, 2019 at 11:14 PM ruchirsachdeva notifications@github.com wrote:

I am using java 8 's CompletionStage thenComposeAsync (Function<? super T, ? extends CompletionStage> fn); How can I trace the the passed function using Context.wrap withSpan such that the parent span is maintained across different threads ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GoogleCloudPlatform/cloud-trace-java/issues/85?email_source=notifications&email_token=AAAPVV6FGZ5ZWUDGQMC4UHDP6454DA5CNFSM4DKHEV22YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZW2V6Y#issuecomment-510503675, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAPVV5TCEKGFJ2IAXYYKVTP6454DANCNFSM4DKHEV2Q .

sjoerdtalsma commented 5 years ago

@adriancole indeed. Even determining logical behaviour is tricky. When combining two futures that both have context, which context applies to the combine function?

I went for the context from the ‘this’ future.

See ContextAwareCompletabeFuture in the above library and test class to get an idea how tricky this is indeed.

I have no fond memories of tackling the CompletionStages, but we do have something that seems workable.

ruchirsachdeva commented 5 years ago

I added the feature request in opencensus-java repo https://github.com/census-instrumentation/opencensus-java/issues/1949