helidon-io / helidon

Java libraries for writing microservices
https://helidon.io
Apache License 2.0
3.52k stars 564 forks source link

The current active span is not considered while creating a child span in WebClientTracing #5216

Open iampratapak opened 2 years ago

iampratapak commented 2 years ago

The current active span is not considered while creating a child span in WebClientTracing. It's always referring to the span context that exists in the server request.

Environment Details


Problem Description

I'm creating the first span as a child of the span context that exists within the server request. The second span I'm creating as a child of the first span. The code where the second span is active is making the outbound call.

The outbound call is intercepted by WebClientTracing and creates a span. Instead of reading the current active span from the scope manager, it's reading the span context that exists in the server request and creating a new span.

With this tracing tree, the parent-child relationship is not correct.

[WebClientTracing.java](https://github.com/helidon-io/helidon/blob/helidon-2.x/webclient/tracing/src/main/java/io/helidon/webclient/tracing/WebClientTracing.java#L61)

Expected:

image

Actual:

image

I'm doing the below workaround for now.

WebClient.builder()
                .addService(request -> {
                    Optional<Tracer> optionalTracer = request.context().get(Tracer.class);
                    Tracer tracer = optionalTracer.orElseGet(GlobalTracer::get);
                    final var activeSpan = tracer.scopeManager().activeSpan();
                    if (activeSpan != null) {
                        request.context().register(activeSpan.context());
                    }
                    return Single.just(request);
                })
                .addService(WebClientTracing.create())
                .build();

Steps to reproduce

  1. Enable WebClientTracing
  2. Create a span and use WebClient to make an outbound call.

Should I go ahead with the workaround since we don't have ScopeManager anymore in Helidon 3.X.X?

tjquinno commented 2 years ago

With Helidon SE there is “no magic” and the developer has the flexibility (and therefore the responsibility) of complete control.

You need to prepare some context so that the Helidon tracing code can find what it needs. Something like the following code in the service endpoint handler method should set things up so Helidon's code will create the span for the outbound WebClient request however you want:

        ServerRequest request = // the request delivered to your handler.

        // If present, use the tracer from the incoming request.
        Tracer tracer = request.context().get(Tracer.class).orElse(GlobalTracer.get());
        Tracer.SpanBuilder customSpanBuilder = tracer.buildSpan("my-span");

        // Set the parent of your span to the incoming request's span, if present.
        request.context().get(SpanContext.class).ifPresent(customSpanBuilder::asChildOf);

        Span span = customSpanBuilder.start();

        // Prepare the context for the outbound web client request.
        Context ctx = Context.create(request.context());
        ctx.register(tracer);
        ctx.register(span.context());

        WebClientRequestBuilder webClientRequestBuilder = // however you prepare it

        // Set the context you've prepared so Helidon's WebClient tracing code will use it.
        webClientRequestBuilder.context(ctx); 

        // Insert your code to send the request and deal with the response.
        ...
        // Remember to end the span.
        span.finish(); 

Two other notes:

  1. The existing Helidon WebClient tracing code looks for the tracer and span context settings which the example above prepares. If it finds none, it uses the currently active tracer to create a new span and does not explicitly set the new span's parent.

    We could consider as an enhancement that, in the absence of these context settings, the Helidon Webclient tracing code could use the currently-active span (if there is one) for the parent.

    For that to work, your code would need to set your span as the current/active one. Simply creating it would probably not be enough.

  2. Even without the code enhancement, we should probably add some content to the Helidon SE doc to explain how to set the tracer and span context explicitly as in the example above.
tjquinno commented 2 years ago

@tomas-langer and @Verdent Is there any reason we should not use the currently-active span as the parent in this use case, assuming that the span context is absent from the request context?

tjquinno commented 2 years ago

@tomas-langer Reminded me that, in Helidon SE and WebClient, span context is kept in ThreadLocal. There is no guarantee that the tracing information in ThreadLocal when the WebClient code is handling tracing is the same as what was current when the application initiated the WebClient work.

As a result, we do not want to use a possibly-incorrect "current" span.

tjquinno commented 10 months ago

Re-opening this issue for 4.x. In a virtual threads environment, as in 4.x, the outbound WebClient request will normally be on the same thread as the incoming request, so the WebClient request span could be parented to the current span if there is one.

RickyFrost commented 3 weeks ago

https://github.com/helidon-io/helidon/pull/8957 This was likely the missing "magic" referred to above, even for SE.