DataDog / dd-trace-java

Datadog APM client for Java
https://docs.datadoghq.com/tracing/languages/java
Apache License 2.0
564 stars 282 forks source link

Custom span on reactive stack with Reactor: "Scope depth limit exceeded" error in case of increasing load #3483

Open tatsiana-tarhonskaya opened 2 years ago

tatsiana-tarhonskaya commented 2 years ago

Hello, I'm using the latest 0.98.1 datadog agent and OpenTracing 0.330.

I've added custom span for reactive method, and it works as expected in case of low request number, but in case of higher load there's an issue - activating spans results in error about scope depth limit - and span is not actually activated as the result: [dd.trace 2022-04-06 14:15:01:844 +0000] [reactor-http-epoll-3] DEBUG datadog.trace.agent.core.scopemanager.ContinuableScopeManager - Scope depth limit exceeded (100). Returning NoopScope.

I've looked at implementation of ContinuableScopeManager https://github.com/DataDog/dd-trace-java/blob/2359895eb1d96aaddefd9205a4f7a36dfa909537/dd-trace-core/src/main/java/datadog/trace/core/scopemanager/ContinuableScopeManager.java#L473 and as far as I understood depth limit in error message corresponds to the number of nested spans. If it's so, my application does not produce so many nested spans and limit should not be reached.

Do you have ideas why is this happening and how to fix this issue (my thought is that ThreadLocal is used for ScopeStack, probably it doesn't work as expected with Reactor https://github.com/DataDog/dd-trace-java/blob/2359895eb1d96aaddefd9205a4f7a36dfa909537/dd-trace-core/src/main/java/datadog/trace/core/scopemanager/ContinuableScopeManager.java#L43)?

Here's code for creating custom span:

    private Mono createSpanForMono(ProceedingJoinPoint joinPoint, TraceOperation traceOperation)
    {
        Tracer tracer = GlobalTracer.get();
        Span span = createSpan(tracer, joinPoint, traceOperation);
        AtomicReference<Scope> scope = new AtomicReference<>();
        try
        {
            return ((Mono) joinPoint.proceed())
                    .doFirst(() ->
                    {
                        Scope sc = tracer.activateSpan(span);
                        scope.set(sc);
                    })
                    .doFinally(signalType ->
                    {
                        span.finish();
                        Scope scopeValue = scope.get();
                        if (scopeValue != null)
                        {
                            scopeValue.close();
                        }
                    });
        }
        catch (Throwable e)
        {
            return Mono.error(e);
        }
    }

    private Span createSpan(Tracer tracer, ProceedingJoinPoint joinPoint, TraceOperation traceOperation)
    {
        String operationName = traceOperation.operationName();
        if (StringUtils.isEmpty(operationName))
        {
            Signature signature = joinPoint.getSignature();
            operationName = signature.getDeclaringTypeName() + "." + signature.getName();
        }
        return tracer.buildSpan(operationName).start();
    }

Thanks!

mcculls commented 2 years ago

Hi @suneli - scopes are meant to be associated with work on a thread, they are not meant to be passed between threads.

The OpenTracing spec states:

Passing of scopes to another thread or callback is not supported

https://opentracing.io/docs/overview/scopes-and-threading/

and:

in Java, Scope represents the active state for the current context (usually a thread), which cannot be naturally, automatically propagated to other contexts (usually other threads). Consider passing Span instances between threads, or else create a new one for each operation happening in a different thread.

https://opentracing.io/guides/java/scopes/#thread-safety

tatsiana-tarhonskaya commented 2 years ago

Hi @mcculls , thank you for response. Regarding passing the scopes - do you mean that my code which is closing scope in "doFinally" block is illegal since it can be executed by the thread different from the one where it was activated?

mcculls commented 2 years ago

That's right - scopes must be closed on the same thread where they were activated. You can share spans between threads, but not scopes.

tatsiana-tarhonskaya commented 2 years ago

Thank you @mcculls , I see it now. And probably you could suggest how to instrument reactive methods than? Is there some approach?

vsingal-p commented 1 year ago

^ Do we have a way to add custom spans in reactive application? I am using springboot + reactor