micrometer-metrics / tracing

Provides tracing abstractions over tracers and tracing system reporters.
https://micrometer.io
Apache License 2.0
247 stars 45 forks source link

Issue related to Spring Boot WebFlux and micrometer while the system uses different executors #643

Closed ForItInNet closed 4 months ago

ForItInNet commented 7 months ago

We tried to use micrometer tracing with WebFlux gRPC and, it looks like, the context doesn't populate between executors during the one request.

public Flux<AuthorityDto> getAuthority(@NonNull Long accountId){
        return Mono.just(accountId)
                .doOnNext(ignore -> log.info("========== Message 1 =========="))
                .flatMapMany(authorityRepository::getAuthority) <=== Request to DB to select a flux of authorities
                .take(1) // just for sample
                .doOnNext(ignore -> log.info("========== Message 2 =========="))
                ...;
 }

Result

INFO [6603fd55c4be0cc0d99287a970446e4a] --- [ault-executor-0] AuthorityService    : ========== Message 1 ==========
INFO [] --- [actor-tcp-nio-2] AuthorityService    : ========== Message 2 ==========

We can see that before DB-operation the log has a trace ID and executor ault-executor-0 but after selection, the race ID is empty and, as expected, we have another executor actor-tcp-nio-2.

There are dependencies which are used:

implementation 'org.springframework.boot:spring-boot-starter-actuator:3.2.4'
implementation 'io.micrometer:micrometer-tracing-bridge-brave:1.2.4'
implementation 'io.micrometer:context-propagation:1.1.1'

Spring Boot 3.1.5 The main method also has Hooks.enableAutomaticContextPropagation

public static void main(String[] args) {
        Hooks.enableAutomaticContextPropagation();
        SpringApplication.run(AuthorityServiceApplication.class, args);
}
ForItInNet commented 7 months ago

I have found that it should be propagated by hand like in this tutorial https://www.youtube.com/watch?v=6OKS36PSpho It can be something like this

public static void main(String[] args) {
        Hooks.enableAutomaticContextPropagation();
        SpringApplication.run(AuthorityServiceApplication.class, args);
    }

Interceptor for trace ID creation

@Slf4j
@RequiredArgsConstructor
@GrpcGlobalServerInterceptor
public class TracerInterceptor implements ServerInterceptor, ClientInterceptor {

    private final String TRACE_ID = "traceId";
    private final Metadata.Key<String> TRACE_ID_KEY = Metadata.Key.of(TRACE_ID, Metadata.ASCII_STRING_MARSHALLER);

    private final Tracer tracer;

    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(ServerCall<ReqT, RespT> call, Metadata headers, ServerCallHandler<ReqT, RespT> next) {
        Optional.ofNullable(headers.get(TRACE_ID_KEY)).ifPresentOrElse(traceId -> MDC.put(TRACE_ID, traceId), () -> tracer.withSpan(tracer.nextSpan()));
        ContextRegistry.getInstance().registerThreadLocalAccessor(TRACE_ID, () -> MDC.get(TRACE_ID), traceId -> MDC.put(TRACE_ID, traceId), () -> MDC.remove(TRACE_ID));
        return next.startCall(call, headers);
    }
}

And add data into the context of your chain:

...
.contextCapture();

Is it expected that gRPC doesn't create trace ID automatically? How can we correctly propagate tracer context?

Thank you.

marcingrzejszczak commented 6 months ago

cc @chemicL @bclozel

chemicL commented 6 months ago

AFAIK, Spring has no official gRPC support. Whichever community supported project that the GrpcGlobalServerInterceptor comes from would need to support Observation just like Spring does for this to work without user configuration.

Some remarks regarding the code above:

  1. spring.reactor.context-propagation=auto Boot config can be used instead of directly calling Hooks.enableAutomaticContextPropagation()
  2. You should register a ThreadLocalAccessor once per app lifecycle instead of registering one on every call.
  3. I don't know about reactive support in your gRPC integration library but in reactive endpoints you should avoid directly writing to MDC and using contextCapture() afterwards, but instead write directly to the Reactor Context.
  4. Now you can use a dedicated ThreadLocalAccessor for such a use case.
github-actions[bot] commented 5 months ago

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

github-actions[bot] commented 4 months ago

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.