micronaut-projects / micronaut-tracing

Distributed Tracing Support
Apache License 2.0
18 stars 27 forks source link

Span.makeCurrent() causes http server spans to not be created #475

Open CraftyFella opened 10 months ago

CraftyFella commented 10 months ago

Expected Behavior

I should be able to use the open telemetry tracer and create child spans which I can set to the current span. This should NOT impact the creation of the http server spans. This seems to happen when using the reactor library.

Actual Behaviour

It does impact it and they no longer get created.

Steps To Reproduce

I have created a repo which reproduces the issue see

https://github.com/CraftyFella/micronaut-otel-http-server-span-bug?tab=readme-ov-file#to-run-the-app

Environment Information

Mac-OS openjdk 17.0.5 2022-10-18 LTS micronautVersion=4.2.3 kotlinVersion=1.9.21

Example Application

https://github.com/CraftyFella/micronaut-otel-http-server-span-bug

Version

4.2.3

CraftyFella commented 10 months ago

So we've worked out why it's broken since upgrading to micronaut 4. it's due to this https://docs.micronaut.io/latest/guide/#reactorContextPropagation

Since Micronaut Framework version 4, Project Reactor integration no longer captures the state automatically. Micronaut Framework users need to extend the propagation context manually. Before version 4, Micronaut Framework required the instrumentation of every reactive operator to capture the current state to propagate it. It added an unwanted overhead and forced us to maintain complicated Reactor operators' instrumentation.

n0tl3ss commented 4 weeks ago

We probably need @dstepanov help around this issue. In your demo app if you change your findAllBooks into

    fun findAllBooks(): Mono<List<Book>> {
        var oldContext = Context.current()
        val startSpan = tracer.spanBuilder("findAllBooks").startSpan()
        val makeCurrent = startSpan.makeCurrent()

        // Having a delay even for 0 MS causes the bug
        // Replacing with mono.just doesn't cause the bug
        PropagatedContext.get().plus(OpenTelemetryPropagationContext(oldContext)).propagate()
        return Mono.delay(Duration.ofMillis(0)).map { books }
            .doOnNext {
                makeCurrent.close()
                startSpan.end()
            }

    }

I think the reason that you have to do is this line startSpan.makeCurrent() changes current telemetry context and the old one keeps dangling and its never closed.