Netflix / dgs-framework

GraphQL for Java with Spring Boot made easy.
https://netflix.github.io/dgs
Apache License 2.0
3.03k stars 286 forks source link

bug: dgsMicrometerContextRegistry's Slf4jThreadLocalAccessor conflicts with ObservationThreadLocalAccessor #1888

Open martinvisser opened 2 months ago

martinvisser commented 2 months ago

Expected behavior

When using MDC values, even if it is just the traceId and/or spanId - the defaults from Spring Boot -, they should be logged if ObservationRegistry is used. At least, I think it is related to using ObservationRegistry.

Actual behavior

No MDC values are logged.

Steps to reproduce

I'm using Spring Boot 3.2.5, with graphql-dgs-webflux-starter. I upgraded graphql-dgs-platform-dependencies to version 8.5.5 or 8.5.6, coming from 8.5.3, when I noticed all MDC values were gone. I search for changes made and found a related change that added this bean:

@Bean
@ConditionalOnMissingBean
open fun dgsMicrometerContextRegistry(): ContextRegistry {
    return ContextRegistry.getInstance()
        .registerThreadLocalAccessor(Slf4jThreadLocalAccessor())
}

This adds a second ThreadLocalAccessor to my application, which fully overwrites the first entry, the ObservationThreadLocalAccessor. Before 8.5.5 there was no ContextRegistry, and as this bean is triggered, there still isn't one. However, the ContextRegsitry did load the ObservationThreadLocalAccessor already.

I found a workaround by removing Slf4jThreadLocalAccessor later on, which brought back all MDC values:

@Configuration
class AppConfiguration(contextRegistry: ContextRegistry?) {
    init {
        contextRegistry?.run {
            removeThreadLocalAccessor(Slf4jThreadLocalAccessor.KEY)
        }
    }
}

I don't know if it's an option to register Slf4jThreadLocalAccessor with a condition perhaps, or maybe have an option to disable it in a different way, but anything would be appreciated so that I do not need to use the workaround.

paulbakker commented 2 months ago

Are you explicitly creating one in your app (I think that's what you're saying) or is it something coming implicitly from Webflux?

If it's your own, do you create a bean of type ContextRegistgry as well? The DGS provided one is @ConditionalOnMissingBean so you can override it by providing one yourself. Does that work?

I'm trying to fully understand the scenario to figure out if we should have other conditionals (or config) for this.

martinvisser commented 2 months ago

We don't create or try to load our own, and I'm not entirely sure where it came from. The ObervationRegistry is autowired. The DGS bean is loaded, so that means there isn't any other, based on the conditional on the DGS one. I haven't tried creating our own context registry bean, but can try, although we never needed one.

martinvisser commented 2 months ago

This works too indeed:

@Bean
fun contextRegistry(): ContextRegistry = ContextRegistry()

But obviously this just hides the DGS bean

martinvisser commented 2 months ago

The one that already loads the ObservationThreadLocalAccessor comes from io.micrometer:micrometer-observation from META-INF/services/io.micrometer.context.ThreadLocalAccessor:

io.micrometer.observation.contextpropagation.ObservationThreadLocalAccessor

And is loaded via a ServiceLoader in ContextRegistry from:

private static final ContextRegistry instance = new ContextRegistry().loadContextAccessors()
        .loadThreadLocalAccessors();
// ...
public ContextRegistry loadThreadLocalAccessors() {
    ServiceLoader.load(ThreadLocalAccessor.class).forEach(this::registerThreadLocalAccessor);
    return this;
}
martinvisser commented 2 months ago

Probably relevant, it's ContextPropagationSupport that loads the class, which is triggered because we have Hooks.enableAutomaticContextPropagation() in our Spring Boot app:

fun main(vararg args: String) {
    Hooks.enableAutomaticContextPropagation()
    runApplication<MySecondApplication>(*args)
}

ContextPropagationSupport comes from io.projectreactor:reactor-core