quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.8k stars 2.68k forks source link

OpenTelemetry Context is null in Rest Client filter (using Azure App Insights agent) #35034

Open Orbifoldt opened 1 year ago

Orbifoldt commented 1 year ago

Describe the bug

The OpenTelemetryClientFilter is causing a NullPointerException. In particular, the spanContext as retrieved from the client request context is null: https://github.com/quarkusio/quarkus/blob/8f0e94e0238ba4ee75d2cbaada1c72be01edac43/extensions/opentelemetry/runtime/src/main/java/io/quarkus/opentelemetry/runtime/tracing/intrumentation/restclient/OpenTelemetryClientFilter.java#L121

This issue seems only to occur when using the Microprofile Rest Client (resteasy classic) in order to invoke some other service. The server itself does not have this issue when it is called, nor do the autoinstrumented sdks that we use.

I am using the Azure Application Insights Java Agent and the Quarkus OpenTelemetry extension. Project is written in Kotlin. I was using a customized propagator, but also with the default ones this error occurs. This issue started occurring after we upgraded from Quarkus 2.16 to 3.2.

Full stack trace:

java.lang.NullPointerException: Cannot invoke "io.opentelemetry.context.Context.get(io.opentelemetry.context.ContextKey)" because "context" is null
    at io.opentelemetry.instrumentation.api.instrumenter.http.HttpClientResend.getAndIncrement(HttpClientResend.java:36)
    at io.opentelemetry.instrumentation.api.instrumenter.http.HttpClientAttributesExtractor.onEnd(HttpClientAttributesExtractor.java:161)
    at io.opentelemetry.instrumentation.api.instrumenter.Instrumenter.doEnd(Instrumenter.java:224)
    at io.opentelemetry.instrumentation.api.instrumenter.Instrumenter.end(Instrumenter.java:144)
    at io.quarkus.opentelemetry.runtime.tracing.intrumentation.restclient.OpenTelemetryClientFilter.filter(OpenTelemetryClientFilter.java:123)
    at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.filterResponse(ClientInvocation.java:667)
    at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.invoke(ClientInvocation.java:428)
    at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invokeSync(ClientInvoker.java:134)
    at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invoke(ClientInvoker.java:103)
    at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientProxy.invoke(ClientProxy.java:61)
    at jdk.proxy3/jdk.proxy3.$Proxy175.getCaptureResults(Unknown Source)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.jboss.resteasy.microprofile.client.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:168)
    at jdk.proxy3/jdk.proxy3.$Proxy178.getCaptureResults(Unknown Source)
    at [... my application classes which invoke the rest client ...]
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.quarkus.hibernate.validator.runtime.interceptor.AbstractMethodValidationInterceptor.validateMethodInvocation(AbstractMethodValidationInterceptor.java:71)
    at io.quarkus.hibernate.validator.runtime.jaxrs.JaxrsEndPointValidationInterceptor.validateMethodInvocation(JaxrsEndPointValidationInterceptor.java:38)
    at io.quarkus.hibernate.validator.runtime.jaxrs.JaxrsEndPointValidationInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
    at io.quarkus.resteasy.runtime.QuarkusRestPathTemplateInterceptor.restMethodInvoke(QuarkusRestPathTemplateInterceptor.java:39)
    at io.quarkus.resteasy.runtime.QuarkusRestPathTemplateInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
    at io.quarkus.security.runtime.interceptor.SecurityHandler.handle(SecurityHandler.java:47)
    at io.quarkus.security.runtime.interceptor.RolesAllowedInterceptor.intercept(RolesAllowedInterceptor.java:29)
    at io.quarkus.security.runtime.interceptor.RolesAllowedInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
    at [... my application classes ...]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:154)
    at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:118)
    at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:560)
    at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:452)
    at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$2(ResourceMethodInvoker.java:413)
    at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:321)
    at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:415)
    at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:378)
    at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:356)
    at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:70)
    at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:429)
    at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:240)
    at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:154)
    at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:321)
    at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:157)
    at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:229)
    at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:82)
    at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:147)
    at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$1.run(VertxRequestHandler.java:93)
    at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:576)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2513)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1538)
    at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
    at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:833)

Expected behavior

Rest client is instrumented and does not throw exceptions

Actual behavior

NullPointerException is thrown

How to Reproduce?

I haven't been able to reproduce this locally using the Azure app insights agent, there are some network/proxy settings that seem to prohibit me from getting it to work. Using the jaeger-all-in-one docker container (as described in the documentation) locally did not give above error.

Output of uname -a or ver

Linux 5.4.0-1111-azure #117~18.04.1-Ubuntu SMP Wed Jun 21 15:44:28 UTC 2023 x86_64 Linux

Output of java -version

openjdk version "17.0.8" 2023-07-18 LTS

GraalVM version (if different from Java)

n/a

Quarkus version or git rev

3.2.1.Final

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response

quarkus-bot[bot] commented 1 year ago

/cc @Sgitario (rest-client), @brunobat (opentelemetry), @cescoffier (rest-client), @geoand (rest-client), @radcortez (opentelemetry)

geoand commented 1 year ago

cc @jamezp

brunobat commented 1 year ago

This instrumentation is quite old. There might be something going on in the propagation of REST_CLIENT_OTEL_SPAN_CLIENT_CONTEXT

jamezp commented 1 year ago

It definitely looks like something is likely not being propagated. I'm not too sure how we'd figure that out without a reproducer. Or I should say maybe I'm not sure how I would :)

Orbifoldt commented 1 year ago

Thanks for your replies @brunobat and @jamezp . I'll try working on some minimal reproducer.

In the mean time, could you maybe point me to where the REST_CLIENT_OTEL_SPAN_CLIENT_CONTEXT should be initialized? Then I can attach a debugger and try to find where it goes wrong

jamezp commented 1 year ago

@Orbifoldt Awesome, thanks!

It looks like in the ClientRequestFilter portion https://github.com/quarkusio/quarkus/blob/8f0e94e0238ba4ee75d2cbaada1c72be01edac43/extensions/opentelemetry/runtime/src/main/java/io/quarkus/opentelemetry/runtime/tracing/intrumentation/restclient/OpenTelemetryClientFilter.java#L99.

Orbifoldt commented 1 year ago

The spanContext is not properly initialized, it's already null when its added as a property to the request. This seems to originate from persisting the span into the OpenTelemetry Context.

In debugging I didn't get very far in finding the root cause of this. But maybe this means something to someone, so this is what I found:

Orbifoldt commented 1 year ago

@jamezp I've set up a reproducer here: https://github.com/Orbifoldt/quarkus-otel-azure-npe-reproduction Seems like you don't even need a valid Applicaiton Insights resource to have this error happen

franden commented 1 year ago

we have the same issue with Azure Fuctions + JAXRS client + OpenTelemetry. Any idea if this bug will be fixed in near future?

brunobat commented 1 year ago

It's scheduled for next quarter but is open for grabs.

franden commented 8 months ago

@brunobat the issue is till open. The last quarter of 2023 is over. Is there an updated timeline?

brunobat commented 8 months ago

Sorry, right now there is no estimate. Contributions are welcome!

johnrengelman commented 3 months ago

I'm encountering this same issue but with Ratpack. I've traced this down to something in the SpanKey AOP code, but since you can't debug AOP injections, I'm stuck there. I can see my debugger getting into this method, but I get to at least the Bridging.toAgentOrNull, but can't definitely see what it exits from there.

Some additional info for me - I'm using Ratpack 1.9 and I have both the javaagent and the instrumentation library added - it seems this is needed for Ratpack, because you have to manually wire the HttpClient (cannot find any documentation specifying not to do this or even what the appropriate configuration is for this library).

If I remove the javaagent, then this error doesn't occur. If I remove the library, the error doesn't occur, but that's because the HttpClient instrumentation is not being done at all.