The issue is certainly present in Vert.x 4.5.0, I haven't tried to reproduce it on older versions.
Context
The OpenTelemetry Span exporters (reproduced with Jaeger gRPC and OTLP gRPC) started throwing NullPointerException after updating to OpenTelemetry 1.32.0. The exception states that the vertxCtx is null.
java.lang.NullPointerException: Cannot invoke "io.vertx.core.Context.getLocal(Object)" because "vertxCtx" is null
at io.vertx.tracing.opentelemetry.VertxContextStorageProvider$VertxContextStorage.attach(VertxContextStorageProvider.java:27) ~[vertx-opentelemetry-4.5.0.jar:4.5.0]
at io.vertx.tracing.opentelemetry.VertxContextStorageProvider$VertxContextStorage.attach(VertxContextStorageProvider.java:23) ~[vertx-opentelemetry-4.5.0.jar:4.5.0]
at io.opentelemetry.context.Context.makeCurrent(Context.java:202) ~[opentelemetry-context-1.32.0.jar:1.32.0]
at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:211) ~[opentelemetry-context-1.32.0.jar:1.32.0]
at io.opentelemetry.exporter.internal.InstrumentationUtil.suppressInstrumentation(InstrumentationUtil.java:28) ~[opentelemetry-exporter-common-1.32.0.jar:1.32.0]
at io.opentelemetry.exporter.sender.okhttp.internal.OkHttpGrpcSender.send(OkHttpGrpcSender.java:116) ~[opentelemetry-exporter-sender-okhttp-1.32.0.jar:1.32.0]
at io.opentelemetry.exporter.internal.grpc.GrpcExporter.export(GrpcExporter.java:61) ~[opentelemetry-exporter-common-1.32.0.jar:1.32.0]
at io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter.export(OtlpGrpcSpanExporter.java:73) ~[opentelemetry-exporter-otlp-1.32.0.jar:1.32.0]
at io.opentelemetry.sdk.trace.export.SimpleSpanProcessor.onEnd(SimpleSpanProcessor.java:81) ~[opentelemetry-sdk-trace-1.32.0.jar:1.32.0]
at io.opentelemetry.sdk.trace.SdkSpan.endInternal(SdkSpan.java:451) ~[opentelemetry-sdk-trace-1.32.0.jar:1.32.0]
at io.opentelemetry.sdk.trace.SdkSpan.end(SdkSpan.java:431) ~[opentelemetry-sdk-trace-1.32.0.jar:1.32.0]
at io.vertx.tracing.opentelemetry.OpenTelemetryTracer.end(OpenTelemetryTracer.java:97) ~[vertx-opentelemetry-4.5.0.jar:4.5.0]
at io.vertx.tracing.opentelemetry.OpenTelemetryTracer.receiveResponse(OpenTelemetryTracer.java:143) ~[vertx-opentelemetry-4.5.0.jar:4.5.0]
at io.vertx.tracing.opentelemetry.OpenTelemetryTracer.receiveResponse(OpenTelemetryTracer.java:31) ~[vertx-opentelemetry-4.5.0.jar:4.5.0]
at io.vertx.kafka.client.common.tracing.ProducerTracer$StartedSpan.finish(ProducerTracer.java:89) ~[vertx-kafka-client-4.5.0.jar:4.5.0]
at io.vertx.kafka.client.producer.impl.KafkaWriteStreamImpl.lambda$null$2(KafkaWriteStreamImpl.java:111) ~[vertx-kafka-client-4.5.0.jar:4.5.0]
at org.apache.kafka.clients.producer.KafkaProducer$AppendCallbacks.onCompletion(KafkaProducer.java:1505) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:270) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:234) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.ProducerBatch.complete(ProducerBatch.java:180) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:692) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:663) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.Sender.lambda$null$1(Sender.java:589) ~[kafka-clients-3.5.0.jar:?]
at java.util.ArrayList.forEach(Unknown Source) ~[?:?]
at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:576) ~[kafka-clients-3.5.0.jar:?]
at java.lang.Iterable.forEach(Unknown Source) ~[?:?]
at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:576) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$5(Sender.java:850) ~[kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:154) [kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:594) [kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:586) [kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:328) [kafka-clients-3.5.0.jar:?]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:243) [kafka-clients-3.5.0.jar:?]
at java.lang.Thread.run(Unknown Source) [?:?]
Do you have a reproducer?
No, but I pinned down the issue pretty clearly. The stack trace above occurs on the Kafka Producer Network Thread. This thread is obviously not a Vert.x thread (also none with virtual threads support, as added in Vert.x 4.5.0), so it is not expected to have a Vert.x context. Since OpenTelemetry 1.32.0, a new utility is added, which suppresses instrumentation for certain blocks of code. This is achieved by putting a special OpenTelemetry Context into scope, which contains a flag to disable instrumentation:
This utility is used around the network calls of the span exporters, and thus tries to bring the special context into scope. Because there is no Vert.x context in this thread, the code at VertxContextStorageProvider.java:27 throws the NullPointerException.
As an additional remark, this could also hint at an issue in the Vert.x Kafka Client. I would expect the KafkaWriteStreamImpl to delegate back to the Event Loop right in between these stack frames:
at io.vertx.kafka.client.producer.impl.KafkaWriteStreamImpl.lambda$null$2(KafkaWriteStreamImpl.java:111) ~[vertx-kafka-client-4.5.0.jar:4.5.0]
at org.apache.kafka.clients.producer.KafkaProducer$AppendCallbacks.onCompletion(KafkaProducer.java:1505) ~[kafka-clients-
but that doesn't appear to happen. Would this be worth investigating?
Proposed Solution
I would suggest that VertxContextStorage delegates to OpenTelemetry's default ThreadLocalContextStorage when called on a thread where there's no Vert.x context. This would enable OpenTelemetry tracing also on non-Vert.x threads while not violating any Vert.x guarantees. What do you think?
Version
The issue is certainly present in Vert.x 4.5.0, I haven't tried to reproduce it on older versions.
Context
The OpenTelemetry Span exporters (reproduced with Jaeger gRPC and OTLP gRPC) started throwing
NullPointerException
after updating to OpenTelemetry 1.32.0. The exception states that thevertxCtx
isnull
.Do you have a reproducer?
No, but I pinned down the issue pretty clearly. The stack trace above occurs on the Kafka Producer Network Thread. This thread is obviously not a Vert.x thread (also none with virtual threads support, as added in Vert.x 4.5.0), so it is not expected to have a Vert.x context. Since OpenTelemetry 1.32.0, a new utility is added, which suppresses instrumentation for certain blocks of code. This is achieved by putting a special OpenTelemetry Context into scope, which contains a flag to disable instrumentation:
https://github.com/open-telemetry/opentelemetry-java/compare/v1.31.0...v1.32.0#diff-e0f86f9c669c5dfdee741804e40a62f9b122ec756c2df1acc56a8d7a9ff3247fR28
This utility is used around the network calls of the span exporters, and thus tries to bring the special context into scope. Because there is no Vert.x context in this thread, the code at
VertxContextStorageProvider.java:27
throws theNullPointerException
.As an additional remark, this could also hint at an issue in the Vert.x Kafka Client. I would expect the
KafkaWriteStreamImpl
to delegate back to the Event Loop right in between these stack frames:but that doesn't appear to happen. Would this be worth investigating?
Proposed Solution
I would suggest that
VertxContextStorage
delegates to OpenTelemetry's defaultThreadLocalContextStorage
when called on a thread where there's no Vert.x context. This would enable OpenTelemetry tracing also on non-Vert.x threads while not violating any Vert.x guarantees. What do you think?If you agree, I am open to submit a PR.