smallrye / smallrye-reactive-messaging

SmallRye Reactive Messaging
http://www.smallrye.io/smallrye-reactive-messaging/
Apache License 2.0

Attach exceptions to spans in error handlers for better Observability #2782

Open michalcukierman opened 1 month ago

michalcukierman commented 1 month ago

It appears that attaching exceptions to spans in error handlers could be implemented relatively easily, improving the observability of failures.

The approach taken in the Quarkus handler (AttachExceptionHandler) uses the following line to record exceptions:

LocalRootSpan.current().recordException(throwable);

We could adopt a similar approach in PulsarIncomingChannel to enhance traceability. For example, modifying the method here: smallrye-reactive-messaging-pulsar/PulsarIncomingChannel.java#L199:

public synchronized void reportFailure(Throwable failure, boolean fatal) {
    // Don't keep all the failures, only keep them for reporting.
    if (failures.size() == 10) {
        failures.remove(0);
    }
    failures.add(failure);

    // Attach the exception to the current span for observability
    LocalRootSpan.current().recordException(failure);

    if (fatal) {
        close();
    }
}

Adding this simple modification would provide a more comprehensive view of exceptions for observability tools like OpenTelemetry, Jaeger, or similar, enabling better tracking and debugging of issues across the system.
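To make the proposed behavior easier to reason about in isolation, here is a minimal sketch in plain Java. It is not the actual PulsarIncomingChannel: `FailureReporter` and the `spanRecorder` hook are hypothetical names introduced for illustration, and the hook stands in for a call such as `LocalRootSpan.current().recordException(failure)`. The sketch also assumes that a failing tracer should never interrupt failure handling, which is a design choice rather than something the existing code does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the channel's failure bookkeeping (illustration only).
final class FailureReporter {
    private static final int MAX_FAILURES = 10;

    private final List<Throwable> failures = new ArrayList<>();
    // Assumed hook: a real channel could pass
    // t -> LocalRootSpan.current().recordException(t) here.
    private final Consumer<Throwable> spanRecorder;
    private boolean closed;

    FailureReporter(Consumer<Throwable> spanRecorder) {
        this.spanRecorder = spanRecorder;
    }

    synchronized void reportFailure(Throwable failure, boolean fatal) {
        // Keep only the most recent failures for reporting.
        if (failures.size() == MAX_FAILURES) {
            failures.remove(0);
        }
        failures.add(failure);

        // Recording on the span is best effort: a tracing problem
        // must not prevent the failure from being handled.
        try {
            spanRecorder.accept(failure);
        } catch (RuntimeException ignored) {
            // Intentionally swallowed in this sketch.
        }

        if (fatal) {
            closed = true; // stands in for close() in the original method
        }
    }

    synchronized List<Throwable> failures() {
        return new ArrayList<>(failures);
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

Passing the recorder in as a `Consumer<Throwable>` keeps the bookkeeping testable without a tracing library on the classpath; whether the real channel should take that indirection is open for discussion.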

I can open a PR if you think this analysis is correct.

ozangunalp commented 2 weeks ago

Hi @michalcukierman, I've had this at the back of my mind, and had time to have a quick look.

For incoming messages this is not applicable: by the time the consuming method is called, the "consume" span has already been closed. For deserialization errors, the span has not been created yet.

As for processing errors, if the method is annotated with @WithSpan, the exception is already captured and the span is finished with it recorded.

For outgoing messages, maybe we can do something. I think we currently end the span without waiting for the message acknowledgement. We could instead wait for the message delivery ack before ending the span, so that a delivery failure can still be recorded on it.
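As a rough illustration of the outgoing idea, the sketch below ends a span only after the acknowledgement completes, recording the failure first if the ack fails. It uses only the JDK: `SketchSpan`, `RecordingSpan`, and `AckAwareTracing.endSpanOnAck` are hypothetical names, not part of SmallRye Reactive Messaging or OpenTelemetry, and modeling the ack as a `CompletionStage<Void>` is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionStage;

// Hypothetical minimal span abstraction standing in for a tracing API span.
interface SketchSpan {
    void recordException(Throwable t);
    void end();
}

// Records calls in order so the sequencing can be inspected.
final class RecordingSpan implements SketchSpan {
    final List<String> events = new ArrayList<>();

    @Override
    public synchronized void recordException(Throwable t) {
        events.add("exception:" + t.getMessage());
    }

    @Override
    public synchronized void end() {
        events.add("end");
    }
}

final class AckAwareTracing {
    private AckAwareTracing() {
    }

    // Defers span.end() until the (assumed) acknowledgement stage completes.
    // On a failed ack, the exception is recorded before the span is ended.
    static CompletionStage<Void> endSpanOnAck(CompletionStage<Void> ack, SketchSpan span) {
        return ack.whenComplete((ignored, failure) -> {
            if (failure != null) {
                span.recordException(failure);
            }
            span.end();
        });
    }
}
```

With this shape, a span stays open while the message is in flight, so its duration would include delivery time. That changes what the span measures, which may or may not be desirable for existing dashboards.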