spring-projects / spring-security

Spring Security
http://spring.io/projects/spring-security
Apache License 2.0
8.82k stars 5.9k forks source link

"Span wasn't started - an observation must be started (not only created)" (Micrometer) due to observation handling in Spring Security Web? #14445

Closed janekbettinger closed 9 months ago

janekbettinger commented 10 months ago

Describe the bug In all of our Spring WebFlux-based applications having integrated Spring Security Web and Micrometer, we sporadically see the following exception:

java.lang.IllegalStateException: Span wasn't started - an observation must be started (not only created)
        at io.micrometer.tracing.handler.TracingObservationHandler.getRequiredSpan(TracingObservationHandler.java:209) ~[micrometer-tracing-1.2.1.jar:1.2.1]
        Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Error has been observed at the following site(s):
        *__checkpoint ⇢ ServerWebExchangeReactorContextWebFilter [DefaultWebFilterChain]
        *__checkpoint ⇢ org.springframework.security.web.server.WebFilterChainProxy [DefaultWebFilterChain]
        *__checkpoint ⇢ HTTP GET "/actuator/health" [ExceptionHandlingWebHandler]
Original Stack Trace:
                at io.micrometer.tracing.handler.TracingObservationHandler.getRequiredSpan(TracingObservationHandler.java:209) ~[micrometer-tracing-1.2.1.jar:1.2.1]
                at io.micrometer.tracing.handler.DefaultTracingObservationHandler.onStop(DefaultTracingObservationHandler.java:50) ~[micrometer-tracing-1.2.1.jar:1.2.1]
                at io.micrometer.observation.ObservationHandler$FirstMatchingCompositeObservationHandler.onStop(ObservationHandler.java:197) ~[micrometer-observation-1.12.1.jar:1.12.1]
                at io.micrometer.observation.SimpleObservation.lambda$notifyOnObservationStopped$0(SimpleObservation.java:268) ~[micrometer-observation-1.12.1.jar:1.12.1]
                at java.base/java.util.ArrayDeque$DescendingIterator.forEachRemaining(ArrayDeque.java:771) ~[na:na]
                at io.micrometer.observation.SimpleObservation.notifyOnObservationStopped(SimpleObservation.java:268) ~[micrometer-observation-1.12.1.jar:1.12.1]
                at io.micrometer.observation.SimpleObservation.stop(SimpleObservation.java:188) ~[micrometer-observation-1.12.1.jar:1.12.1]
                at org.springframework.security.web.server.ObservationWebFilterChainDecorator$PhasedObservation.close(ObservationWebFilterChainDecorator.java:763) ~[spring-security-web-6.2.1.jar:6.2.1]
                at org.springframework.security.web.server.ObservationWebFilterChainDecorator$AroundWebFilterObservation$SimpleAroundWebFilterObservation.close(ObservationWebFilterChainDecorator.java:301) ~[spring-security-web-6.2.1.jar:6.2.1]
                at org.springframework.security.web.server.ObservationWebFilterChainDecorator$AroundWebFilterObservation$SimpleAroundWebFilterObservation.lambda$wrap$3(ObservationWebFilterChainDecorator.java:366) ~[spring-security-web-6.2.1.jar:6.2.1]
                at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onComplete(MonoPeekTerminal.java:289) ~[reactor-core-3.6.1.jar:3.6.1]
                ...
                at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:252) ~[reactor-core-3.6.1.jar:3.6.1]
                at io.micrometer.context.ContextSnapshot.lambda$wrap$0(ContextSnapshot.java:91) ~[context-propagation-1.1.0.jar:1.1.0]
                ...
                at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]

(full stacktrace)

It mainly happens on /actuator/health calls, but not exclusively. E.g. with Spring Cloud Gateway we randomly see this exceptions on non-routable routes as well. Closed connections can provoke the behavior as well.

The issue might be caused by the Observation handling by Spring Security Web in o.s.security.web.server.ObservationWebFilterChainDecorator. Replacing start() with close() in the onSuccess handler of the WebFilterChain wrapper seems to prevent the failure.

Related issues

To Reproduce Follow the README.md in the minimal, reproducible sample linked below.

Expected behavior Only started observations should be stopped.

Sample https://github.com/janekbettinger/micrometer-spring-security-sample

Yocker95k commented 10 months ago

We have the same issue which is really annoying since it spams our monitoring and alerting system in Prod.

janekbettinger commented 9 months ago

@jzheaux might be able to support here. Thanks in advance!

jzheaux commented 9 months ago

Thanks, @janekbettinger, for the reproducer. Please try with the latest fix, which is available in the 6.1.7-SNAPSHOT, 6.2.2-SNAPSHOT, and 6.3.0-SNAPSHOT releases.

janekbettinger commented 9 months ago

Thanks for the fix, @jzheaux, it looks very reasonable! With the sample application and 6.2.2-SNAPSHOT I can no longer reproduce the bug. I'll let you know if we ever see it in our production apps.

sfotedar995 commented 8 months ago

Hi @jzheaux ,

We were facing the issue: IllegalStateException: Span wasn't started - an observation must be started (not only created)

Currently, the spring boot version we are using is 3.1.2.

Could you confirm in which version of spring boot this issue is fixed along with its micrometer version?

janekbettinger commented 8 months ago

Hi @sfotedar995, Spring Security 6.2.2 which includes the fix is shipped with Spring Boot (security starter) 3.2.3.

sfotedar995 commented 8 months ago

Hi @janekbettinger not with spring boot version 3.2.1 ??

kzander91 commented 8 months ago

It's also included in Spring Security 6.1.7, which comes with Spring Boot 3.1.9, see https://github.com/spring-projects/spring-boot/releases/tag/v3.1.9.

detached commented 6 months ago

Still happening in our application with Spring-Boot 3.2.5 and Security 6.2.4