spring-projects / spring-boot

Spring Boot
https://spring.io/projects/spring-boot
Apache License 2.0
74.72k stars 40.58k forks source link

Add a property to disable HTTP Observations for actuator endpoints #34801

Open jonatan-ivanov opened 1 year ago

jonatan-ivanov commented 1 year ago

It seems there is a good amount of users who are registering an ObservationPredicate in order to disable Observations for /actuator/** endpoints, something like this:

@Bean
ObservationPredicate actuatorServerContextPredicate() {
    return (name, context) -> {
        if (name.equals("http.server.requests") && context instanceof ServerRequestObservationContext serverContext) {
            return !serverContext.getCarrier().getRequestURI().startsWith("<actuator-base-path>");
        }
        return true;
    };
}

I think this common use-case could be simplified by creating a similar bean (for mvc and webflux) and let the users to configure this using a single property.

jonatan-ivanov commented 1 year ago

Connected issue in Framework: https://github.com/spring-projects/spring-framework/issues/29210

Saljack commented 1 year ago

I implemented this solution in our services and it makes things worse than better because there is also Spring Security Observation. The security observation is not affected (because) this predicate disables only http.server.requests. This also creates multiple separated traces because they are created out of any parent trace id (http.server.requests trace). So then there are for example these traces:

643d472f5d60ba294b18360a79427310 - api-gateway: security filterchain after
643d472b0d5bde11b0fc3b53edc9ed36 - api-gateway: authorize -security-context-server-web-exchange
643d472646d2ef6b55c842b12e41d502 - api-gateway: security filterchain before

Of course I can disable Spring Security Observation but it does not make sense because then there can be for example Spring Data Observation.

jonatan-ivanov commented 1 year ago

There are a few connected issues to this, let me try to collect them here:

What you described as an issue can be a valid use-case for other users (ignoring parent observation but keeping children) but there is a solution/workaround to disable every Observation that was triggered through a specific HTTP endpoint (e.g.: /actuator) in https://github.com/spring-projects/spring-security/issues/12854, please see https://github.com/spring-projects/spring-security/issues/12854#issuecomment-1490838037 and https://github.com/spring-projects/spring-security/issues/12854#issuecomment-1491992099

Saljack commented 1 year ago

Ohh thanks. It is a nice summary. I was totally blind 🀦. Now it makes more sense to me. It is pretty hard to follow everything regarding observation because you need to follow/search multiple projects. I believe that most of these issues will be resolved in the Spring Boot 3.1. I really appreciate your work πŸ‘.

denniseffing commented 1 year ago

@jonatan-ivanov Do you have working workarounds to disable every Observation that was triggered through a specific HTTP endpoint for WebFlux as well? ☺️

jonatan-ivanov commented 1 year ago

I might have something that could work for you but I have never tried with webflux and I did not test it a lot for webmvc, I threw it out because of the RequestContextHolder solution above seemed much better:

I think you might be able to write a getRoot method that recursively can go upwards on the parent chain (context.parent) and find the one that has no (null) parent. At that point, you get the root context/observation, something like this:

private Observation.Context getRoot(Observation.Context current) {
    ObservationView parent = current.getParentObservation();
    if (parent == null) {
        return current;
    }
    else {
        return getRoot((Context) parent.getContextView());
    }
}

And if you got the root, you can check if it is an "HTTP context" and the path so you can ignore actuator, something like this:

@Bean
ObservationPredicate noActuatorServerObservations() {
    return (name, context) ->
        Context root = getRoot(context);
        if (root instanceof ServerRequestObservationContext serverContext) {
            return !serverContext.getCarrier().getRequestURI().startsWith("/actuator");
        }
        else {
            return true;
        }
    };
}

I guess another workaround could be getting the current request from Reactor somehow, maybe @chemicL or @bclozel could help in that.

chemicL commented 1 year ago

From Reactor's perspective, I think not much can be done, aside from clearing the ObservationThreadLocalAccessor.KEY from the Reactor Context, but that doesn't disable the Observation that is downstream in the operator chain: if it's started, it is in effect, only the propagation would not reach some parts, but the framework codebase is where the creation and start process needs to be eliminated.

saugion commented 1 year ago

I may not be in the right place, but I think my issue is related to this. I migrated a project from spring boot 2.7.x to 3.1.0, and the application always failed to start due to this error

{"stack_trace":"java.lang.NullPointerException: Cannot invoke \"String.hashCode()\" because \"this.name\" is null
    tat io.micrometer.core.instrument.Meter$Id.hashCode(Meter.java:363)
    tat java.base/java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
    tat io.micrometer.core.instrument.MeterRegistry.getOrCreateMeter(MeterRegistry.java:601)
    tat io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:577)
    tat io.micrometer.core.instrument.MeterRegistry.timer(MeterRegistry.java:323)
    tat io.micrometer.core.instrument.Timer$Builder.register(Timer.java:444)
    tat io.micrometer.core.instrument.observation.DefaultMeterObservationHandler.onStop(DefaultMeterObservationHandler.java:66)
    tat io.micrometer.tracing.handler.TracingAwareMeterObservationHandler.onStop(TracingAwareMeterObservationHandler.java:84)
    tat io.micrometer.observation.ObservationHandler$FirstMatchingCompositeObservationHandler.onStop(ObservationHandler.java:197)
    tat io.micrometer.observation.SimpleObservation.lambda$notifyOnObservationStopped$0(SimpleObservation.java:286)
    tat java.base/java.util.ArrayDeque$DescendingIterator.forEachRemaining(ArrayDeque.java:772)
    tat io.micrometer.observation.SimpleObservation.notifyOnObservationStopped(SimpleObservation.java:286)
    tat io.micrometer.observation.SimpleObservation.stop(SimpleObservation.java:194)
    tat org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:123)
    tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)

I don't have any custom metric in the application, it is not due to any custom code.

I ended up adding this bean that solves the issue, but there should not be a null metric trying to be registered!

@Bean
    ObservationRegistryCustomizer<ObservationRegistry> noObservations() {
        ObservationPredicate predicate = (name, context) -> nonNull(name);
        return registry -> registry.observationConfig().observationPredicate(predicate);
    }
bclozel commented 1 year ago

@saulgiordani can you create a new issue in the Spring Framework project with a sample application reproducing the problem? Thanks!

saugion commented 1 year ago

@saulgiordani can you create a new issue in the Spring Framework project with a sample application reproducing the problem? Thanks!

Hi @bclozel, the issue was on my side. I implemented the interface ServerRequestObservationConvention without overriding the getName() method, this is why it was returning null

goafabric commented 11 months ago

@goafabric: remindme

codefromthecrypt commented 7 months ago

if it helps progress this, I would suggest peeling off the health endpoint as something easier to land maybe? Use in docker and k8s fills up trace repositories with traces named like http get /actuator/health/{*path}

This is a bad experience for those migrating from sleuth or anything else that skips health by default

codefromthecrypt commented 7 months ago

A workaround for docker is to configure the HEALTHCHECK to add the header 'b3: 0' (if using brave, as even when outbound propagation is set to b3 multi, inbound leniently accepts this).

k8s is more complicated because some define liveness and readiness in a deployment (e.g. helm) and that variant doesn't seem to accept http headers 😒

Note this won't work for w3c, as they intentionally decided to not define a way to say "never sample"

jonatan-ivanov commented 7 months ago

@codefromthecrypt You might be able to do this as a workaround:

@Bean
ObservationPredicate noActuatorServerObservations() {
    return (name, context) -> {
        if (name.equals("http.server.requests") && context instanceof ServerRequestObservationContext serverContext) {
            return !serverContext.getCarrier().getRequestURI().startsWith("/actuator");
        }
        else {
            return true;
        }
    };
}
codefromthecrypt commented 7 months ago

note @nidhi-nair seemed to need to look up a root context to implement this, though I don't know if it was project-specific https://github.com/appsmithorg/appsmith/pull/24758

codefromthecrypt commented 7 months ago

@jonatan-ivanov thanks I was able to get something similar working https://github.com/openzipkin/brave-example/pull/120

codefromthecrypt commented 7 months ago

one thing that became apparent in this exercise is that this is the perfect thing to normalize in a property at least until there's a generic HTTP path accessor. Depending on if you are reactive or not, you have to define this predicate with different types. I think that's not ideal especially for something that used to work via property.

jonatan-ivanov commented 7 months ago

note @nidhi-nair seemed to need to look up a root context to implement this, though I don't know if it was project-specific

Yes, you need to find the root in some use-cases. I would say it is use-case specific: if you have more than one spans and you don't have access to the original request (or something that the root has), you need to look the root up, here's another use case with some possible shortcuts in Spring Security: https://github.com/spring-projects/spring-security/issues/12854

@jonatan-ivanov thanks I was able to get something similar working https://github.com/openzipkin/brave-example/pull/120

πŸ‘πŸΌ

one thing that became apparent in this exercise is that this is the perfect thing to normalize in a property [...]

I agree, I also think that this could be more generic (not actuator-only) where the users are able to define the path prefix.

mfvanek commented 1 month ago

We run our Spring Boot apps on Kubernetes. We always move actuator to the separate port. We collect liveness probe from actuator port but readiness probe is collected from application main port due to reliability issues (escaping restarts of apps under overloading).

Our typical configuration looks like (see additional-path):

management:
  server:
    port: 8081
  endpoint:
    health:
      probes.enabled: true
      group:
        liveness:
          include: livenessState
          additional-path: server:/livez
        readiness:
          include: readinessState
          additional-path: server:/readyz

It would be nice to automatically disable observations for additional endpoints as well (according to additional-path).