open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Exclude URLs from Tracing #1060

Open jakob-o opened 3 years ago

jakob-o commented 3 years ago

Is your feature request related to a problem? Please describe.

As already mentioned here https://github.com/open-telemetry/opentelemetry-specification/issues/173 I'd like to be able to exclude or sample a list of URLs / URL-Patterns from instrumentation. In my case particularly to avoid generating many events from health- and liveness-checks.

Describe the solution you'd like

I opened the issue https://github.com/open-telemetry/opentelemetry-java/issues/1552 to discuss if / how tracing might be disabled from instrumentation. To my knowledge there currently is no API / SDK method to disable tracing centrally on the context. If I understood correctly a (maybe temporary) solution might be to create a non-recording / invalid span in the HTTP instrumentation, which due to the ParentOrElse-Sampler, would lead to ignoring child spans as well, if the request matches a URL pattern. Any hint to where this would be architecturally appropriately implemented is highly appreciated. Maybe in the HttpServerTracer?

Describe alternatives you've considered

We already attempted to use the otel.trace.classes.exclude but only succeeded in completely disabling WebMvc instrumentation.

/CC @gabrielthunig @spaletta

iNikem commented 3 years ago

If I understood correctly a (maybe temporary) solution might be to create a non-recording / invalid span in the HTTP instrumentation, which due to the ParentOrElse-Sampler, would lead to ignoring child spans as well, if the request matches a URL pattern. Any hint to where this would be architecturally appropriately implemented is highly appreciated. Maybe in the HttpServerTracer

That seems like a reasonable approach :)

anuraaga commented 3 years ago

Been trying auto instrumentation in a container lately, and was slightly annoyed myself with tracing of health check. Then a customer trying it out independently had the same feedback - it's exactly the kind of input we're hoping for in trials :) So I may actually mark this required for GA, the UX is impacted a lot with having no control of tracing by URL pattern.

pavolloffay commented 3 years ago

I have started looking into this.

Is there a proposed design for this? In OpenTracing this was an instrumentation feature: the instrumentation checked whether the URL matched an exclude pattern and, if so, did not create the span. However, if the excluded URL used another instrumentation (or made a downstream call), that would still create a span. The question is whether we want to exclude just specific URLs or the whole trace starting at that URL.

iNikem commented 3 years ago

Yes, we have a proposal right in the task's description:

If I understood correctly a (maybe temporary) solution might be to create a non-recording / invalid span in the HTTP instrumentation, which due to the ParentOrElse-Sampler, would lead to ignoring child spans as well, if the request matches a URL pattern. Any hint to where this would be architecturally appropriately implemented is highly appreciated. Maybe in the HttpServerTracer?

This should lead to ignoring the whole subtree starting from that SERVER span.

pavolloffay commented 3 years ago

+1 I think that is the right way to go https://github.com/open-telemetry/opentelemetry-specification/issues/173#issuecomment-698190021. It would also make sense to have a consolidated config property for this.

@iNikem what is the API to create non-recording span? In OT there was sampling.priority=bool tag that could be applied on the span builder https://github.com/opentracing/specification/blob/master/semantic_conventions.yaml#L26

iNikem commented 3 years ago

I think the right way is to use one of the factory methods on io.opentelemetry.trace.DefaultSpan.

pavolloffay commented 3 years ago

Yeah, I wanted to avoid having two paths of span creation.

anuraaga commented 3 years ago

Sorry if double-spam, I thought I had already posted this. How about we have a special Sampler itself configured that delegates to the default, except for when the path matches the allow list, then it's ParentOnly? It means we need to refactor to make sure our tracers set attributes on Span.Builder instead of Span - a little annoying but we should have been doing that already so maybe good motivation for it.
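The delegating idea above can be modelled in plain Java without the SDK. This is only a rough sketch under stated assumptions: `Decision`, `PathSampler`, and `ExcludingSampler` are hypothetical stand-ins rather than OpenTelemetry types, and a real implementation would presumably implement the SDK's `Sampler` interface and read the path from the attributes set on the span builder.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a sampling decision; not an OpenTelemetry type. */
enum Decision { RECORD_AND_SAMPLE, DROP }

/** Hypothetical stand-in for a sampler that sees the request path at span start. */
interface PathSampler {
    // parentSampled is null for a root span, otherwise the parent's decision.
    Decision shouldSample(Boolean parentSampled, String path);
}

/** Delegates to a default sampler unless the path matches a configured pattern. */
final class ExcludingSampler implements PathSampler {
    private final List<Pattern> patterns;
    private final PathSampler delegate;

    ExcludingSampler(List<String> regexes, PathSampler delegate) {
        this.patterns = regexes.stream().map(Pattern::compile).collect(Collectors.toList());
        this.delegate = delegate;
    }

    @Override
    public Decision shouldSample(Boolean parentSampled, String path) {
        boolean matches = path != null
                && patterns.stream().anyMatch(p -> p.matcher(path).matches());
        if (matches) {
            // Parent-only behaviour: follow a sampled parent, otherwise drop.
            return Boolean.TRUE.equals(parentSampled)
                    ? Decision.RECORD_AND_SAMPLE
                    : Decision.DROP;
        }
        // Everything else goes through the default sampler unchanged.
        return delegate.shouldSample(parentSampled, path);
    }
}
```

With a delegate that always samples and the patterns `/health` and `/actuator/.*`, a root request to `/actuator/health` is dropped, while `/orders/1` is still handed to the delegate.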

anuraaga commented 3 years ago

Note that DefaultSpan might get renamed to something that would be naming-wise not a good fit with what we want to do here https://github.com/open-telemetry/opentelemetry-specification/pull/994#pullrequestreview-495461768

iNikem commented 3 years ago

refactor to make sure our tracers set attributes on Span.Builder instead of Span

Yes, we want to do that eventually.

cemo commented 3 years ago

Is there a way now to exclude health check traces? I checked processors but could not find any solution too.

iNikem commented 3 years ago

No, this functionality is not yet implemented.

irl-segfault commented 2 years ago

bump. Is there any workaround here in the meantime? Sampling of health checks is not ideal.

iNikem commented 2 years ago

bump. Is there any workaround here in the meantime? Sampling of health checks is not ideal.

The only known workaround is to write custom sampler.

But I have plans to address this issue during the next month or so.

iNikem commented 2 years ago

The corresponding Sampler has been implemented in the contrib repo. It can be added to your deployment using the extension mechanism. There are no immediate plans to add that sampler to this distribution, as this requires changes to the OTel Specification and that requires some effort.

cemo commented 2 years ago

@iNikem what is the proposed way to configure these classes from environment? Is there any mechanism to bind jvm parameter to fields?

iNikem commented 2 years ago

@iNikem what is the proposed way to configure these classes from environment?

There is no such way. To use Sampler from the contrib repo you have to add it to your deployment via extension.

trask commented 2 years ago

@iNikem what is the proposed way to configure these classes from environment?

There is no such way. To use Sampler from the contrib repo you have to add it to your deployment via extension.

👍

@cemo The current configuration is limited to key/value pairs, which doesn't model the sampler rules well. In the future, we're hoping to have a richer configuration file: https://github.com/open-telemetry/opentelemetry-specification/issues/1773. Then we could add experimental configuration support for this. And eventually we're hoping for these sampler rules (or something like them) to be spec'd and supported cross-language.

gtuk commented 2 years ago

@iNikem @trask Could you provide an example of exactly how to use the RuleBasedRoutingSampler? What I did so far is:

But how do i pass the rules to the agent? In the DemoSampler https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/examples/extension/src/main/java/com/example/javaagent/DemoSampler.java it's more clear as the rule/condition is directly inside the class

trask commented 2 years ago

hi @gtuk! opentelemetry-contrib-samplers isn't published as an opentelemetry javaagent extension, so you'll need to build your own extension that consumes (and programmatically configures) the rules

RashmiRam commented 2 years ago

@trask The extensions are working out pretty neat 🎉 . Thanks for that. One question though: the Sampler's shouldSample only works with the following data to set the sampling decision.

  1. parentContext
  2. traceId
  3. name
  4. spanKind
  5. attributes
  6. parentLinks

If we want the sampling decision to be based on either the span's status or the span's events, is there any way to achieve it using an extension sampler?

trask commented 2 years ago

hey @RashmiRam! check out #3907, and there was recent discussion also about this: https://github.com/open-telemetry/opentelemetry-java/issues/3963#issuecomment-988460277

RashmiRam commented 2 years ago

Thanks @trask. That was really helpful. Yes. https://github.com/open-telemetry/opentelemetry-java/issues/3963#issuecomment-988460277 is also my exact use case. Wanted to achieve the same thing to push all errors irrespective of the sampling decision. Do you think it is a valid ask to have configuration in SDK to push all errors irrespective of sampling?

trask commented 2 years ago

Do you think it is a valid ask to have configuration in SDK to push all errors irrespective of sampling?

the problem is that when the sampler decides not to sample a span (which happens at the start of a span), the SDK doesn't capture any telemetry on that span. this makes unsampled spans very efficient, but it makes it impossible to change your mind at the end of an unsampled span.
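A toy model may make this easier to see. The sketch below is not SDK code: `ModelSpan` is a hypothetical class that only mimics the behaviour described above, where the decision is fixed when the span starts and an unsampled span silently discards everything recorded on it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a span whose sampling decision is fixed at creation. */
final class ModelSpan {
    private final boolean recording;
    private final List<String> events = new ArrayList<>();
    private String status = "UNSET";

    ModelSpan(boolean sampledAtStart) {
        this.recording = sampledAtStart;
    }

    /** Events are only stored when the span was sampled at start. */
    void addEvent(String name) {
        if (recording) {
            events.add(name);
        }
    }

    /** Status changes are ignored on an unsampled span. */
    void setStatus(String newStatus) {
        if (recording) {
            status = newStatus;
        }
    }

    boolean isRecording() {
        return recording;
    }

    List<String> events() {
        return Collections.unmodifiableList(events);
    }

    String status() {
        return status;
    }
}
```

In this model, calling `setStatus("ERROR")` on a span created with `sampledAtStart = false` leaves the status at `UNSET` and the event list empty, so by the time the span ends there is nothing left to export even if the request failed.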

welshm commented 2 years ago

Maybe there's a better place to ask for help, but I'm trying to resolve this problem with the extension workaround. When attempting to run the extension example here I get an error because I'm trying to build for Java8 as opposed to Java11 - is that a strict requirement for using extensions?

I tried to downgrade some gradle dependencies, but I'm not that familiar with Gradle myself (mostly use Maven). If there's a better place to ask questions or get help, please let me know!

trask commented 2 years ago

hi @welshm! try building with Java 11, it should still produce Java 8 compatible code

xpicio commented 2 years ago

hello, is there any news about this feature ?

mnadeem commented 1 year ago

Any update on this ?

Montyroi commented 1 year ago

Can anyone provide example code (e.g. a GitHub link) showing how to exclude URLs, such as health check URLs?

mateuszrzeszutek commented 1 year ago

Hey @Montyroi , This feature has not been implemented in the "core" javaagent yet. However, you can write a sampler yourself and package it in an extension. You can use the rule based sampler from the contrib repo that Nikita mentioned a couple posts earlier:

The corresponding Sampler has been implemented in the contrib repo. It can be added to your deployment using the extension mechanism. There are no immediate plans to add that sampler to this distribution, as this requires changes to the OTel Specification and that requires some effort.

Or implement one from scratch, e.g. https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/examples/extension/src/main/java/com/example/javaagent/DemoSampler.java

surajsjarali commented 1 year ago

Hey @mateuszrzeszutek @trask @iNikem I am trying to exclude the health check endpoint. Even though spanKind is INTERNAL and parentContext contains /actuator/health, I still see the health check traces. How can I exclude/drop the health check traces?

if (spanKind == SpanKind.INTERNAL && parentContext.toString().contains("/actuator/health")) {
      return SamplingResult.create(SamplingDecision.DROP);
    }

Command used to run it

java -javaagent:/Users/Downloads/jaegar/build/otel/opentelemetry-javaagent.jar "-Dotel.resource.attributes=service.name=test3" "-Dotel.traces.exporter=jaeger" "-Dotel.javaagent.extensions=/Users/Documents/opentelemetry-java-instrumentation/examples/extension/build/libs/opentelemetry-java-instrumentation-extension-demo-1.0-all.jar" "-Dotel.exporter.otlp.traces.endpoint=http://localhost:14250" -jar /Users/Downloads/jaegarr/build/libs/jaegarr1-0.0.1-SNAPSHOT.jar

surajsjarali commented 1 year ago

Hey @Montyroi , This feature has not been implemented in the "core" javaagent yet. However, you can write a sampler yourself and package it in an extension. You can use the rule based sampler from the contrib repo that Nikita mentioned a couple posts earlier:

The corresponding Sampler has been implemented in the contrib repo. It can be added to your deployment using the extension mechanism. There are no immediate plans to add that sampler to this distribution, as this requires changes to the OTel Specification and that requires some effort.

Or implement one from scratch, e.g. https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/examples/extension/src/main/java/com/example/javaagent/DemoSampler.java

@mateuszrzeszutek how do we configure the list of rules for rule-based sampling? Could you provide an example?

jack-berg commented 1 year ago

This has come up a number of times for our customers as well, so I've updated our otel java agent example to package an extension which uses the rule-based sampler to drop requests to spring boot actuator endpoints. Maybe seeing concrete sample code will clear up any confusion.

roadSurfer commented 1 year ago

Thank you so much for this, @jack-berg. I got my own version of it running now. For anyone else using Maven, it seems you need to use the following in order to create an extension fat-jar with all the runtime dependencies:

<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <appendAssemblyId>false</appendAssemblyId>
        <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
    </configuration>
</plugin>

I was unable to get it to run successfully when trying to add to the classpath normally.

hadson172 commented 1 year ago

+1 any updates on the topic ? I guess it is impossible when using javaagent.jar?

trask commented 1 year ago

hi @hadson172! @jack-berg's example above is using the javaagent.jar

it's definitely not as simple as we would like it to be. at some point we would like to add configuration-based health check exclusions

degodev commented 1 year ago

+1 I would love to use the agent inside aws. Since the pricing is trace amount based, it is crucial to exclude actuator Endpoints.

pingping95 commented 1 year ago

I have no choice but to use it until the exclusion tracing function is released.

trask commented 1 year ago

for anyone who would like to work on adding this functionality to the base otel javaagent, here's a very high-level outline:

use a yaml configuration file to dynamically configure the RuleBasedRoutingSampler during startup of the javaagent

use a system property, e.g. -Dotel.sampling.rules.config=..., that users can use to specify the location of the yaml configuration file (later we will have a single yaml configuration file and we can consolidate, see recently merged configuration otep)

check out the jmx-metrics and metric view yaml configuration for some inspiration
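As a rough illustration of the first two steps of that outline, the sketch below resolves the file location from the system property and turns its contents into drop patterns. Several things here are assumptions rather than agent behaviour: the `SamplingRulesLoader` class is hypothetical, the property is only the example name from the outline, and the file format is simplified to one regex per line because the JDK ships no YAML parser. Wiring the resulting patterns into the RuleBasedRoutingSampler is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical loader; not part of the javaagent. */
final class SamplingRulesLoader {
    // Example property name from the outline above; not an existing agent option.
    static final String CONFIG_PROPERTY = "otel.sampling.rules.config";

    private SamplingRulesLoader() {}

    /** Returns the drop patterns from the configured file, or an empty list if unset. */
    static List<Pattern> loadDropPatterns() {
        String location = System.getProperty(CONFIG_PROPERTY);
        if (location == null || location.isEmpty()) {
            return Collections.emptyList();
        }
        List<String> lines;
        try {
            lines = Files.readAllLines(Paths.get(location), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read sampling rules: " + location, e);
        }
        List<Pattern> patterns = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Assumed simplified format: one regex per line, '#' starts a comment line.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            patterns.add(Pattern.compile(trimmed));
        }
        return patterns;
    }
}
```

A real implementation would more likely parse YAML with the library the agent already uses for its other YAML-based configuration, and would fail fast at startup with a clear message when the file is malformed.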

vmaleze commented 1 year ago

Based on @jack-berg examples, and the possibility to embed extension directly into the agent, I've created this simple project that provides both the java agent and the docker auto instrumentation that will ignore /health and /metrics calls.

turesheim commented 11 months ago

We have similar requirements and wrote an extension that allows you to dynamically configure what sampler to use and add filtering for span creation. Configuration can be done by simply changing the configuration file while the application is running, or also by using the optional REST service for configuring multiple agents. The service also exposes Prometheus compatible metrics. You might find it useful. See https://github.com/domstolene/da-otel-agent for details.

szilaszi commented 10 months ago

I did a work-around for this having a look at how the OpenTelemetryAutoConfiguration class builds the SdkTracerProvider where the sampler is being set for the bean in Spring.

I duplicated the code in my bean configuration after I saw that the original bean is conditional, and created a custom sampler, which I used to drop any health check related traces.

The caveat is that the actual health check traces are coming with no names to the sampler and denying all spans with no names may drop some unintended traces... For now we are just only doing a PoC for a product, so take this as it is :)

Here is the sample for this:

    @Bean
    SdkTracerProvider otelSdkTracerProvider(Environment environment, ObjectProvider<SpanProcessor> spanProcessors,
                                            Sampler sampler, ObjectProvider<SdkTracerProviderBuilderCustomizer> customizers) {
        String applicationName = environment.getProperty("spring.application.name", "application");
        SdkTracerProviderBuilder builder = SdkTracerProvider.builder()
                .setSampler(new HealthCheckExclusionSampler())
                .setResource(Resource.create(Attributes.of(ResourceAttributes.SERVICE_NAME, applicationName)));
        spanProcessors.orderedStream().forEach(builder::addSpanProcessor);
        customizers.orderedStream().forEach((customizer) -> customizer.customize(builder));
        return builder.build();
    }

    /**
     * A sampler implementation that excludes certain health check spans from being sampled.
     * Note: I ignore all spans without a name, reason being that the actuator health checks apparently have no names.
     */
    static class HealthCheckExclusionSampler implements Sampler {

        @Override
        public SamplingResult shouldSample(Context parentContext, String traceId, String name, SpanKind spanKind, Attributes attributes, List<LinkData> parentLinks) {
            if (name.contains("/actuator/health") || name.contains("grpc.health.v1.Health/Check") || name.contains("<unspecified span name>")) {
                return SamplingResult.drop();
            }
            return SamplingResult.recordAndSample();
        }

        @Override
        public String getDescription() {
            return "HealthCheckExclusionSampler";
        }
    }

}

scprek commented 10 months ago

We have some services using Micronaut and it has a simple way of excluding HTTP routes https://micronaut-projects.github.io/micronaut-tracing/latest/guide/#http but we have other tech stacks too to deal with.

Also noticed those links in the contrib project were broken. I think this is the new place:

https://github.com/open-telemetry/opentelemetry-java-contrib/tree/main/samplers/src/main/java/io/opentelemetry/contrib/sampler

azunna1 commented 8 months ago

If you're using a central collector to send traces to your backends, you can filter out those spans - https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/filterprocessor/README.md

trask commented 8 months ago

linking related: https://github.com/open-telemetry/oteps/pull/240

shameekagarwal commented 8 months ago

my architecture - app -> otel agent (auto instrumentation) -> otel collector -> observability backend

for now, i wanted to filter out swagger and health check calls (caused due to k8s readiness probe)

solution 1 - filter processor. issue - resulted in orphaned spans
solution 2 - tail sampling processor. till now, looks like it works

nedcerneckis commented 6 months ago

Is there any update on this?

I understand the concerns raised by @iNikem about the specification being hard to change but this feature has been requested by numerous people for quite some time now.

I think this is a very important and much-needed feature for a lot of users of OpenTelemetry.

@trask Has anyone tried to contribute to the project with this feature? I would like to take a go at implementing it for the Otel Java agent.

trask commented 6 months ago

hi @nedcerneckis!

I understand the concerns raised by @iNikem about the specification being hard to change but this feature has been requested by numerous people for quite some time now.

I don't believe we're blocked by specification work, since we can add this initially as an opt-in experimental feature (bypassing the need for a specification)

Has anyone tried to contribute to the project with this feature? I would like to take a go at implementing it for the Otel Java agent.

not yet, that would be great, check out https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/1060#issuecomment-1494948191 for very high-level sketch that could fit well with existing feature set

kenfinnigan commented 6 months ago

For Lumigo's distribution I implemented a SamplerCustomizer for use with AutoConfigure, see here.

My approach was to allow URLs to be filtered for both client and server spans, with separate env vars to filter URLs for only client or only server spans. The URLs are defined as an array of regexes, such as [".*/health.*", ".*/actuator.*"]
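The matching side of such a filter can be sketched in a few lines of plain Java. This is not Lumigo's implementation: `UrlFilter`, its `Kind` enum, and the three pattern lists are hypothetical names chosen for illustration, and how the lists are populated from env vars is left out.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical URL filter with shared, client-only, and server-only regex lists. */
final class UrlFilter {
    enum Kind { CLIENT, SERVER }

    private final List<Pattern> both;
    private final List<Pattern> clientOnly;
    private final List<Pattern> serverOnly;

    UrlFilter(List<String> both, List<String> clientOnly, List<String> serverOnly) {
        this.both = compile(both);
        this.clientOnly = compile(clientOnly);
        this.serverOnly = compile(serverOnly);
    }

    private static List<Pattern> compile(List<String> regexes) {
        return regexes.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    /** True when the URL fully matches a shared pattern or one specific to the span kind. */
    boolean shouldDrop(Kind kind, String url) {
        if (url == null) {
            return false;
        }
        List<Pattern> specific = (kind == Kind.CLIENT) ? clientOnly : serverOnly;
        return matchesAny(both, url) || matchesAny(specific, url);
    }

    private static boolean matchesAny(List<Pattern> patterns, String url) {
        return patterns.stream().anyMatch(p -> p.matcher(url).matches());
    }
}
```

The sketch uses `Matcher.matches()`, which requires the whole URL to match. That is why the example patterns above carry a leading and trailing `.*`.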

nedcerneckis commented 6 months ago

Thank you very much @trask! Much appreciated for the info.

I'm assuming this needs to be a little more advanced than just excluding certain HTTP endpoints and include other types of rules inside this YAML config file?