open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Support deciding shouldSample at the end of span processing #11922

Open hmleo opened 3 months ago

hmleo commented 3 months ago

Is your feature request related to a problem? Please describe.

In the current head sampling mechanism, the samplingResult is determined at Instrumenter.start(), based on inputs such as the parent or traceIdRatio.

After the samplingResult has been created, the instrumented target method may still throw an exception, and exceptions are usually the information we care about most. We want to be able to identify such spans of interest while the span is being processed and then decide whether to sample them.

Of course, we could also meet this need with tail sampling, but that requires 100% sampling at the head, which increases resource consumption to some extent.

So, could deciding whether to sample at the end of span processing be supported?

Describe the solution you'd like

Support deciding whether to sample at the end of span processing.

Describe alternatives you've considered

No response

Additional context

No response

steverao commented 3 months ago

It sounds reasonable! Besides some traces that contain errors or exceptions, it's valuable for users to collect some slow traces. @open-telemetry/java-instrumentation-approvers WDYT?

laurit commented 3 months ago

I'm not completely sure what the ask here is. @steverao if you understood feel free to elaborate.

steverao commented 3 months ago

From my understanding, the supported samplers cannot collect all spans with exceptions or errors, according to https://opentelemetry.io/docs/languages/java/sampling/

"there may be an invocation exception in instrument target method, and exception is usually the information we focus on. We want to be able to find this kind of focused span during the span process and then decide whether to sample it or not."

But he seems to want to sample a span according to its execution result. For example, when a request fails, we should sample its corresponding span and ignore its initial sampling decision. Is my description correct, @hmleo?

hmleo commented 3 months ago

Correct. With 100% sampling, most spans are not important. So if the sampling decision could be made during the span lifecycle (such as span creation / setting attributes / end), I think it would be great. Thanks for your consideration, @steverao.

laurit commented 3 months ago

@hmleo Do I understand correctly that your ask is for the sampler to support tail sampling so that you could sample based on whether there were any errors?

hmleo commented 3 months ago

@laurit Hmm, in my understanding tail sampling is processed in the collector. I want to decide sampling on the agent side. Thanks for the reply!

laurit commented 3 months ago

Head sampling means that sampling is done when the span is started. This means that head sampling cannot use information that becomes available only after the operation completes, like whether the operation was successful or not. Tail sampling means that sampling is done after the trace is complete. Tail sampling can use all the information that is in the trace, like whether there were errors or not. The agent uses the OpenTelemetry Java SDK, which implements head sampling as described in the OpenTelemetry specification. Changes to how sampling works may need to be decided in the specification. Tail sampling inside the agent would be problematic because the agent knows only the part of the trace that it created. It does not know about the spans created by the services it calls, so it would not be able to sample those.
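
To make the distinction concrete, here is a minimal Java sketch. It does not use the OpenTelemetry SDK: `SamplingSketch`, `FinishedSpan`, `headDecision`, and `endDecision` are hypothetical names introduced only for illustration, and the ratio check is a simplified stand-in for a trace-ID-ratio sampler. It shows that a decision made at span start sees only the trace ID, while a decision made at span end can also see whether the operation failed.

```java
// Minimal sketch, NOT the OpenTelemetry API: all types and names here are hypothetical.
class SamplingSketch {

    /** Simplified stand-in for a finished span. */
    static final class FinishedSpan {
        final long traceIdLow; // low 64 bits of the trace ID
        final boolean error;   // known only after the operation completes

        FinishedSpan(long traceIdLow, boolean error) {
            this.traceIdLow = traceIdLow;
            this.error = error;
        }
    }

    /** Head decision: made at span start, so only the trace ID is available. */
    static boolean headDecision(long traceIdLow, double ratio) {
        if (ratio <= 0.0) {
            return false;
        }
        if (ratio >= 1.0) {
            return true;
        }
        // Keep the span when the non-negative trace ID bits fall below the threshold.
        long threshold = (long) (ratio * Long.MAX_VALUE);
        return (traceIdLow & Long.MAX_VALUE) < threshold;
    }

    /** End decision: made at span end, so a failure can override the head decision. */
    static boolean endDecision(FinishedSpan span, double ratio) {
        return span.error || headDecision(span.traceIdLow, ratio);
    }

    public static void main(String[] args) {
        FinishedSpan failed = new FinishedSpan(42L, true);
        FinishedSpan ok = new FinishedSpan(42L, false);
        double ratio = 0.0; // the head sampler drops everything

        System.out.println("head, failed span: " + headDecision(failed.traceIdLow, ratio));
        System.out.println("end,  failed span: " + endDecision(failed, ratio));
        System.out.println("end,  ok span:     " + endDecision(ok, ratio));
    }
}
```

With a ratio of 0.0 the head decision drops the failed span, while the end decision keeps it. The sketch only covers the spans of a single process, which is the limitation described above.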

hmleo commented 3 months ago

Thanks, I understand what you mean. There are two scenarios:

1. Tail sampling gets 100% of the spans in a trace. But if we want to sample all error spans this way, it requires 100% sampling at the head.
2. Support deciding sampling during the span lifecycle in the agent. This does not require 100% sampling. It may lose some other services' spans from the whole trace, so we may get only 50% of the spans in a trace, but that 50% is the important part. I think that is acceptable.

Yes, as you say, this is decided in the OpenTelemetry specification. I may submit an issue in that project.
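
The trade-off in the second scenario can be sketched as follows. This is an illustrative Java example with hypothetical types (`PartialTraceSketch`, `Span`, `keepFailedSpans`, and the service names are all made up, and nothing here is part of the OpenTelemetry SDK). It assumes every service decides locally at span end and keeps only its failed spans, so the exported trace is incomplete.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch with hypothetical types; it does not use the OpenTelemetry SDK.
class PartialTraceSketch {

    /** Simplified span: which service produced it and whether it failed. */
    static final class Span {
        final String service;
        final String name;
        final boolean error;

        Span(String service, String name, boolean error) {
            this.service = service;
            this.name = name;
            this.error = error;
        }
    }

    /** Each service decides locally at span end and keeps only its failed spans. */
    static List<Span> keepFailedSpans(List<Span> trace) {
        List<Span> kept = new ArrayList<>();
        for (Span span : trace) {
            if (span.error) {
                kept.add(span);
            }
        }
        return kept;
    }

    /** One made-up trace that crosses four services; two of its spans fail. */
    static List<Span> exampleTrace() {
        List<Span> trace = new ArrayList<>();
        trace.add(new Span("gateway", "GET /order", true));
        trace.add(new Span("order-service", "createOrder", true));
        trace.add(new Span("inventory", "reserve", false));
        trace.add(new Span("database", "SELECT", false));
        return trace;
    }

    public static void main(String[] args) {
        List<Span> trace = exampleTrace();
        List<Span> kept = keepFailedSpans(trace);
        System.out.println("kept " + kept.size() + " of " + trace.size() + " spans");
        for (Span span : kept) {
            System.out.println(span.service + ": " + span.name);
        }
    }
}
```

Here two of the four spans survive. The successful spans from the other services are gone, so the trace no longer shows the full call path around the failure.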