open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0
3.73k stars 888 forks source link

Clarification on how processors handle sampling decisions. #3779

Open HaloFour opened 10 months ago

HaloFour commented 10 months ago

What are you trying to achieve?

I would like to clarify the following statement in the Trace SDK specification:

Span Exporters MUST receive those spans which have Sampled flag set to true and they SHOULD NOT receive the ones that do not.

Particularly around the conversation on this PR here: https://github.com/open-telemetry/opentelemetry-specification/pull/871#discussion_r476539875

The implication is that it's not forbidden to have a mechanism, even for the built-in processors, to allow unsampled spans to be forwarded to an exporter, as long as it's explicitly an opt-in mechanism. I would like to request such a mechanism be added to the BatchSpanProcessor in the Java SDK support a mechanism to allow unsampled spans to be batched and forwarded to the exporter.

Additional context.

In our case we have a number of services operated by different teams with their own observability stacks. We would like to have the ability to export a substantially higher percentage of spans to our stack without imposing that sampling percentage to our upstream dependencies.

tigrannajaryan commented 10 months ago

From what I understand this asks for the spec to specify that the SDK is allowed to have an particular opti-in capability (to export non-sampled spans). I don't think we want to have that type of language in the spec. Extra opti-in functionality does not necessarily have to be explicitly permitted in the spec. If it is not prohibited in the spec then it is permitted implicitly.

@jmacd had thoughts about making the 4th type of decision (non-sampled/exported) a feature. If that happens in the future then it certainly will become part of the spec as a requirement.

jmacd commented 10 months ago

I believe this is a duplicate of https://github.com/open-telemetry/opentelemetry-specification/issues/2986.

Another open issue, https://github.com/open-telemetry/opentelemetry-specification/issues/2918, discusses this topic. See https://github.com/open-telemetry/opentelemetry-specification/issues/2918#issuecomment-1329472624.

Since I see this ultimately as a sampling-related problem, you are invited to join the weekly Sampling SIG, which is Thursdays at 4PM UTC.

HaloFour commented 10 months ago

I disagree that this is a duplicate. This isn't about changing or expanding upon sampling decisions, it's about allowing a processor to explicitly opt-in to ignoring the sampling decision, or derive it's own.

The spec, as is, appears to afford SDK implementations the wiggle room to do that already, but the Java SDK folks feel uncertain about adding such an option to BatchSpanProcessor given the current verbiage.

jmacd commented 10 months ago

@HaloFour Thank you. @tigrannajaryan tried to address your request about opt-in behaviors in the default SDKs, which is not explicitly disallowed.

2986 appears to be a solution-agnostic request for the same functionality, pointing out that we lack a way to export unsampled spans, so duplicate in nature if not phrased exactly the same.

However, I'm coming around to seeing the logic in your request.

Span Exporters MUST receive those spans which have Sampled flag set to true and they SHOULD NOT receive the ones that do not.

This specification is somewhat misplaced. The library guidelines indicate that exporters should do protocol, not logic. It should be the processor's decision to export or not export, and the exporter should not question it.

The batch span processor -- it SHOULD export spans that are sampled and it SHOULD NOT export spans that are not sampled. However, these are "SHOULD" statements and then, I think, you will have more freedom to experiment with opt-in processor behavior. Does that sound good?

HaloFour commented 10 months ago

/cc @jack-berg @jkwatson

Would that change allay concerns about allowing the SDK span processors support an opt-in mechanism to export un-sampled spans?

austinlparker commented 5 months ago

@jmacd please add this to the sampling project board when ready

trask commented 6 days ago

@HaloFour is this issue the same as (the Java implementation) https://github.com/open-telemetry/opentelemetry-java/pull/6057?

HaloFour commented 6 days ago

@trask Yes, they are related. I believe I opened it to clarify whether the spec did permit the SDK to offer that opt-in for processors to export non-sampled spans.