jmacd commented 3 years ago

This OTEP specifies how to propagate head trace sampling probability, so that child spans can be counted on the fly. This specification would allow ParentBased samplers to output sampling probabilities in a way that supports span-to-metrics pipelines, for example.

This OTEP is paired with #170 which discusses additions to the span data model.

oertl commented 3 years ago

@jmacd

This makes sense. It makes me wonder, though, if we have a way to generate the correct randomness, what's really keeping us from using random bits in TraceID? If we have random bits in the TraceID, the number-of-leading-zeros test can be applied directly, right?

Yes, if the trace ID is specified to be truly random, it would be possible to use it without hashing.

...and I gather that the answer is -- it requires less randomness with your proposal. In the version where we add a single byte to convey randomness, the expected number of random bits required is 2 in this case (for the reader, I believe we're talking about the mean of a https://en.wikipedia.org/wiki/Geometric_distribution with p=0.5). Can you confirm?

Yes, the given pseudocode generates a geometrically distributed random number with p=0.5 and takes just 2 random bits on average.

What I was trying to say is that all samplers (AlwaysOn, AlwaysOff, Parent, and TraceIDRatio) can be implemented using the same logic as the TraceIDRatio sampler, provided that the sampling rate of the parent is propagated. They only differ how the sampling rate is chosen for each span. For AlwaysOn it is 1, for AlwaysOff it is 0, for Parent it is the parent sampling rate, and for TraceIDRatio it is the fixed sampling rate. One could therefore say that a sampler should only be responsible for the selection of the sampling rate. The sampling decision is then always made according to the trace ID ratio logic.

To better reflect that in the interface, I would even redefine the sampler interface which is currently a function mapping to a boolean indicating the sampling decision. If it is defined as a function that maps each individual span to the corresponding sampling rate, the sampler implementations would be very simple. For example, TraceIDRatio would just return its fixed sampling rate and Parent would return the sampling rate of its parent. The final sampling decision would be made outside of the sampler following a specified and fixed logic based on the shared random number (or trace ID) and the sampling rate only.

The proposed redefined sampler interface would have following benefits:

Any sampling decision will be consistent, because the same logic is used to get a sampling decision based on the sampling rate and the shared random number (or trace ID).
The sampling rate and therefore the adjusted count is known for all spans as the interface enforces returning a sampling rate for every span.
For estimation and further processing it is sufficient to know the sampling rate. It is not necessary to know which sampler was used to determine the sampling rate. What matters is just the sampling rate. This makes recording the sampler name (which would introduce additional overhead and which was mentioned here https://github.com/open-telemetry/oteps/pull/148#issuecomment-885323810) obsolete.
Furthermore, new samplers which, for example, choose sampling rates more dynamically than TraceIDRatio based on span attributes, are easy to implement and would not break estimation algorithms.

jmacd commented 3 years ago

@oertl Thank you. I definitely understand your proposal. You've described a simplification in the Sampler interface, which is an SDK specification topic that I was hoping to avoid. I agree with your summary, the overall Sampler outcome would be easier to reason about with the change you suggest, but I am not sure that legacy users of the API will agree. The amount of energy it will take to make this happen may exceed the benefit produced.

I am trying to keep these OTEPs separate:

170 is about the data model for Span (formerly known as #148)

168 is about uses of the W3C trace context (this PR)

Changing the Sampler API would be another issue entirely.

I will revise this PR based on the discussion in https://github.com/open-telemetry/oteps/pull/168#discussion_r677859626, which means combining the two pieces of additional information that go into the traceparent to both include the head probability and the randomly distributed value, as they are both apparently needed to complete a functional span-to-metrics pipeline.

yurishkuro commented 3 years ago

Personally I would prefer to separate the propagation proposal from the W3C format discussion. Whatever happens with this OTEP is not going to have any (formal) impact on changing the W3C spec, the discussion would need to happen there. On the other hand, the actual functionality is not dependent on changing the W3C format because it can be achieved by using tracestate, which would be completely in the hands of OTEL & if adopted & successful would serve as a strong motivation for upgrading the traceparent spec.

jmacd commented 3 years ago

@oertl and @yurishkuro

This makes recording the sampler name (which would introduce additional overhead and which was mentioned here -- #148) obsolete.

In my proposal, in the text of #170 (now), the sampler.name attribute is only a MUST when the adjusted count is unknown, which would only result from legacy trace contexts (if we succeed). I see no reason to record the sampler.name otherwise, especially if as you suggest the TraceIDRatio decision is used consistently.

jmacd commented 3 years ago

@yurishkuro Thank you for your guidance. I have revised this proposal only to use tracestate.

@oertl I have revised this to include your proposal with a syntax like:

tracestate: otelprob=PPRR

where PP is the base16 probability value and RR is the base16 randomness value. Please take another look.

jmacd commented 3 years ago

I have prototyped changes described in this OTEP here. I was able to make this change without modifying the OTel-Go Sampler API.

The prototype raises several issues around specifying how to handle errors, and I think we can specify the behavior of errors in such a way that consumers get an unambiguous signal about sampling.

The greatest question the prototype raises for me is how to handle zero probability. In the prototype, I've placed a limit on the range of the geometric random variable with (as in @oertl's paper) the value 0 representing a span that meets every sampling threshold, 1 representing a span that is sampled 50% of the time, and so on. I believe we should limit this value to a reasonable number--I used 64 as the maximum value of this number and I think it best if we let this maximum value represent zero probability. This makes 63 the greatest value, which corresponds with an adjusted count of 2^63. The question this raises for me is how to handle the boundary condition--in practice it probably does not matter, but how should a Sampler behave if it sees a string of 64 leading zeros? (Should it reject the trial and start again? I think this is correct.)

To summarize: random value 64 represents a probability of zero, not of 2^-64. @oertl what do you think?

oertl commented 3 years ago

@jmacd I would limit the smallest sampling with 1/2^62 as 2^62 is the largest power of two that can be represented with a signed 64-bit integer (important to represent the adjusted count in languages like Java which only have signed data types). As a consequence, the exponent of the sampling rate X could be encoded with just 6 bits. The value X=63 is still free and can be used to represent a sampling rate of zero.

The number of leading zeros of a 64bit random number gives values from 0,1,2,...,64. I would trim to values 0,1,2,...,62 such that 6 bits are also sufficient to encode the propagated random number R. Here are the possible values with corresponding probabilities

R=0 occurs with 1/2^1 probability and all spans are sampled with sampling rate >= 1/2^0
R=1 occurs with 1/2^2 probability and all spans are sampled with sampling rate >= 1/2^1
...
R=61 occurs with 1/2^62 probability and all spans are sampled with sampling rate >= 1/2^61
R=62 occurs with 1/2^62 probability and all spans are sampled with sampling rate >= 1/2^62

Hence, a span is sampled if X <= R. If X=63 sampling will never occur, because the maximum value of R is 62.

jmacd commented 3 years ago

See https://github.com/w3c/trace-context/issues/463

jmacd commented 3 years ago

I've updated this OTEP after all the feedback, mentioning the prototype and the discussion in this week's W3C meeting.

jmacd commented 3 years ago

While reviewing the Span protocol for OTEP 170 I realized that Span encodes its tracestate

For the proposed otel vendor key within tracestate here, I would specify to DROP that entry (or at least the r: and p: sub-fields) before serializing a Span as those values are not directly useful without knowing which Sampler was used.

jmacd commented 3 years ago

I have updated this PR to support a minimum sampling probability of 2^-61. This allows the Span field described in #170 to use six bits of information, not 7, to encode the additional Unknown probability value.

jmacd commented 3 years ago

Discussed in the Sampling SIG:

Do we want to make propagation of r and p values always on? (This costs ~30 bytes per context, in this proposal)
Can we make this opt-in, or opt-out?
Can we look forward to a future where traceparent version 1 supports this with ~3 bytes per context.

jmacd commented 3 years ago

I revised this OTEP with more content and more examples, following discussion that took place in https://github.com/open-telemetry/opentelemetry-specification/pull/1899.

No significant changes have been made, only correcting off-by-one errors now that the p value of 0 indicates Unknown probability.

I added specification for several corner cases, suggesting that p can be omitted when not sampled and that p should not be respected when r is not set, for example.

The final version of this text uses:

s where sampling probability is 2**s p where 0 is unknown, 63 is zero, and 1-62 are known non-zeros r in the range [0, 61]

and the sampled flag equivalent to p-1 < r+1 && p < 63. I think this definition makes the formulas most readable, since s has a natural form, p is the same as for the Span's log_head_adjusted_count (from OTEP 170), and r equals the number of leading zeros in a random bit string.

Please take a careful look @oertl, thanks!

jmacd commented 3 years ago

@open-telemetry/specs-trace-approvers please take a look. The feedback has been addressed and the corner cases are explicitly spelled out. Thank you @oertl and @reyang.

jmacd commented 3 years ago

@open-telemetry/specs-trace-approvers

This OTEP has received a lot of review and we are down to a number of questions that I feel should be resolved as we move this work into the specification--this PR and #170 could merge. The remaining questions:

Should the Sampler described in OTEPs #168 and #170 replace the currently underspecified TraceIDRatioBased sampler, or
Should the old TraceIDRatioBased sampler be deprecated or possibly kept under a new name
Should the propagation implied by OTEP #168 be opt-out or opt-in. If this is an option, what is the option named?

If you agree that these are the only remaining questions, please review and approve both OTEPs and we can move on to debating these topics.

jmacd commented 3 years ago

@open-telemetry/specs-trace-approvers as announced in the Monday maintainer's meeting, this will proceed next week unless objections are raised here. There was one lengthy review comment thread (https://github.com/open-telemetry/oteps/pull/168#discussion_r704955698) that I resolved by opening a PR to revise OTEP 170 with more detail on how to compose Samplers, please see #175.

bogdandrutu commented 3 years ago

@jmacd please fix the check-errors.

jmacd commented 3 years ago

One big question that I don't get the answer from this OTEP is if this is a required feature.

Discussed in "Default behavior". I'd prefer this were on-by-default, but with the present tracestate solution it's expensive. Maybe after there's a W3C traceparent, where this costs 1-3 bytes per context, this can be on by default. I'll accept off-by-default for now-- Lightstep's OTel launchers would turn this on by default. I imagine "yet another" environment variable to say whether the TraceIDRatioBased sampler generates r or not.

bogdandrutu commented 3 years ago

@jmacd where should we discuss/agree if this is or not the default behavior?

carlosalberto commented 3 years ago

@bogdandrutu Just FYI there's a SIG call on 8 minutes, although don't know who package are you today ;)

jmacd commented 3 years ago

default behaviors will be discussed in specs

Yes. FWIW, I've already changed this OTEP to state that the default will be opt-in. We're just going to debate the name and form of the option.

open-telemetry / oteps

Specify how to propagate consistent head sampling probability #168

170 is about the data model for Span (formerly known as #148)

168 is about uses of the W3C trace context (this PR)