open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

Decision on new encoding for sampling "selectivity" #3602

Closed jmacd closed 4 months ago

jmacd commented 1 year ago

Summary

The Sampling SIG has been working on a proposal to follow the W3C tracecontext group, which has added a flag to convey definite information about randomness in the TraceID.

In particular, we aim to address the TODO about the TraceIdRatioBased Sampler:

TODO: Add details about how the TraceIdRatioBased is implemented as a function of the TraceID. #1413

We are looking for community input on a choice which will impact implementation complexity for OpenTelemetry Samplers as well as human-interpretability of the raw data.

Note this proposal was co-authored by @kentquirk @oertl @PeterF778 and @jmacd.

ACTION ITEM: Please review and vote for your preferred encoding strategy in the comments below. OpenTelemetry Tracing SDK authors as well as OpenTelemetry Collector trace processors will be asked to implement the encoding and decoding strategies here in order to communicate about sampling selectivity, and we need your input!

Background

We propose to add information to the W3C Trace Context specification to allow consistent sampling decisions to be made across the entire lifetime of a trace.

The expectation is that trace IDs should contain at least 56 bits of randomness in a known portion of the ID. This value is known as r, and there is a bit in the trace header that indicates its presence.

In probabilistic sampling, the sampling decision is a binary choice to keep (store) or drop (discard) a trace. Because traces are composed of multiple spans, we want to be sure that the same decision is made for all elements in the trace. Therefore, we don’t make a truly random decision at each stage. We instead wish to use the randomness embedded in the trace ID so that all stages can make consistent decisions.

In order to make consistent decisions, we need to propagate not only the randomness (the r value), but also the sampling selectivity used. In other words, in a trace that travels between services A, B, and C, any decision made by B should use the same information as a decision made by A, and B could potentially modify the selectivity so that C could also make an appropriate decision.

As noted, the r value expresses a 56-bit random value that can be used as the source of randomness for a probabilistic sampling decision. The intent of this proposal is to express the sampling selectivity that was used to make the decision, and to do it in the trace state.
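
As a rough illustration of the consistent decision described above, the sketch below extracts r from a trace ID and compares it with a threshold. It assumes that the 56 random bits are the rightmost 7 bytes of the 16-byte trace ID and that a span is kept when r is below the threshold. Both are assumptions made for this example, and the function names are hypothetical.

```python
T_MAX = 1 << 56  # number of distinct 56-bit random values


def randomness_from_trace_id(trace_id_hex: str) -> int:
    """Return r, assuming it is stored in the last 14 hex digits of the trace ID."""
    if len(trace_id_hex) != 32:
        raise ValueError("expected a 32-hex-digit trace ID")
    return int(trace_id_hex[-14:], 16)


def should_keep(trace_id_hex: str, threshold: int) -> bool:
    """Keep the span when r < threshold (assumed comparison direction)."""
    if not 0 <= threshold <= T_MAX:
        raise ValueError("threshold must be in [0, 2^56]")
    return randomness_from_trace_id(trace_id_hex) < threshold


trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
keep_all = should_keep(trace_id, T_MAX)        # threshold 2^56 keeps everything
keep_half = should_keep(trace_id, T_MAX // 2)  # r of this ID is above 2^55
```

Because every service derives r from the same trace ID, any two services that use the same threshold reach the same decision for all spans of a trace.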

Sampling selectivity can be described in several different ways:

Minimum requirements

  1. Given the sampling information in the trace state, it MUST be specified, for every possible representation on every platform, how that information translates to the applied sampling threshold (the value that was compared against the random bits). Only then can the sampling decision be reproduced from the threshold and the 56 random bits, which gives 100% consistency.

  2. From that mapping, it can be derived which of the 2^56+1 sampling thresholds that are meaningful with 56 random bits can be expressed by the sampling information in the trace state. The proposals should therefore be clear about which thresholds are actually supported. The set of supported thresholds also defines the set of possible sampling probabilities: the sampling probability is the threshold multiplied by 2^(-56).

  3. For each supported threshold, there should be a lossless way to map it to the sampling information that is written to the trace state. Lossless means that the reverse mapping described in 1. yields exactly the chosen threshold again. The mapping from thresholds to the sampling information is important for adaptive sampling, where the threshold is chosen automatically.

Objective

We would like to express this sampling probability/rate/threshold in a reasonably compact way in the trace state. The expression should be easy and efficient to generate and parse in any programming language. The notation should also be able to describe cases of non-probabilistic sampling (corresponding to the zero adjusted count or the old p=63 case). We have been referring to this value as t.

The sampling SIG has been discussing this issue for some time, and we have examined several proposals. Each proposal has its strengths and weaknesses and we are looking for input from the larger community.

Note that this document includes just a summary of the proposals, below; all of them have been specified in sufficient detail to resolve most implementation issues. We are hoping for input from the community to help make the big decision about the implementation direction.

Request for input

The major difference in these proposals that we wish to seek input on is whether it is more important to optimize for threshold calculation (option A) at the expense of human readability, or whether to choose one of the other options which are readable and accessible, but make threshold calculations harder to work with.

List of options

When we refer to Tmax, we mean 2^56 (0x100000000000000 or 72057594037927936)

Option A: Hex Threshold

Title: A: Hex Threshold
Description: The t value is expressed as a threshold value in the range [0, 2^56-1] using 14 hexadecimal digits. An absent t-value represents a threshold of 2^56, corresponding to 100% sampling probability.
Examples:
  Keep all: t is omitted
  Keep 1 in 10: t=19999999999999
  Keep 1 in 8: t=20000000000000
  Keep half: t=80000000000000
  Keep 2/3: t=aaaaaaaaaaaaaa
  Keep 1 in 1000: t=004189374bc6a7
Mapping to the sampling threshold: threshold = parseHexInt(t). If t is absent, the threshold is 2^56.
Supported thresholds: All 2^56 + 1 thresholds that are meaningful when using 56 random bits. Possible thresholds are {0, 1, 2, 3, …, 2^56-1, 2^56}.
Mapping a supported sampling threshold to t: t = encodeHexInt(threshold). If the threshold is 2^56, corresponding to 100% sampling probability, the t-value is not set.
Advantages: Both t and r can be parsed as hex strings using the same parsing function and compared directly with no further processing. Simplest approach satisfying the minimum requirements above. The t-value can be compared by humans to the trace ID to understand the sampling decision. No floating-point operations or complex parsing needed.
Disadvantages: The t-value cannot be directly read by humans as a sampling probability.
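
The two mappings of option A are small enough to sketch directly. The function names below are hypothetical, and `None` stands in for an absent t-value, which is a convention chosen for this example only.

```python
from typing import Optional

T_MAX = 1 << 56  # 2^56, the threshold for 100% sampling


def encode_t(threshold: int) -> Optional[str]:
    """Map a threshold to 14 hex digits; 2^56 is represented by omitting t."""
    if threshold == T_MAX:
        return None
    if not 0 <= threshold < T_MAX:
        raise ValueError("threshold must be in [0, 2^56]")
    return format(threshold, "014x")


def decode_t(t: Optional[str]) -> int:
    """Map a t-value back to its threshold; an absent t means 2^56."""
    if t is None:
        return T_MAX
    if len(t) != 14:
        raise ValueError("expected exactly 14 hex digits")
    return int(t, 16)


one_in_ten = encode_t(T_MAX // 10)  # "19999999999999"
one_in_eight = encode_t(T_MAX // 8)  # "20000000000000"
```

Since both directions are plain integer conversions, every one of the 2^56 + 1 thresholds round-trips exactly.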

Option A1: Hex Threshold with omission of trailing zeros

Title: A1: Hex Threshold (omission of trailing zeros allowed)
Description: The t value is expressed as a threshold value in the range [0, 2^56-1] using 14 hexadecimal digits. An absent t-value represents a threshold of 2^56, corresponding to 100% sampling probability. Trailing zeros may be omitted for brevity.
Examples:
  Keep all: t is omitted
  Keep 1 in 10: t=19999999999999
  Keep 1 in 8: t=2
  Keep half: t=8
  Keep 2/3: t=aaaaaaaaaaaaaa
  Keep 1 in 1000: t=004189374bc6a7
Mapping to the sampling threshold: threshold = parseHexInt(t). If t is absent, the threshold is 2^56. If t has fewer than 14 hex digits, it is padded on the right with zeros before parsing.
Supported thresholds: All 2^56 + 1 thresholds that are meaningful when using 56 random bits.
Mapping a supported sampling threshold to t: t = encodeHexInt(threshold). If the threshold is 2^56, corresponding to 100% sampling probability, the t-value is not set. Trailing zeros may be omitted.
Advantages: Both t and r can be parsed as hex strings and compared directly with no further processing. Simplest approach satisfying the minimum requirements above. The t-value can be compared by humans to the trace ID to understand the sampling decision. No floating-point operations or complex parsing needed. Compact representation for certain probabilities, especially those that are powers of two.
Disadvantages: The t-value cannot be directly read by humans as a sampling probability. Compact power-of-2 representations are almost misleading. Other common sample rates are not expressed in compact or obvious ways.
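
For A1, the only change relative to A is stripping trailing zeros on encode and padding on the right on decode. The sketch below uses hypothetical function names and again uses `None` for an absent t-value; keeping a single "0" for a threshold of zero is an assumption of this example.

```python
from typing import Optional

T_MAX = 1 << 56  # 2^56, the threshold for 100% sampling


def encode_t_short(threshold: int) -> Optional[str]:
    """Encode as 14 hex digits, then drop trailing zeros."""
    if threshold == T_MAX:
        return None
    if not 0 <= threshold < T_MAX:
        raise ValueError("threshold must be in [0, 2^56]")
    digits = format(threshold, "014x").rstrip("0")
    return digits or "0"  # keep one digit for a threshold of zero


def decode_t_short(t: Optional[str]) -> int:
    """Pad with zeros on the right to 14 digits, then parse as hex."""
    if t is None:
        return T_MAX
    if not 1 <= len(t) <= 14:
        raise ValueError("expected 1 to 14 hex digits")
    return int(t.ljust(14, "0"), 16)


keep_half = encode_t_short(T_MAX // 2)  # "8"
one_in_eight = encode_t_short(T_MAX // 8)  # "2"
```

Leading zeros must be kept, because they carry magnitude: "004189374bc6a7" and "4189374bc6a7" decode to different thresholds.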

Option B: Integer Sampling Rate

Title: B: Integer Sampling Rate
Description: The t value is expressed as a positive integer representing the ratio of the total number of items to the number of items kept.
Examples:
  Keep all: t=1
  Keep 1 in 8: t=8
  Keep 1 in 10: t=10
  Keep half: t=2
  Keep 2/3: not expressible in this format
  Keep 1 in 1000: t=1000
  Keep none: not expressible in this format
Mapping to the sampling threshold: threshold = round(2^56 / parseDecimalInt(t))
Supported thresholds: 2^56, 2^56/2, 2^56/3, 2^56/4, …, corresponding to sampling probabilities 1, ½, ⅓, ¼, …
Mapping a supported sampling threshold to t: t = encodeDecimalInt(round(2^56 / threshold)). This reverse mapping can be lossless for integers up to 268M.
Advantages: Easy format for most common rates. Adjusted counts (extrapolation factors) are guaranteed to be integers.
Disadvantages: There are many values it cannot express, in particular any rate that keeps more than half but less than all of the data. Mappings require floating-point divisions.
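
A minimal sketch of the option B mappings follows. It swaps the floating-point division for integer arithmetic with round-half-up, which is a choice made for this example rather than something the proposal prescribes; the function names are hypothetical.

```python
T_MAX = 1 << 56  # 2^56


def threshold_from_rate(t: str) -> int:
    """threshold = round(2^56 / n), computed as floor((2 * 2^56 + n) / (2 * n))."""
    n = int(t)
    if n < 1:
        raise ValueError("sampling rate must be a positive integer")
    return (2 * T_MAX + n) // (2 * n)


def rate_from_threshold(threshold: int) -> str:
    """t = round(2^56 / threshold), using the same integer rounding."""
    if not 0 < threshold <= T_MAX:
        raise ValueError("threshold must be in (0, 2^56]")
    return str((2 * T_MAX + threshold) // (2 * threshold))


one_in_ten = threshold_from_rate("10")  # 2^56 / 10 = 7205759403792793.6, rounded up
```

The rounding error in the threshold is at most one half, so for small rates the reverse mapping recovers the original integer.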

Option C: Sampling probability

Title: C: Sampling Probability
Description: The t value is expressed as a positive decimal floating point value between 0 and 1, representing the probability of keeping a given event.
Examples:
  Keep all: t=1
  Keep 1 in 8: t=.125
  Keep 1 in 10: t=.1
  Keep half: t=.5
  Keep 2/3 ieee precision: t=.6666666666667
  Keep 2/3 precision 4: t=.6667
  Keep 2/3 precision 2: t=.67
  Keep 1 in 1000: t=.001
Mapping to the sampling threshold: threshold = 2^56 * parseDecimalFloat(t). Note that rounding is performed in parseDecimalFloat.
Supported thresholds: If t is a double-precision floating-point number, at least all thresholds whose 3 least significant bits are zero.
Mapping a supported sampling threshold to t: t = encodeDecimalFloat(threshold * 2^(-56)). This reverse mapping can be lossless for all supported thresholds when using enough decimal digits.
Advantages: Easy format for humans to read.
Disadvantages: Requires floating point math. Doesn’t express powers of 2 compactly.
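
The option C mappings can be sketched with Python's built-in float parsing. The function names are hypothetical. Truncating any fractional part of the product is an assumption of this example, and `repr` is used only because it emits enough decimal digits to round-trip a double.

```python
T_MAX = 1 << 56  # 2^56


def threshold_from_probability(t: str) -> int:
    """threshold = 2^56 * p, where parsing t rounds it to the nearest double."""
    p = float(t)
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return int(p * T_MAX)  # assumption: drop any fractional part


def probability_from_threshold(threshold: int) -> str:
    """Shortest decimal text that parses back to the same double."""
    if not 0 < threshold <= T_MAX:
        raise ValueError("threshold must be in (0, 2^56]")
    return repr(threshold / T_MAX)


one_in_ten = threshold_from_probability(".1")  # 0x1999999999999a, not 0x19999999999999
```

The example value shows the cost of going through a double: ".1" lands on a threshold that differs in its last hex digit from the one option A lists for 1 in 10.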

Option C1: Sampling probability with hex floating point

Title: C1: Sampling Probability with hex floating point
Description: The t value is expressed as a positive decimal or hexadecimal floating point value between 0 and 1, representing the probability of keeping a given event. The encoding MAY use C99 and IEEE-754-2008 specified hex floating point as an exact representation. Note about precision: IEEE-754 double-precision floating point numbers carry 52 bits of significand, 4 bits fewer than trace randomness. Below, “ieee precision” refers to 53 bits of precision.
Examples:
  Keep all: as in C
  Keep 1 in 8: as in C or t=0x1p-3
  Keep 1 in 10: as in C or 0x1.ap-4
  Keep half: as in C or t=0x1p-1
  Keep 2/3 ieee precision: as in C or 0x1.5555555555555p-1
  Keep 2/3 precision 4: as in C or 0x1.5555p-1
  Keep 2/3 precision 2: as in C or 0x1.55p-1
  Keep 1 in 1000 ieee precision: as in C or 0x1.0624dd2f1a9fcp-10
  Keep 1 in 1000 precision 4: as in C or 0x1.0625p-10
  Keep 1 in 1000 precision 2: as in C or 0x1.06p-10
Mapping to the sampling threshold: threshold = 2^56 * parseHexFloat(t). Note that no rounding is performed.
Supported thresholds: If t is a double-precision floating-point number, at least all thresholds whose 3 least significant bits are zero.
Mapping a supported sampling threshold to t: t = encodeHexFloatFromThreshold(threshold). This method is exact and lossless using built-in libraries up to 52 bits of precision, and exact and lossless at 56 bits of precision using custom code.
Advantages: Decimal format for humans to read (as in C), but the hex format permits exact and lossless encoding of arbitrary thresholds up to 56 bits.
Disadvantages: Users require custom code, or floating point math and a library that supports the hex representation of it. Since this was added in ISO-C99 (1999) and IEEE-754 (2008), it is relatively widespread.
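
Python exposes hex floating point through `float.fromhex` and `float.hex`, so the built-in-library path of C1 can be sketched as below. The function names are hypothetical, and truncating any fractional part of the scaled value is an assumption of this example.

```python
T_MAX = 1 << 56  # 2^56


def threshold_from_hex_float(t: str) -> int:
    """threshold = 2^56 * p; the hex text is read exactly when it fits a double."""
    p = float.fromhex(t)
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return int(p * T_MAX)  # assumption: drop any fractional part


def hex_float_from_threshold(threshold: int) -> str:
    """Normalized hex float text; exact only if the threshold fits a double."""
    if not 0 < threshold <= T_MAX:
        raise ValueError("threshold must be in (0, 2^56]")
    p = threshold / T_MAX
    if int(p * T_MAX) != threshold:
        raise ValueError("threshold needs more precision than a double carries")
    return p.hex()


one_in_eight = threshold_from_hex_float("0x1p-3")  # 2^53
```

Thresholds that use all 56 bits, such as 0xaaaaaaaaaaaaaa, are rejected by the encoder here, which is where the custom-code path mentioned above would be needed.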

Option C2: Sampling Probability with unnormalized hex floating point

Title: C2: Sampling Probability with unnormalized hex floating point
Description: In addition to C and C1, encoders are encouraged to use unnormalized hex floating point when they synthesize arbitrary probabilities, because it is lossless and can be easily read as a sampling threshold by humans.
Examples:
  Keep all: as in C1
  Keep 1 in 8: as in C1 or t=0x2p-04 (threshold = 0x20000000000000)
  Keep 1 in 10: as in C1 or t=0x1ap-8 (threshold = 0x1a000000000000)
  Keep half: as in C1 or t=0x8p-04 (threshold = 0x80000000000000)
  Keep 2/3 full precision: as in C1 or 0xaaaaaaaaaaaaaap-56
  Keep 2/3 ieee precision: as in C1 or 0xaaaaaaaaaaaaap-52
  Keep 2/3 precision 4: as in C1 or 0xaaabp-16
  Keep 2/3 precision 2: as in C1 or 0xabp-8
  Keep 1 in 1000 ieee precision: as in C1 or 0x4189374bc6a7p-56
  Keep 1 in 1000 precision 4: as in C1 or 0x4189p-24
  Keep 1 in 1000 precision 2: as in C1 or 0x42p-16
Mapping to the sampling threshold: threshold = 2^56 * parseHexFloat(t). Note that no rounding is performed.
Supported thresholds: If t is a double-precision floating-point number, at least all thresholds whose 3 least significant bits are zero.
Mapping a supported sampling threshold to t: t = encodeHexFloatFromThreshold(threshold). This method is exact and lossless using built-in libraries up to 52 bits of precision, and exact and lossless at 56 bits of precision using custom code.
Advantages: Decimal format for humans to read (as in C), but the hex format permits exact and lossless encoding of arbitrary thresholds up to 56 bits.
Disadvantages: Users require custom code, or floating point math and a library that supports the hex representation of it. Since this was added in ISO-C99 (1999) and IEEE-754 (2008), it is relatively widespread.
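
The custom-code path for unnormalized hex floating point can avoid floats entirely by treating the text as an integer mantissa and a binary exponent. The sketch below is one possible shape of such code; the function names, the accepted grammar (no fractional part, mandatory `0x` prefix), and the use of `0x1p0` for a threshold of 2^56 are assumptions of this example.

```python
import re

T_MAX = 1 << 56  # 2^56
_HEX_FLOAT = re.compile(r"0x([0-9a-fA-F]+)p([+-]?[0-9]+)")


def threshold_from_unnormalized(t: str) -> int:
    """threshold = mantissa * 2^(56 + exponent), using integers only."""
    match = _HEX_FLOAT.fullmatch(t)
    if match is None:
        raise ValueError("expected text like 0xHHHHp-16")
    mantissa = int(match.group(1), 16)
    shift = 56 + int(match.group(2))
    if shift >= 0:
        threshold = mantissa << shift
    else:
        if mantissa & ((1 << -shift) - 1):
            raise ValueError("value needs more than 56 bits of precision")
        threshold = mantissa >> -shift
    if not 0 <= threshold <= T_MAX:
        raise ValueError("probability must be in [0, 1]")
    return threshold


def unnormalized_from_threshold(threshold: int) -> str:
    """Drop trailing zero digits; the exponent is -4 per remaining digit."""
    if threshold == T_MAX:
        return "0x1p0"
    if not 0 <= threshold < T_MAX:
        raise ValueError("threshold must be in [0, 2^56]")
    digits = format(threshold, "014x").rstrip("0") or "0"
    mantissa = digits.lstrip("0") or "0"
    return f"0x{mantissa}p-{4 * len(digits)}"


two_thirds = threshold_from_unnormalized("0xaaaaaaaaaaaaaap-56")
```

With this approach the hex digits of t are the leading hex digits of the threshold, which is what makes the form readable as a threshold.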

Option D: Combination of B and C

Title: D: Combine B and C
Description: The t value can be either a value <1, in which case it is interpreted as in C (or C1, or preferably C2) above, or a value >=1, in which case it is interpreted as in B. Implementations are recommended to limit precision to keep the encoding of sampling probabilities compact.
Examples:
  Keep all: t=1
  Keep 1 in 8: t=8 or t=.125
  Keep 1 in 10: t=.1 or t=10
  Keep half: t=.5 or t=2
  Keep 2/3: t=.6667
  Keep 1 in 1000: t=.001 or t=1000
  Keep arbitrary hex-digit threshold HHHH with custom code: t=0xHHHHp-(len(HHHH)*4)
  Keep arbitrary hex-digit threshold HHHH with standard library: t=0x1.JJJJJp-DD, where JJJJJ and DD correspond to the normalized hex floating point value of HHHH. Note that JJJJJ is one digit longer than HHHH, due to shifting hex digits by one bit.
Mapping to the sampling threshold: See B and C.
Supported thresholds: The union of B and C. Hence, if t is a double-precision floating-point number, at least all thresholds whose 3 least significant bits are zero.
Mapping a supported sampling threshold to t: Depending on whether the threshold is supported by B or C, one has to choose the reverse mapping of B or C, respectively.
Advantages: The most convenient representation can be used. This allows constant user-input probabilities to be represented exactly in their original human-readable form, while it allows machine-generated probabilities to be rounded and losslessly encoded with variable precision.
Disadvantages: Requires floating point math or custom code. Parsing is slightly more complex.
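
A parser for option D has to decide which interpretation applies before mapping to a threshold. The sketch below does that with floating point; the function name is hypothetical, and rounding in the rate branch and truncation in the probability branch are assumptions of this example.

```python
T_MAX = 1 << 56  # 2^56


def threshold_from_t(t: str) -> int:
    """Values >= 1 are sampling rates (as in B); values < 1 are probabilities (as in C)."""
    if t.lower().startswith("0x"):
        value = float.fromhex(t)  # hex floating point, as in C1 and C2
    else:
        value = float(t)
    if value <= 0.0:
        raise ValueError("t must be positive")
    if value >= 1.0:
        return round(T_MAX / value)  # rate n: threshold = round(2^56 / n)
    return int(value * T_MAX)  # probability p: threshold = 2^56 * p


same_as_rate = threshold_from_t("8")
same_as_probability = threshold_from_t(".125")
```

Both spellings of "1 in 8" reach the same threshold, and t=1 is consistent under either reading because a rate of 1 and a probability of 1 both mean keep all.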

Option E: Ratio

Title: E: Ratio
Description: The t value is expressed as a ratio of two integers n and d, separated by a slash. If n is 1 it may be omitted.
Examples:
  Keep all: t=1
  Keep 1 in 8: t=8 or t=1/8
  Keep 1 in 10: t=10 or t=1/10
  Keep half: t=50/100 or t=1/2 or t=2
  Keep 2/3: t=2/3 or t=6667/10000
  Keep 1 in 1000: t=1/1000 or t=1000
Mapping to the sampling threshold: threshold = parseInt(t_numerator) * 2^56 / parseInt(t_denominator) (this formula may overflow when using 64-bit integers).
Supported thresholds: The set of supported thresholds cannot be easily described and depends on the value range of the numerator and the denominator.
Mapping a supported sampling threshold to t: Not unique and difficult, as it relates to finding the closest ratio to the value given by threshold * 2^(-56).
Advantages: Does not require floating point or complex parsing but can express a full range of sample rates.
Disadvantages: Converting from a probability to a ratio may lose precision.
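
The ratio mapping can be sketched as follows. Python integers are arbitrary precision, so the overflow noted above does not occur here, but it would need attention with 64-bit integers. The function names and the floor division are assumptions of this example, and `Fraction.limit_denominator` is used as one way to search for a nearby ratio, not as a prescribed reverse mapping.

```python
from fractions import Fraction

T_MAX = 1 << 56  # 2^56


def threshold_from_ratio(t: str) -> int:
    """Parse 'n/d', or a bare 'd' meaning 1/d, and return n * 2^56 / d."""
    numerator, slash, denominator = t.partition("/")
    if not slash:
        numerator, denominator = "1", t  # numerator 1 was omitted
    n, d = int(numerator), int(denominator)
    if not 0 < n <= d:
        raise ValueError("ratio must be in (0, 1]")
    return (n * T_MAX) // d  # assumption: floor the exact quotient


def ratio_from_threshold(threshold: int, max_denominator: int = 10000) -> str:
    """Closest ratio to threshold * 2^(-56) with a bounded denominator."""
    if not 0 < threshold <= T_MAX:
        raise ValueError("threshold must be in (0, 2^56]")
    ratio = Fraction(threshold, T_MAX).limit_denominator(max_denominator)
    return f"{ratio.numerator}/{ratio.denominator}"


two_thirds = threshold_from_ratio("2/3")  # 0xaaaaaaaaaaaaaa
```

Several texts map to the same threshold (for instance "50/100", "1/2" and "2"), which is why the reverse mapping is not unique.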

Option F: Powers-of-two

Title: F: Powers of 2
Description: The t value is expressed as the exponent of the sampling probability given by 0.5^t; see the current proposal at https://opentelemetry.io/docs/specs/otel/trace/tracestate-probability-sampling/#consistent-probability-sampling
Examples:
  Keep all: t=0
  Keep 1 in 8: t=3
  Keep half: t=1
Mapping to the sampling threshold: threshold = 2^(56 - parseInt(t))
Supported thresholds: 2^56, 2^55, 2^54, …, 1, corresponding to sampling probabilities 1, ½, ¼, …, 2^(-56).
Mapping a supported sampling threshold to t: t = encodeInt(numberOfLeadingZeros(threshold) - 7), provided that the threshold is a 64-bit integer. numberOfLeadingZeros requires a single CPU instruction.
Advantages: Simple and compact. Adjusted counts (extrapolation factors) are guaranteed to be integers. Fast mapping to thresholds and vice versa.
Disadvantages: Only supports power-of-2 sampling probabilities.
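
The power-of-two mappings reduce to a shift and a leading-zero count. Python has no fixed-width integers, so the sketch below emulates the 64-bit leading-zero count with `int.bit_length`; the function names are hypothetical.

```python
def threshold_from_exponent(t: str) -> int:
    """threshold = 2^(56 - t) for t in 0..56."""
    exponent = int(t)
    if not 0 <= exponent <= 56:
        raise ValueError("exponent must be in [0, 56]")
    return 1 << (56 - exponent)


def exponent_from_threshold(threshold: int) -> str:
    """t = numberOfLeadingZeros(threshold) - 7 for a 64-bit threshold."""
    if threshold <= 0 or threshold > (1 << 56) or threshold & (threshold - 1):
        raise ValueError("threshold must be a power of two in [1, 2^56]")
    leading_zeros = 64 - threshold.bit_length()  # emulates a 64-bit count
    return str(leading_zeros - 7)


one_in_eight = threshold_from_exponent("3")  # 2^53
```

A threshold of 2^56 has 7 leading zeros in 64 bits, which is where the constant 7 comes from.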

dyladan commented 1 year ago

Seems to me that this comes down to a couple questions:

  1. Is human readability a requirement?
  2. Do we want to prioritize base 2 or base 10 sample rates?
  3. Is it acceptable to have large gaps in possible sample rates? For example, a gap between 1 and 1/2 in solution F.
  4. How important is parsing and processing efficiency?

I would argue compactness of representation is the most important attribute, followed by parsing/processing efficiency, followed by human readability. To achieve those goals, I would support a base-2 representation such as F or A1, favoring F.

Options B and E appear to be quite similar IMO, and prioritize human readability over processing efficiency and simplicity. Particularly E introduces what I think is unnecessary processing overhead while still failing to achieve a compact representation for power of 2 sample rates.

Options C and D seem most likely to invite implementation issues. They require all participants in the trace to parse with the same precision, and the representation is still not as compact as some other options.

cartermp commented 1 year ago

The main factor to consider for human readability for me is ease of debugging when inspecting data on the wire. The questions I'd have for that are:

  1. How likely is it that someone needs to inspect this data for debugging? Imagine not just networking tools, but inspecting a value in a visual debugger like with Jetbrains IDEs --> what will the value show in tools like that?
  2. How much does this matter for the kind of people who inspect this data?

This may already be understood by folks here, but for any casual observer, we should also make it clear that the main human interaction points - configuration and reading a value from an observability backend - should prioritize human readability.

kentquirk commented 1 year ago

@dyladan F is not really a "base 2 representation" - it simply doesn't allow for many common sample rates. While one can synthesize a sample rate of 10 by alternating between 8 and 16, it's not at all obvious what's going on when you look at the resulting telemetry.

dyladan commented 1 year ago

> The main factor to consider for human readability for me is ease of debugging when inspecting data on the wire. The questions I'd have for that are:
>
>   1. How likely is it that someone needs to inspect this data for debugging? Imagine not just networking tools, but inspecting a value in a visual debugger like with Jetbrains IDEs --> what will the value show in tools like that?

I'd guess fairly unlikely unless that person is developing the SDK itself.

>   2. How much does this matter for the kind of people who inspect this data?

As an SDK developer myself I can say that I personally don't find human readability to be an important part of that process. It is "nice to have" in some cases, but I really just want to see that the sample rate sent in the header is the one I configured, and that the data is correct in the exported OTLP telemetry (which is also not human readable).

> This may already be understood by folks here, but for any casual observer, we should also make it clear that the main human interaction points - configuration and reading a value from an observability backend - should prioritize human readability.

Completely agree that configuration should prioritize human simplicity. For representation in the UI of an observability backend, that is up to each backend to figure out.

> @dyladan F is not really a "base 2 representation" - it simply doesn't allow for many common sample rates. While one can synthesize a sample rate of 10 by alternating between 8 and 16, it's not at all obvious what's going on when you look at the resulting telemetry.

Sorry I was a little cavalier with mixing "base 2" and "power of 2". F is a power of 2 strategy. Indeed it is restrictive in the values it allows, trading some flexibility for efficiency. While, as you said, it is possible to synthesize additional sampling rates, I wouldn't expect that to be common. I think it's far more likely people would just use 8 rather than introducing that much additional complexity to force 10.

oertl commented 1 year ago

Regarding debuggability, there are actually two things you would like to check, and you cannot have direct access to both:

  1. Getting the sampling probability: In this case it is easier for humans to have the sampling probability as a decimal or as a ratio on the span. In contrast, thresholds need to be multiplied by 2^(-56) first to transform them into the corresponding sampling probabilities.
  2. Understanding the sampling decision: Consistent sampling works by comparing the threshold to the random number. If the threshold is on the span, it can be directly compared to the random number that is part of the trace ID to understand why the span was sampled or not. In contrast, if the sampling probability is put on the span, depending on the proposal, it is not easy, and perhaps not always possible, to accurately determine the applied threshold to reproduce the sampling decision.

In any case, the mapping of the reported sampling probability to the threshold must be carefully specified and implemented.

kentquirk commented 1 year ago

@dyladan I see sampling configurations designed by users all the time, and I'm not sure I've ever seen 8, but I've seen 10 a lot, as well as 3, 50, 100, 1000, 10000. If we were to choose F, then we could still allow users to specify 10, but then the result they see after it flows through the pipeline is a mix of 8 and 16. I think this is explainable but both confusing and completely unnecessary.

dyladan commented 1 year ago

> @dyladan I see sampling configurations designed by users all the time, and I'm not sure I've ever seen 8, but I've seen 10 a lot, as well as 3, 50, 100, 1000, 10000. If we were to choose F, then we could still allow users to specify 10, but then the result they see after it flows through the pipeline is a mix of 8 and 16. I think this is explainable but both confusing and completely unnecessary.

I think if we selected F we would most likely encourage users to use the rates naturally provided. Of course a sampler could use more complex strategies to mimic additional rates, but that would be an option only for advanced users. From a configuration standpoint, I think most likely we would call it something like "sample factor" and each increment would halve the sample rate.

jmacd commented 1 year ago

The sampling SIG met and, having received this feedback, has a unanimous agreement on option A1.

jmacd commented 11 months ago

We consider this issue resolved. Reviewers, please endorse https://github.com/open-telemetry/oteps/pull/235 with your approvals!

jmacd commented 4 months ago

OTEP 235 has merged. :tada: