Open Oberon00 opened 3 years ago
If it is a problem, since trace IDs can come from untrusted, non-random sources, do we open up a DDoS/Security/performance issue when using trace IDs as sole, deterministic input for our sampling algorithm? Do we need to put a warning there?
If we do re-use the traceID from untrusted sources we have bigger problems than the sampler. I think that topic should be addressed independently and separately. For example you may overload the backend, you may overload the collector if you are doing a group by trace id, etc. I think there are a lot of failing scenarios when the trace id is untrusted that it deserve its own section and should be resolved at the server span level. One solution that I proposed was to simply use the incoming span context as a link instead of a parent. we can discuss more about that :)
If we do re-use the traceID from untrusted sources we have bigger problems than the sampler. I think that topic should be addressed independently and separately.
I think you are right. I will remove this item from the enumeration above. Is #1188 the right issue for that discussion?
Is this a problem at all? Or is the ParentBased approach enough (in combination with any, not necessarily trace-id-based probability-based sampling of root spans)?
I've seen implementation that use "inflationary sampling probability", and the algorithm is important for them. For example if Service A (0.2% sampling) -> Service B (0.4% sampling) and you want to end up with about 0.2 full traces and another 0.4 (at Service B level, including the Service A traces).
The algorithm that I've seen (I think @jmacd was one of the author) is that if you have the same algorithm across Services (which means across languages as well) that has a deterministic implementation, and also ensures that every trace sampled at a lower rate will be sampled at a higher rate, then you can achieve that.
traceId1 = "..." // sampled at 0.2 rate traceId2 = "..." // sampled at 0.4 rate, but not at 0.2 traceId3 = "..." // not sampled at 0.4 rate
ServiceASampler = new TraceIdBasedSampler(0.2):
traceId | result |
---|---|
1 | true |
2 | false |
3 | false |
ServiceBSampler = new TraceIdBasedSampler(0.4):
traceId | result |
---|---|
1 | true |
2 | true |
3 | false |
The algorithm that I've seen (I think @jmacd was one of the author)
FYI for the record this wasn't my doing. This "inflationary" sampling technique predated me on that project. 😀
The Go implementation of this algorithm has lead to an incompatibility Amazon XRay Trace IDs. The first 4 bytes of XRay Trace IDs are time based and the Go ratio sampler expects these bytes to be random.
It would be advantageous to the Go SIG, and likely others, if we could resolve this issue. That way, when we change our algorithm we will only need to do so once.
@Aneurysm9 had mentioned we switch to "hashing" the middle part of the trace ID to make a sampling decision.
Current Go implementation for reference:
func (ts traceIDRatioSampler) ShouldSample(p SamplingParameters) SamplingResult {
psc := trace.SpanContextFromContext(p.ParentContext)
x := binary.BigEndian.Uint64(p.TraceID[0:8]) >> 1
if x < ts.traceIDUpperBound {
return SamplingResult{
Decision: RecordAndSample,
Tracestate: psc.TraceState(),
}
}
return SamplingResult{
Decision: Drop,
Tracestate: psc.TraceState(),
}
}
@MrAlias that is why the w3c specification is actually moving to make it explicit which bytes should be random. You can see the draft here https://github.com/w3c/trace-context/blob/main/spec/20-http_request_header_format.md#trace-id
The reason we used 7 and not 8 bytes is that the 8th byte contains the sign bit. This is the same reason go has the method to generate a 63 bit random number.
@MrAlias that is why the w3c specification is actually moving to make it explicit which bytes should be random. You can see the draft here https://github.com/w3c/trace-context/blob/main/spec/20-http_request_header_format.md#trace-id
The reason we used 7 and not 8 bytes is that the 8th byte contains the sign bit. This is the same reason go has the method to generate a 63 bit random number.
Ah, super helpful! Thanks 🙏
Is the plan for OTel to just adopt this ^ when it lands?
@MrAlias that is why the w3c specification is actually moving to make it explicit which bytes should be random. You can see the draft here https://github.com/w3c/trace-context/blob/main/spec/20-http_request_header_format.md#trace-id
The reason we used 7 and not 8 bytes is that the 8th byte contains the sign bit. This is the same reason go has the method to generate a 63 bit random number.
@dyladan does this mean that the AWS XRay TraceID which uses a non psudo-random value for the left-most 4 bytes (based on the time) would not be W3C complaint?
Looking into switching the Go implementation to the mentioned algorithm, I think the XRay spans would continue to be sampled in a non-random manner given this static prefix.
cc @Aneurysm9
Trace IDs are 16 bytes, so there should be no issue.
| 0 1 2 3 | 4 5 6 7 | 8 | 9 a b c d e f |
A B C D
X-Ray IDs will have non-random data in region A
(bytes 0-3) and random data in regions B
, C
, and D
(bytes 4-f). The w3c proposal is to guarantee that region D
(bytes 9-f) contain random data. The current implementation uses regions A
and B
, shifted right by one place so it is effectively bytes 0-6:
x := binary.BigEndian.Uint64(p.TraceID[0:8]) >> 1
Instead, we could use region D
directly:
x := binary.BigEndian.Uint64(p.TraceID[9:])
This would also make the sampler safe to use with 64-bit trace IDs still generated by some legacy systems.
I am expecting us to use the W3C trace context "random" flag to address this issue: https://github.com/w3c/trace-context/pull/474 @dyladan can you provide guidance?
We spoke about this at the w3c meeting yesterday actually. I was going to bring it up at the next maintainers meeting. The level 2 spec for trace context was delayed by some extended summer vacations but is about to go into wide review for publication as a recommendation. Obviously until something is an official recommendation the working group can't guarantee anything, but we do not expect any major changes. Here are the important points:
@dyladan does this mean that the AWS XRay TraceID which uses a non psudo-random value for the left-most 4 bytes (based on the time) would not be W3C complaint?
No. The randomness requirement only applies to traceparents where the random
flag is set to 1
for backwards compatibility reasons. Also, the nonrandom bits for xray are the leftmost bits and the flag only applies to the rightmost bits. I am not sure how the rest of the ID is generated.
edit: Missed that @Aneurysm9 clarified the region in question is random.
@dyladan can you provide guidance?
I would feel safe using the rightmost 7 bytes as my random number, and the fewer rightmost bytes that are used the safer I would feel. Hashing the random part of the ID or the whole ID would be another way to guarantee safety, but also comes at the cost of implementation complexity (and ensuring all implementations are the same).
I would probably recommend to restrict to inverse power of 2 sampling probabilities (1/2, 1/4, 1/8, etc) which would allow you to use the minimum number of bits without hashing. For example, a 50% sampling rate only needs the single rightmost bit, a 25% sampling rate needs only the rightmost 2 bits, etc. This has also been discussed to have other benefits with respect to @jmacd's and @oertl's probability propagation proposal.
edit: alternative to power of 2 restriction would be restricting probability to a whole number percent. Only 7 bits are required to represent every whole number from 1 to 99
@dyladan Thank you. I am pleased to see that we are nearly ready to adopt the random bit for w3c traceparent. Those familiar with the current tracestate-based proposal for probability sampling may remember that we would be able to eliminate the r
variable once we had a consistent random source. Meanwhile, the p
variable would still compactly propagate the power-of-two sampling probabilities (for use in span-to-metrics pipelines), but that if we had "plenty" of random bits available we could easily extend that specification to accommodate propagating non-powers-of-two sampling probabilities.
If we have 7-bytes of randomness, it means we can agree to a consistent method to evaluate TraceIDRatioBased sampling policy like the OTel-Go example in https://github.com/open-telemetry/opentelemetry-specification/issues/1413#issuecomment-1244007561.
p value propagation I hope is next in level 3. for now it was agreed to leave it out of the level 2 spec because the r value was less controversial and less likely to get held up in wider review
We discussed this in the 9/22 Sampling SIG. The action items were loosely discussed and will continue in the next SIG (10/6).
My opinions, roughly, are:
The same decision would be returned by a Consistent Probability Sampler as in the experimental specification here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md.
Moreover, the current experimental specification can be updated to rely on the W3C randomness bit, which is a huge improvement for us (thank you W3C TraceContext group, thank you @dyladan!) as follows:
T
equals a power-of-two (i.e., has one bit set)T
is not a power-of-two, encoded presumably in hexadecimal. To express a sampling rate like 3-in-4, we want a threshold like T=c0000000000000
(which makes me want to drop trailing zeros of the threshold, so 3-in-4 could be encoded as just T=c
).cc: @oertl @spencerwilson @PeterF778 @kalyanaj @dyladan
Thanks for the update. @jmacd mind sharing what the reasoning was behind using 56 bits? Is it just because that is the number the W3C already has in the draft spec or was there some need for sampling thresholds with that level of randomness?
I do not believe anyone requires 56-bits of sampling precision. I'm interested in what others think is a good value, maybe 16 or 20 bits will do.
@jmacd
I do not believe anyone requires 56-bits of sampling precision. I'm interested in what others think is a good value, maybe 16 or 20 bits will do.
I am not sure if 16 bits are enough if really arbitrary sample rates should be supported. With 16 bits, the smallest possible sampling rate would be 1/2^16 = 0.00001525878 and the second smallest possible sampling rate would be 2/2^16 = 0.00003051757. There is a large relative gap between these two sampling rates.
@jmacd
- When a W3C tracecontext w/o the new random flag is sampled, SDKs should use an unspecified hashing algorithm on the TraceID to construct 56 questionably-random bits. They should not expect consistent sampling, in this case, and should restrict usage to root-only sampling.
If the hashing algorithm is not specified, the SDKs can simply use any random number. Using the trace ID is not very useful in this case.
For an assessment on the lower end of needed probabilities, let's look at one example. Google handles about 100 billion requests daily. If we design long term storage for traces which decreases traces cardinality as the data gets older, and keeps only 1000 traces per day (for example, for data older than 5 years), we need probability of about 2^-28. So, I'd say, we need at least 31 random bits, and this is still playing with chances.
When a W3C tracecontext w/o the new random flag is sampled, SDKs should use an unspecified hashing algorithm on the TraceID to construct 56 questionably-random bits.
When the new random flag is NOT set, couldn't we still require SDKs to use the SAME hashing algorithm (instead of an unspecified algorithm)? i.e., a best effort in treating that the same last set of bytes in traceID is random...
Reasoning: My understanding is that the TraceID generated by many systems today, though not required by the Level 1 of the W3C TraceContext spec, have their rightmost bytes randomly generated. So, shouldn't we do a "best effort" consistent probability sampling as adoption of the new flag can take a while (W3C TraceContext level 2 spec has to get to recommendation stage, implementations have to adopt it etc.).?
This was discussed by me @oertl @PeterF778 @kalyanaj and @kentquirk in the Sampling SIG today. Notes:
The W3C random
flag is anticipated to go ahead, but we also expect it could be a year-or-so before it is deployed in OTel SDKs.
We debated whether 56-bits, 48-bits, or 32-bits of randomness would be preferred. There is not a strong preference between 48 and 56, but we think 32 bits is not sufficient.
There was a brief question of whether we might wish to reserve some (e.g., 8) bits of the TraceID for future/alternative use on the assumption that 16-bytes is more than sufficient for global uniqueness (provided 48 bits are truly-random bits). For example, we could directly encode today's powers-of-two-sampling r-value
using 6 bits of the TraceID, which drastically reduces the amount of randomness required per trace for consistent probability sampling. (Why: the step to generate r-value
uses 2 (expected) random bits per TraceID.). On the other hand, with W3C support we could simply append extra bytes to the traceparent
header to reduce the cost of randomness by a logarithmic factor.
We discussed hashing approaches and were reminded why we don't like them (they're expensive, faulty, and not portable, see https://github.com/rurban/smhasher).
We discussed how to test for suitably random TraceIDs, it could follow this previous work: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md#appendix-statistical-test-requirements
We arrived at the following recommendation to address this issue. Assume that the least-significant 7 bytes of the TraceID are random _as though the anticipated W3C random
flag were set. Thus, we will generate a 56-bit number and follow exactly the recommendation made by @Aneurysm9, i.e.,
This is compatible with X-ray and we believe it is compatible with all existing OTel SDKs. The work remaining for this issue, if the proposal is accepted, will be to update the Trace SDK specification with details. The TraceIDRatioBasedSampler.ShouldSample() logic uses
traceThreshold := uint64(samplingProbability * (1 << 57))
func ShouldSample() bool {
traceValue := binary.BigEndian.Uint64(p.TraceID[9:])
return traceValue < traceThreshold
}
The group reasons that this is no worse than doing nothing at all, and assuming that the W3C proposal does not change this is also forward-compatible.
This traceThreshold
shown above can be used to convey non-power-of-two adjusted counts. They could be propagated (as described in my comment above) using a tracestate t
-value with an update to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md.
During our group meeting on Oct 20, @PeterF778, @kalyanaj, @spencerwilson and @kentquirk identified some use cases which are easier to handle if sampling decisions are based on r-value generated independently from the trace-id. One is with tracking user sessions and the other is with linked traces.
Assuming non-instrumented browser, user sessions involve several requests to some backend, each generating a new trace. It is beneficial to keep all these traces consistently sampled. This can be achieved by generating the r-value for the first request as usual, but reusing it for all traces belonging to the same session. This requires a mapping from the session-id to the r-value, which can be technically challenging, but should be feasible.
Linked traces can be used in a number of ways, but one typical use case is when one trace leaves a message in queue to be picked up by another trace hours or days later. The root span of the consuming trace links itself to the producing trace. Again, it is beneficial to make the same sampling decisions for both traces. This can be helped by the consuming trace cloning the r-value from the producing trace.
@PeterF778 would you agree that would be possible for an SDK to do what you described by fixing the 7 bytes of random TraceID and then generating multiple correlated TraceIDs from the single random source?
Can you see any problems that might result from avoiding r-value, in that case?
@PeterF778 would you agree that would be possible for an SDK to do what you described by fixing the 7 bytes of random TraceID and then generating multiple correlated TraceIDs from the single random source?
Can you see any problems that might result from avoiding r-value, in that case?
In the SDK trace-id is generated automatically when the root span is created. There are no mechanisms that could be used to customize this behavior. In contrast, the r-values are created within calls to Sampler.shouldSample() where span attributes are available.
Reusing the 7 random bytes of trace id remains a theoretical possibility, but it would be very hard to implement. Even if we could somehow get it to work, such a change could break some vendors' features if they assume uniqueness of these 7 bytes.
@jmacd Is this something for sampling SIG?
See also discussion on https://github.com/open-telemetry/opentelemetry-specification/pull/1412#issuecomment-776101910
The sampling alorithm for TraceIdRatioBasedSampler is unspecified. As a result, trace IDs that are sampled by some implementations might get non-sampled or re-sampled by SDKs in other languages, even though they have the same or a a higher probability than the parent.
TODO list for this issue:
If it is a problem, since trace IDs can come from untrusted, non-random sources, do we open up a DDoS/Security/performance issue when using trace IDs as sole, deterministic input for our sampling algorithm? Do we need to put a warning there?Let's assume it is no problem for this issue, this should be handled in #1188.