nandusmart opened this issue 6 years ago
Firstly, Sleuth makes the sampling decision at the start of the first transaction, which means it is consistent per trace. Also, Sleuth will in the near future move to Brave, which has parameterized sampling, for example based on HTTP properties: https://github.com/openzipkin/brave/tree/master/instrumentation/http#sampling-policy
We are almost guaranteed to add other types of sampling to Brave, and there are already ways to sample based on annotations and such. Feel free to hop on Gitter to discuss more: https://gitter.im/openzipkin/zipkin
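For illustration, parameterized sampling keyed on HTTP properties could look like the sketch below. The class and method names here are made up for this example, not Brave's actual API (the real hooks are behind the sampling-policy link above); the point is the shape of the decision: per-request, based on HTTP properties, with a "no opinion" result that defers to the default rate-based sampler.

```java
/**
 * Hedged sketch of HTTP-property-based sampling. The class and method
 * names are illustrative only, not Brave's actual API. The idea: decide
 * per request from HTTP properties, and return null to mean "no opinion,
 * fall back to the default (rate-based) sampler".
 */
public class PathBasedSampler {

    /** Decide based on method and path; null defers to the default policy. */
    public Boolean trySample(String method, String path) {
        if (path.startsWith("/health")) return false;       // never trace health checks
        if (path.startsWith("/api/checkout")) return true;  // always trace critical flows
        return null;                                        // defer to default sampling
    }
}
```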
Server-side things:
COLLECTOR_SAMPLE_RATE is a consistent function of the trace ID: https://github.com/openzipkin/zipkin/tree/master/zipkin-server#environment-variables
There's also some discussion about an agent (https://github.com/openzipkin/zipkin/issues/1778) which would externalize sampling out of your app, though perhaps only to host scale.
There's also a mostly abandoned Spark Streaming job: https://github.com/openzipkin/zipkin-sparkstreaming
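To illustrate why a sample rate that is a consistent function of the trace ID keeps traces whole, here is a small sketch (illustrative only, not Zipkin's actual collector code): because the verdict depends only on the trace ID, every span of a trace gets the same keep-or-drop decision, no matter which collector instance receives it.

```java
/**
 * Sketch of consistent trace-ID-based sampling (illustrative, not
 * Zipkin's actual implementation). The decision is a pure function of
 * the trace ID, so all spans sharing a trace ID are kept or dropped
 * together, even across independent collector instances.
 */
public class TraceIdSampler {
    private final long boundary;

    public TraceIdSampler(float rate) {
        if (rate < 0f || rate > 1f) {
            throw new IllegalArgumentException("rate must be in [0, 1]");
        }
        // Traces whose (absolute) ID falls at or below this boundary are kept.
        this.boundary = (long) (Long.MAX_VALUE * rate);
    }

    /** Same answer for every span that carries this trace ID. */
    public boolean isSampled(long traceId) {
        // Guard the Long.MIN_VALUE edge case: Math.abs(Long.MIN_VALUE) overflows.
        long t = traceId == Long.MIN_VALUE ? Long.MAX_VALUE : Math.abs(traceId);
        return t <= boundary;
    }
}
```

This assumes trace IDs are uniformly distributed, which is why comparing against a fixed boundary approximates the configured rate.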
Today, the collector sampler samples spans as they are received by the Zipkin collector, based on the rate. But in a microservices architecture, where a single transaction traverses multiple applications and spans, sampling on individual spans can cause us to lose the whole picture of a transaction.
So we are looking for an ideal solution where Zipkin itself (not the app's Sleuth implementation) has options to sample based on the trace (i.e. a whole transaction with all its spans) rather than on individual spans. We would also expect this to be a more asynchronous sampling technique.
Looking forward to your thoughts...