Open semistrict opened 6 years ago
I would suggest instead always tracing, and having "sampling" determine only whether the span is reported/stored. Assuming the language/framework's span creation is lightweight, it makes sense to simply create the span and, if certain conditions are met (like it took over X ms to complete or an exception was thrown), sample/report the span.
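A minimal sketch of that idea, in Python pseudocode (the `Span` class, the `exporter` list, and the threshold constant are all illustrative, not part of any OpenCensus API): every span is created, but the export decision is deferred until the span finishes.

```python
import time

LATENCY_THRESHOLD_MS = 100  # hypothetical "X ms" threshold; tune per service


class Span:
    """Always created; whether it is reported is decided only at finish time."""

    def __init__(self, name):
        self.name = name
        self.start = time.monotonic()
        self.error = False
        self.exported = False

    def finish(self, exporter):
        elapsed_ms = (time.monotonic() - self.start) * 1000
        # Deferred sampling decision: report only if the span was slow
        # or recorded an error.
        if self.error or elapsed_ms > LATENCY_THRESHOLD_MS:
            exporter.append(self)
            self.exported = True


exported = []

ok = Span("fast-call")
ok.finish(exported)       # fast and error-free: dropped

bad = Span("failed-call")
bad.error = True
bad.finish(exported)      # errored: reported even though it was fast
```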
We plan to add this to opencensus-erlang
and it would be nice to be a part of the spec in some way.
This only works on the Span in the current process. If you really want to get the whole trace, you need some way to tell your caller to start tracing as well. For gRPC, I can see this fitting in response metadata (AFAIK these are internally HTTP trailing headers, so those might work for HTTP too, although I don't know how well that would be supported by HTTP libraries).
Yea, not sure how well that'd work, but sounds possible.
Another option would be for spans to be pulled instead of pushed: something collects spans, and if a span's trace is enabled later, the collector requests it (along with any other spans a process wants to report). If that makes sense.. basically like Prometheus, but where the pull request could also include trace IDs.
For a description of exemplars, see: https://www.youtube.com/watch?v=U72b4Nl0Ftw
Internally in the team we decided to have @g-easy and @sebright as owners of this feature. Expect a design proposal soon.
@tsloughter Tom Wilkie had a go at a Prometheus-inspired pull-based tracing. I've not tried it; I got the impression he didn't entirely convince himself that it was a good approach. https://www.weave.works/blog/distributed-tracing-loki-zipkin-prometheus-mashup/
It seems reasonable to limit this to http2/grpc. You could also, say, have thresholds on latency/response code that would guarantee collection (with some upper bound, I guess)?
re: pull-based tracing
How would the puller know the correct transitive closure of nodes to pull spans from? That closure can be wildly different per trace.
Also, the implications are that the entire network of nodes need to store all spans for some predetermined period. That predetermined period needs to be the same across the realm of nodes being traced. That seems untenable.
Any design for retroactively exporting interesting spans will I think require nodes to retain spans somewhere long enough for the sampling decision to be made.
Luckily, it's all best-effort so we can have a fixed-size buffer per node and just store as much as possible within that fixed size. Or, we could sample just at a much higher rate.
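A fixed-size, best-effort per-node buffer like that could be sketched as follows (the `SpanBuffer` class and the `(trace_id, name)` span tuples are illustrative assumptions, not an existing API): the oldest spans are evicted first, and spans are retroactively exported only if their trace ID was later marked interesting.

```python
from collections import deque


class SpanBuffer:
    """Best-effort fixed-size buffer: when full, the oldest spans are evicted."""

    def __init__(self, max_spans):
        self.spans = deque(maxlen=max_spans)

    def add(self, span):
        self.spans.append(span)

    def export_matching(self, trace_ids):
        # Retroactive export: return only spans whose trace was later
        # flagged as interesting; everything else ages out silently.
        return [s for s in self.spans if s[0] in trace_ids]


buf = SpanBuffer(max_spans=3)
for span in [("t1", "a"), ("t2", "b"), ("t3", "c"), ("t4", "d")]:
    buf.add(span)

# ("t1", "a") was evicted when the fourth span arrived, so even though
# t1 is requested, only t4 can still be exported.
hits = buf.export_matching({"t1", "t4"})
```

Because the buffer is bounded, the retention window is whatever the node's traffic rate allows, which is exactly the best-effort trade-off described above.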
I have been thinking about a way of doing this that leverages existing central storage systems. Instead of moving to a pull-based approach, store somewhere central (e.g. memcached, redis, a database) a set of bloom filters for interesting traces, one per 10-second interval for the last 120 seconds (for example).
When you want to mark a trace as "interesting" and to be exported (for example, if an error occurs) you add the trace ID to the active bloom filter.
All nodes periodically read the bloom filters and export any spans with matching trace IDs.
Old bloom filters can just be dropped. A new bloom filter should be created as the active one for each new 10s period.
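The windowed-bloom-filter scheme above could look roughly like this (a toy in-process sketch; a real deployment would keep the filters in the shared store and tune the filter size and hash count, and all names here are assumptions for illustration):

```python
import hashlib

NUM_WINDOWS = 12  # 12 x 10s windows = the last 120 seconds, per the example


class BloomFilter:
    """Tiny bloom filter; membership tests can false-positive but never
    false-negative, which is fine for best-effort trace export."""

    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = bytearray(size)

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))


# One filter per 10s window, newest last; old windows are simply dropped.
windows = [BloomFilter() for _ in range(NUM_WINDOWS)]


def mark_interesting(trace_id):
    windows[-1].add(trace_id)  # add to the active (newest) filter


def is_interesting(trace_id):
    # Nodes periodically read all windows and export matching spans.
    return any(trace_id in w for w in windows)


def rotate():
    # Called every 10 seconds: drop the oldest window, start a fresh one.
    windows.pop(0)
    windows.append(BloomFilter())


mark_interesting("trace-abc")
```

Nodes only ever append to the active filter and read the full set, so the central store sees small, bounded objects rather than the spans themselves.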
Through trace sampling, we might miss important traces that don't occur very frequently, for example traces leading to error conditions or high latency.
We should provide a facility for starting tracing later during request processing when we detect an error or other interesting condition. We should rate limit this at the source to avoid cascading failure.
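One way to sketch that source-side rate limit is a token bucket in front of the "promote this request to traced" decision (the `TokenBucket` class and its parameters are illustrative assumptions, not a proposed API):

```python
import time


class TokenBucket:
    """Caps how many requests per second a node may retroactively promote
    to full tracing, so an error storm cannot become a tracing storm."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens at the configured rate, up to the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


limiter = TokenBucket(rate_per_sec=1, burst=2)

# Five errors arrive in quick succession: only the first two (the burst)
# get promoted to traced; the rest are dropped at the source.
decisions = [limiter.allow() for _ in range(5)]
```

Dropping the decision at the source, before any spans are exported, is what prevents the cascading-failure scenario mentioned above.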