
Tokio Trace: A scoped, structured logging system #561

Closed. carllerche closed this issue 5 years ago.

carllerche commented 6 years ago

Summary

Add a new component, Tokio Trace, which provides a scoped, structured logging system. It is scoped in that each event tracks its parent. It is structured in that events are a set of key / value pairs carrying the associated event information.

Tokio Trace would provide the foundation upon which Tower can build request-scoped tracing, and upon which integration with systems like OpenTracing or other instrumentation systems can be built.

Motivation

Tracking the logic flow through an asynchronous system like Tokio can be quite challenging. Since individual tasks are multiplexed on the same thread, associated events and log lines are intermixed, making it difficult to trace the logic flow.

There are also complementary goals that can be accomplished with a well-built event foundation. For example, metrics / instrumentation can be tracked by observing emitted events.

There is plenty of precedent upon which to draw. This proposal draws heavily from tracing systems like Dapper, Zipkin, and OpenTracing.

Goals

Tokio Trace would provide an API that allows libraries to emit information around events. Subscribers could then be attached to process these events. The API needs to have low enough overhead to allow including debug-level events, and it must avoid adding any overhead when no subscribers care about the information.

Event scoping

Tokio Trace will provide a tracing-style API. Roughly, this means that events have a start and an end point, and that events can be nested.

For example, imagine an HTTP request that requires a database query to build the response. An event would be processing the HTTP request. This would start when the request is received and complete when the response has been written to the socket. The database query is a child event and is fully contained by the HTTP request event. The nesting should be tracked so that subscribers can rebuild the structure.
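
As a rough sketch, assuming the span! macro and enter method that appear later in this proposal, the nesting might look like this:

// Hypothetical sketch only; the real API shape is TBD.
let request_span = span!("http_request", method = "GET", path = "/users");
request_span.enter(|| {
    // Work here is scoped to the HTTP request span.
    let query_span = span!("db_query", table = "users");
    query_span.enter(|| {
        // The query runs as a child of `http_request`.
    });
    // Write the response, still inside the request span.
});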

Causality

Besides just parent / child relationships, there is also a need to track more complex causality. One example is the h2 crate. The HTTP 2.0 protocol allows request / response exchanges to be multiplexed on a single TCP socket. To implement this, a dedicated task is spawned to manage the TCP socket. Each HTTP 2.0 request / response exchange interacts with the TCP socket task via message passing. Because of this structure, there is no clear parent / child relationship between the logic processing the request / response exchange and the logic that does the actual TCP work for that request / response exchange. However, we still want to track the causality to aid in debugging.

To support this goal, events may have an arbitrary number of follows_from annotations. This creates a relationship between two events from separate traces.
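
As a purely hypothetical sketch of what such an annotation might look like at the callsite (exchange_span is assumed to be the span for the request / response exchange, and follows_from is a placeholder name):

// The socket task's span is not a child of the exchange's span,
// but it follows from it.
let io_span = span!("tcp_socket_write", bytes = 1024);
io_span.follows_from(&exchange_span);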

Structured logging

Traditionally, logging systems use text strings as log entries. This is simple, but loses a lot of the event detail. Structured logging represents events as a set of key / value pairs, maintaining the original type of the event data. Tokio Trace will use such a system. For example:

let s = span!("my_event", foo = 123, bar = "wut", message = "hello world");
// Do work, including child spans
drop(s);

Values will most likely be limited to a set of rust types found in std or core.
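
For illustration, such a closed set could be modeled as an enum over a few std / core types (this exact shape is an assumption, not a settled design):

// Hypothetical closed set of supported value types.
enum Value<'a> {
    I64(i64),
    U64(u64),
    Bool(bool),
    Str(&'a str),
}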

Design requirements

Tokio Trace is split up into multiple crates: tokio-trace-core, tokio-trace, and tokio-trace-subscriber.

API stability

The tokio-trace-core crate must provide a stable API. When a library changes its tokio-trace-core dependency version, it is considered a breaking change for that library.

The other crates, tokio-trace and tokio-trace-subscriber, may release breaking changes at a faster pace.

Because of this, the tokio-trace-core crate's initial release must be the bare minimum needed to function. It must also be designed with forward compatibility in mind.

Performance

It is critical that tokio-trace-core adds minimal overhead to instrumented libraries. Subscribers must opt-in to instrumented events. If no subscriber opts in to an instrumented event, the cost of that point of instrumentation should be as close to zero as possible.

Even when subscribers opt into events, instrumentation points should be as cheap as possible.

Regarding performance, it is assumed that hitting instrumentation points is the hot path and creating a subscriber is uncommon (most likely once at the beginning of the process).
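
A minimal sketch of one way to keep disabled instrumentation points cheap, assuming a per-callsite flag that subscribers flip when they register interest (all names here are hypothetical):

use std::sync::atomic::{AtomicBool, Ordering};

// One static per instrumentation point; a macro would generate this.
static CALLSITE_ENABLED: AtomicBool = AtomicBool::new(false);

fn instrumented_operation() {
    // A single relaxed load on the hot path. Building and dispatching
    // the event only happens if some subscriber opted in.
    if CALLSITE_ENABLED.load(Ordering::Relaxed) {
        // build the event and push it to subscribers
    }
    // ... the actual work ...
}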

API

The specifics of the API are still very much up in the air. A sketch can be found here.

Given the goals stated above, events must have a start and an end. Given the asynchronous nature of working with Tokio, this will require a value that can be passed around into futures. Also note that spans need to hook into a future's poll function:

use futures::{Async, Future, Poll};

struct MyFuture {
    // future state / data
    span: tokio_trace::Span,
}

impl Future for MyFuture {
    type Item = ();
    type Error = ();

    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
        // Enter the span for the duration of this poll so that any
        // events emitted by the inner work are recorded inside it.
        self.span.enter(|| {
            // Do the actual future work; the closure's return value
            // (a `Poll`) is passed through by `enter`.
            Ok(Async::Ready(()))
        })
    }
}

A higher-level combinator API can be provided:

my_future.instrument(/* args here */)
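
A sketch of what such a combinator could return, reusing the hypothetical Span type from above (the Instrumented name and all details are assumptions):

use futures::{Future, Poll};

// Wraps a future and enters the span on every poll.
struct Instrumented<F> {
    inner: F,
    span: tokio_trace::Span,
}

impl<F: Future> Future for Instrumented<F> {
    type Item = F::Item;
    type Error = F::Error;

    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
        // Borrow the fields separately so the closure can poll the
        // inner future while the span is entered.
        let inner = &mut self.inner;
        self.span.enter(|| inner.poll())
    }
}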

Again, the details are TBD, but the general goal is to incur as little overhead as possible when there are no subscribers to an event, or when a subscriber exists but does not care about certain aspects, such as building a trace.

Subscribers

One or more subscribers need to be registered to receive events. A subscriber receives events and processes them in whatever way is appropriate. The goal is for the tracing API used by libraries to do no work itself, instead punting the work to subscribers.

A subscriber should be able to subscribe to a subset of events, for example the subscriber might decide to only receive events from the h2 module.
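
For illustration only, a filtering hook might look roughly like this (none of these names are settled):

// A hypothetical filtering interface and a subscriber that only opts
// in to events originating in the `h2` crate.
trait Subscriber {
    fn enabled(&self, target_module: &str) -> bool;
    fn record(&self, target_module: &str, message: &str);
}

struct H2Only;

impl Subscriber for H2Only {
    fn enabled(&self, target_module: &str) -> bool {
        target_module.starts_with("h2")
    }

    fn record(&self, _target_module: &str, message: &str) {
        println!("[h2] {}", message);
    }
}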

Things a subscriber might do: log events to stdout or a file, aggregate metrics, or associate process-local traces with a distributed tracing system.

Integrating higher up in the stack

The Tokio Trace crate would provide the necessary infrastructure for emitting and subscribing to events. It is then up to libraries to use the API to emit events.

Tokio itself would instrument its APIs as a starting point. Once the API and performance are deemed acceptable, there would be an effort to instrument higher levels of the stack.

Tower

Tower is a request / response abstraction and a middleware stack. This would allow instrumenting events at the request / response level. The request / response point is where most existing tracing systems start.

Hyper

Hyper is an HTTP client and server implementation and would be able to provide valuable event data. It would be able to instrument the exact start and end points of the HTTP request.

Conclusion

This issue represents my current thinking on Tokio Trace and the direction that I hope it goes. It is mostly a brain dump, but is a good starting point to get the discussion going.

Instrumentation and visibility are pretty critical to production applications, and I think getting some solid infrastructure in place is an important next step.

davidbarsky commented 6 years ago

I really like this proposal! Thanks for sharing it. I have two questions I'd like to clarify:

  1. My view and experience with distributed tracing systems is that a trace can be derived from an event (or span, as this proposal calls them) if a request ID is present, but not vice versa. In that case, would you expect the subscriber to be responsible for converting a wide span into a narrower metric?

  2. The span! macro is better than what I had implemented. Question about this phrase: “Values will most likely be limited to a set of rust types found in std or core”. An idea I had is that any type implementing serde::ser::Serialize could be inserted into a span. Is that a reasonable approach?

carllerche commented 6 years ago

@davidbarsky

Re 1.

This proposal is primarily for process-local tracing, not distributed. However, I expect distributed tracing to be able to be implemented on top of it. You would need a combination of a subscriber that is distributed tracing aware AND a tower middleware that extracts the request ID from the request and informs the subscriber. At that point, the subscriber would have linked the process-local trace w/ a distributed trace request ID.

If no request ID is provided, then the process-local trace cannot be associated with a distributed trace. However, there is still utility for debugging, logging, metrics, etc...

I also expect there to be multiple subscribers in a given process. For example, you would have your distributed tracing subscriber that associates the process-local trace w/ a request ID and ships it out, and you would also have your metrics aggregator, that just aggregates metrics like load, request rate, error rate, etc... and sends them to prometheus, and finally maybe a logging subscriber that generates logs to a file (or whatever logging aggregator).
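
For illustration, the flow could look roughly like this (every name here is hypothetical; the real version would live in a Tower middleware):

use std::collections::HashMap;

// Extract a distributed trace ID from the request headers and attach
// it to the process-local span so a tracing-aware subscriber can link
// the two traces.
fn handle_request(headers: &HashMap<String, String>) {
    let span = span!("http_request");

    if let Some(request_id) = headers.get("x-request-id") {
        // `tag` is a hypothetical method for attaching a key / value
        // pair to an existing span.
        span.tag("request_id", request_id.as_str());
    }

    span.enter(|| {
        // ... call the inner service ...
    });
}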

Re 2.

Values will most likely be limited to a set of rust types found in std or core

The main issue is subscribers are generic and there will be a level of dynamism between event emitting and subscribers. In other words, there will most likely be a Subscriber trait and emitting events will push into it via a trait object. This means that you cannot (easily) have generics as part of the API.

any type implementing serde::ser::Serialize could be inserted into a span

Definitely interesting, but tricky due to generics. I would ask what the goal is and we can see what is possible.

The difficulty is with generics.

Let's assume that the Subscriber trait looks something like this (note: this API is not at all the direction we will take; it is only for illustration):

trait Value { /* ... */ }
trait Subscriber {
    // The generic method parameter means `Subscriber` cannot be made
    // into a trait object.
    fn tag_span<T: Value>(&self, id: usize, key: &'static str, value: T);
}

The problem is that Subscriber is not trait object safe. So, one way to make it work would be:

trait Value { /* ... */ }
trait Subscriber {
    // Taking a trait object restores object safety for `Subscriber`,
    // but now `Value` itself must be object safe.
    fn tag_span(&self, id: usize, key: &'static str, value: &Value);
}

But then Value as a trait needs to be trait object safe.

However, serde's Serialize trait is not trait-object safe because its serialize method takes a generic Serializer parameter.

There are ways to work around this, but they are pretty heavy (lots of allocations).
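
For illustration, one such workaround is to erase the generic by serializing eagerly at the callsite; serde_json is used here only as an example, and the names are hypothetical:

use serde::Serialize;

trait Subscriber {
    // Object safe: the value arrives pre-serialized as a `String`.
    fn tag_span(&self, id: usize, key: &'static str, value: String);
}

fn tag<T: Serialize>(sub: &dyn Subscriber, id: usize, key: &'static str, value: &T) {
    // Allocates a `String` for every tagged value, even if no
    // subscriber ends up using it -- this is the heavy part.
    let serialized = serde_json::to_string(value).unwrap_or_default();
    sub.tag_span(id, key, serialized);
}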

seanmonstar commented 6 years ago

The structured logging part sounds very similar to the proposed addition to the log crate: https://github.com/rust-lang-nursery/log/issues/149

They're exploring how to have generic key-values, while still being object safe and reducing allocations.

davidbarsky commented 6 years ago

@carllerche:

This proposal is primarily for process-local tracing, not distributed.

Gotcha! I've heard the phrase “event” used in this context—it threw me off for a bit. It might help to settle upon a shared vocabulary.

However, I expect distributed tracing to be able to be implemented on top of it. You would need a combination of a subscriber that is distributed tracing aware AND a tower middleware that extracts the request ID from the request and informs the subscriber. At that point, the subscriber would have linked the process-local trace w/ a distributed trace request ID.

Makes sense! You're right that it's a high-level concern, depending on middleware to function correctly.

If no request ID is provided, then the process-local trace cannot be associated with a distributed trace. However, there is still utility for debugging, logging, metrics, etc...

Strongly agree—I've found that it's super useful to emit a single event/span/whatever per unit of work.

I also expect there to be multiple subscribers in a given process. For example, you would have your distributed tracing subscriber that associates the process-local trace w/ a request ID and ships it out, and you would also have your metrics aggregator, that just aggregates metrics like load, request rate, error rate, etc... and sends them to prometheus, and finally maybe a logging subscriber that generates logs to a file (or whatever logging aggregator).

👍🏻

The main issue is subscribers are generic and there will be a level of dynamism between event emitting and subscribers. In other words, there will most likely be a Subscriber trait and emitting events will push into it via a trait object. This means that you cannot (easily) have generics as part of the API [code samples omitted]...But then Value as a trait needs to be trait object safe.

That's a good point—I didn't consider object safety here.

I would ask what the goal is and we can see what is possible.

My goal would be to provide some sort of trait indicating that a type can be recorded in a span. For instance, being able to trivially insert headers/URI from http into a span would be really useful in that it provides a lot of debugging context and information. Serde seems popular and prolific enough that it might serve that purpose.

@seanmonstar:

The structured logging part sounds very similar to the proposed addition to the log crate: rust-lang-nursery/log#149

Yeah! I think the biggest difference is that a span would correspond to a single unit of work, while a log doesn't. My mental model—and my employer's—align pretty closely with Honeycomb's.

They're exploring how to have generic key-values, while still being object safe and reducing allocations.

Naturally—it's worth paying close attention to.

Thomasdezeeuw commented 6 years ago

I think it would be great if this wouldn't be Tokio specific, but the default for the entire Rust ecosystem. As @seanmonstar suggested, a crate like log (or maybe a similar generic crate) would be a good place for this.

carllerche commented 6 years ago

The implementation won’t be Tokio specific, but it will live (at least initially) under the Tokio umbrella.

We can worry about extracting / crate name logistics once the crate exists.

austbot commented 6 years ago

@carllerche how can the community help? Where do we start?

carllerche commented 6 years ago

@austbot Great question. This issue represents a brain dump. I also put together an initial sketch (linked in the original post).

What I would suggest:

1) Join the dev Gitter: https://gitter.im/tokio-rs/dev
2) Use what is in this issue to start experimenting with what a tokio-trace crate could be.
3) Ask questions in Gitter and here. I can iterate on the original post to clarify the design and scope.

@davidbarsky has also expressed interest. I'm not sure what he has done so far. Maybe you two can talk in Gitter some. I think that at this point it would be worth having multiple people experiment with different ideas. We can then look at the various strategies and merge the good ideas together.

davidbarsky commented 6 years ago

Sorry for the delay in responding—interns started last week, and supporting them is a lot of effort.

The main thing I've done so far is look at prior art in tracing systems like OpenCensus and Jaeger with regard to sampling in trace collection. It appears they support application-level sampling, which might change some parts of this design.

carllerche commented 5 years ago

Some metrics to expose:

When building applications with tokio, it can be hard to understand reactor/executor behavior at runtime.

I want to be able to ask a few questions of the runtime like the following:

It's important that these values are not time-bounded or otherwise pre-rated, so that they fit well into time series -- by tracking totals, we can determine the rate at read time without having to configure the runtime with a window...


Copied from: #147
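
A minimal sketch of the totals-not-rates idea described above (the counter name is made up):

use std::sync::atomic::{AtomicU64, Ordering};

// A monotonically increasing total rather than a pre-computed rate; a
// scraper such as Prometheus can derive the rate over any window at
// read time.
static TASKS_POLLED_TOTAL: AtomicU64 = AtomicU64::new(0);

fn on_task_poll() {
    TASKS_POLLED_TOTAL.fetch_add(1, Ordering::Relaxed);
}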

najamelan commented 5 years ago

This is very interesting. Currently it's not straightforward to log async / multithreaded code. How do you imagine this working? For a stdout subscriber, for example, would there be a tradeoff between delay and readability? If you log only once every second, you can de-interleave log statements from different tasks to make the output readable.

Note that terminals as well as files could also use a different strategy, in that they can be rewritten: e.g. log instantly, but as new events come in, rewrite the output to change the order on screen / in the file?

I'm really curious.

carllerche commented 5 years ago

For stdout, I would log immediately w/ a trace identifier. This would allow filtering by trace via grep. There is also a GSOC project to build an interactive terminal app that allows richer filtering / rendering: https://github.com/tokio-rs/gsoc/issues/1
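
A minimal sketch of that approach (all names hypothetical):

// Log each event immediately, prefixed with its trace identifier, so
// that `grep 'trace=42'` pulls out a single trace.
struct StdoutSubscriber;

impl StdoutSubscriber {
    fn record(&self, trace_id: u64, message: &str) {
        println!("trace={} {}", trace_id, message);
    }
}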