tokio-rs / tracing

Application level tracing for Rust.
https://tracing.rs
MIT License
5.52k stars 725 forks source link

Concurrency issues in `tracing-appender` #3131

Open gsson opened 2 weeks ago

gsson commented 2 weeks ago

Bug Report

Version

tracing v0.1.40 tracing-appender v0.2.3 tracing-subscriber v0.3.18

Platform

Any multi threaded platform/runtime

Crates

tracing-appender v0.2.3

Description

I recently had to migrate from a MakeWriter that was simply a fn() -> std::io::StdoutLock<'static> into something that could write to a file (and perform file rolling), but ran into concurrency issues when using the provided implementations in tracing-appender.

Both the tracing-subscriber crate and my own tracing-logstash crate assumes that all writes are atomic wrt a single tracing event. While the docs are a bit vague, the MakeWriter docs strongly hint at it.

Each log event is commonly written using multiple calls to write() and friends, sometimes delegating it to another library like serde_json to write the event as a JSON object.

For example:

Interleaved writes

RollingFileAppender does not write everything written to a single RollingFileWriter atomically, which can cause partial log lines to interleave (so "event one" and "event two" could end up being written as "event event two one").

RollingFileAppender::make_writer() takes a read() lock when creating the RollingFileWriter here to synchronize rolling of the file, but that does not synchronize the actual writes.

If this is intentional behaviour as an optimisation for single threaded use cases, it is not documented, and is a bit of a foot-gun.

Mid-event file rolling

Next up I considered using Mutex<RollingFileAppender> to work around this, but the impl<'a, W> MakeWriter<'a> for Mutex<W> requires W to be Write, not MakeWriter meaning it will use RollingFileAppender as Write rather than something that creates new RollingFileWriters.

Since the Write implementation on RollingFileAppender checks for file rolling and purging on each call to write() rather than when MakeWriter is called, it will lead to file rotation to occur between two writes rather than between two tracing events.

Less importantly, in the case of many small writes (such as when delegating serde_json) the extra locks and checks is causing some overhead.

NonBlocking has the same issue, where it sends each call to write() as a separate Msg::Line instead of the writes from a single MakeWriter, which again can cause interleaving.

I actually find that using/implementing Write for the appenders themselves (rather than the MakeWriter produced writers) is rather dangerous if multi threaded environments are considered a supported use case.

Workarounds

As far as I can tell there is no workaround for the mid-event file rolling issue, but both Mutex<NonBlocking> + RollingFileAppender and Mutex<RollingFileAppender> seems like they both would prevent write interleaving

gsson commented 2 weeks ago

I made a simple NonBlocking implementation here that

  1. Only sends a message when the NonBlockingWriter is dropped.
  2. Uses a MakeWriter instead of a std::io::Write on the Worker side.
  3. Each call to make_writer() on the NonBlocking side will lead to a call to make_writer() on the Worker side.

This should hopefully lead to atomic writing of events (including when files are rolled, when used with RollingFileAppender) without the use of a Mutex<NonBlocking>.

The NonBlockingBuilder is mostly compatible with the tracing-appender one.

It's only been lightly tested, but figured it could be useful to some. Suggestions for improvements are welcome.

It doesn't include the WorkerGuard of the original since it's something I don't need, and also allows me to use the standard library queues.