metrics-rs / metrics

A metrics ecosystem for Rust.
MIT License
1.1k stars 152 forks source link

`set_global_recorder` should apply `Sync` bound #511

Closed Nekrolm closed 3 days ago

Nekrolm commented 2 weeks ago

Hi

This is not an issue for any sane implementations, but currently the lack of Sync restrictions makes the set_global_recorder unsound: it's possible to create data race in safe code

example:

use metrics::Recorder;
use std::cell::Cell;

struct NonSyncRecorder {
    call_counter: Cell<usize>
}

impl NonSyncRecorder {
    fn inc(&self) {
        let val = self.call_counter.take();
        self.call_counter.set(val + 1);
    }
}

impl Recorder for NonSyncRecorder {
    fn describe_counter(&self, _: metrics::KeyName, _: Option<metrics::Unit>, _: metrics::SharedString) {
        self.inc();
    }

    fn describe_gauge(&self, _: metrics::KeyName, _: Option<metrics::Unit>, _: metrics::SharedString) {
        self.inc();
    }

    fn describe_histogram(&self, _: metrics::KeyName, _: Option<metrics::Unit>, _: metrics::SharedString) {
        self.inc();
    }

    fn register_counter(&self, _: &metrics::Key, _: &metrics::Metadata<'_>) -> metrics::Counter {
        todo!()
    }

    fn register_gauge(&self, _: &metrics::Key, _: &metrics::Metadata<'_>) -> metrics::Gauge {
        todo!()
    }

    fn register_histogram(&self, _: &metrics::Key, _: &metrics::Metadata<'_>) -> metrics::Histogram {
        todo!()
    }
}

fn main() {
    metrics::set_global_recorder(NonSyncRecorder { call_counter: Default::default() }).expect("first initilization");
    std::thread::scope(|s| {
        for _ in 0..100 {
            s.spawn(|| {
                for _ in 0..10000 {
                    metrics::describe_counter!("counter", "dummy");
                }
            });
        }
    })
}

Miri output (rustc 1.80):

 cargo +nightly miri run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner target/miri/x86_64-unknown-linux-gnu/debug/metrics_not_sync_bug`
error: Undefined Behavior: Data race detected between (1) non-atomic write on thread `unnamed-1` and (2) retag write of type `usize` on thread `unnamed-2` at alloc1158. (2) just happened here
   --> /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:497:31
    |
497 |         mem::replace(unsafe { &mut *self.value.get() }, val)
    |                               ^^^^^^^^^^^^^^^^^^^^^^ Data race detected between (1) non-atomic write on thread `unnamed-1` and (2) retag write of type `usize` on thread `unnamed-2` at alloc1158. (2) just happened here
    |
help: and (1) occurred earlier here
   --> src/main.rs:11:9
    |
11  |         self.call_counter.set(val + 1);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = help: retags occur on all (re)borrows and as well as when references are copied or moved
    = help: retags permit optimizations that insert speculative reads or writes
    = help: therefore from the perspective of data races, a retag has the same implications as a read or write
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE (of the first span) on thread `unnamed-2`:
    = note: inside `std::cell::Cell::<usize>::replace` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:497:31: 497:53
    = note: inside `std::cell::Cell::<usize>::take` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cell.rs:652:9: 652:41
note: inside `NonSyncRecorder::inc`
   --> src/main.rs:10:19
    |
10  |         let val = self.call_counter.take();
    |                   ^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `<NonSyncRecorder as metrics::Recorder>::describe_counter`
   --> src/main.rs:17:9
    |
17  |         self.inc();
    |         ^^^^^^^^^^
note: inside closure
   --> src/main.rs:47:21
    |
47  |                     metrics::describe_counter!("counter", "dummy");
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: inside closure at /home/dmis/.cargo/registry/src/index.crates.io-6f17d22bba15001f/metrics-0.23.0/src/recorder/mod.rs:136:13: 136:31
    = note: inside `std::thread::LocalKey::<std::cell::Cell<std::option::Option<std::ptr::NonNull<dyn metrics::Recorder>>>>::try_with::<{closure@metrics::with_recorder<(), {closure@/home/dmis/.cargo/registry/src/index.crates.io-6f17d22bba15001f/metrics-0.23.0/src/macros.rs:269:31: 269:41}>::{closure#0}}, ()>` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:283:12: 283:27
    = note: inside `std::thread::LocalKey::<std::cell::Cell<std::option::Option<std::ptr::NonNull<dyn metrics::Recorder>>>>::with::<{closure@metrics::with_recorder<(), {closure@/home/dmis/.cargo/registry/src/index.crates.io-6f17d22bba15001f/metrics-0.23.0/src/macros.rs:269:31: 269:41}>::{closure#0}}, ()>` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:260:9: 260:25
    = note: inside `metrics::with_recorder::<(), {closure@/home/dmis/.cargo/registry/src/index.crates.io-6f17d22bba15001f/metrics-0.23.0/src/macros.rs:269:31: 269:41}>` at /home/dmis/.cargo/registry/src/index.crates.io-6f17d22bba15001f/metrics-0.23.0/src/recorder/mod.rs:127:5: 140:7
note: inside closure
   --> src/main.rs:47:21
    |
47  |                     metrics::describe_counter!("counter", "dummy");
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: this error originates in the macro `$crate::describe` which comes from the expansion of the macro `metrics::describe_counter` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error
Nekrolm commented 2 weeks ago

This issue was mentioned before https://github.com/metrics-rs/metrics/issues/171

tobz commented 2 weeks ago

@Nekrolm Hmm, yeah, I agree. Not sure what my thinking was back then, but it does seem like we might want to require that Recorder is Sync. As you put it, this isn't a problem for any sane implementation because they'll be using an approach that is likely already Sync compatible.