TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow on Rust.
MIT License
2.54k stars 183 forks source link

consolidate: fix a Miri error #394

Closed teskje closed 1 year ago

teskje commented 1 year ago

Prior to this change, Miri would produce the following error when executed on the code of consolidate_updates_slice:

error: Undefined Behavior: attempting a read access using <3403> at alloc1431[0x8], but that tag does not exist in the borrow stack for this location
  --> src/main.rs:12:16
   |
12 |             if (*ptr1).0 == (*ptr2).0 && (*ptr1).1 == (*ptr2).1 {
   |                ^^^^^^^^^
   |                |
   |                attempting a read access using <3403> at alloc1431[0x8], but that tag does not exist in the borrow stack for this location
   |                this error occurs as part of an access at alloc1431[0x8..0xc]
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <3403> was created by a SharedReadWrite retag at offsets [0x0..0x48]
  --> src/main.rs:9:24
   |
9  |             let ptr1 = slice.as_mut_ptr().offset(offset as isize);
   |                        ^^^^^^^^^^^^^^^^^^
help: <3403> was later invalidated at offsets [0x0..0x48] by a Unique function-entry retag inside this call
  --> src/main.rs:10:24
   |
10 |             let ptr2 = slice.as_mut_ptr().offset(index as isize);
   |                        ^^^^^^^^^^^^^^^^^^
   = note: BACKTRACE (of the first span):
   = note: inside `consolidate_updates_slice` at src/main.rs:12:16: 12:25
note: inside `main`
  --> src/main.rs:34:5
   |
34 |     consolidate_updates_slice(&mut v);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The warning is fixed by making sure that slice.get_mut_ptr() is only invoked a single time. It seems like calling get_mut_ptr on a slice invalidates all existing pointers to the slice. My guess is that this is because get_mut_ptr takes a &mut self and could therefore in principle swap/replace/truncate the slice buffer, which could make existing pointers dangle. get_mut_ptr doesn't do that but Rust cannot know based on the method signature only.

Fixes https://github.com/MaterializeInc/materialize/issues/19958.

frankmcsherry commented 1 year ago

This looks good to me. Would you be willing to bundle in the same changes to consolidate_slice which I believe has the same potential defect?

def- commented 1 year ago

The same failure also happens when running this repo's tests with cargo miri test --no-fail-fast --workspace: (confirmed that your change fixes it)

test algorithms::identifiers::tests::are_unique ... error: Undefined Behavior: trying to retag from <2662251> for SharedReadOnly permission at alloc996064[0x10], but that tag does not exist in the borrow stack for this location
   --> src/consolidation.rs:124:16
    |
124 |             if (*ptr1).0 == (*ptr2).0 && (*ptr1).1 == (*ptr2).1 {
    |                ^^^^^^^^^
    |                |
    |                trying to retag from <2662251> for SharedReadOnly permission at alloc996064[0x10], but that tag does not exist in the borrow stack for this location
    |                this error occurs as part of retag at alloc996064[0x10..0x1c]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <2662251> was created by a SharedReadWrite retag at offsets [0x0..0x78]
   --> src/consolidation.rs:121:24

As Frank noted, this also happens in consolidate_slice:

test algorithms::identifiers::tests::are_unique ... error: Undefined Behavior: trying to retag from <3898547> for SharedReadOnly permission at alloc1415877[0x0], but that tag does not exist in the borrow stack for this location
   --> src/consolidation.rs:61:16
    |
61  |             if (*ptr1).0 == (*ptr2).0 {
    |                ^^^^^^^^^
    |                |
    |                trying to retag from <3898547> for SharedReadOnly permission at alloc1415877[0x0], but that tag does not exist in the borrow stack for this location
    |                this error occurs as part of retag at alloc1415877[0x0..0x18]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <3898547> was created by a SharedReadWrite retag at offsets [0x0..0x60]
   --> src/consolidation.rs:58:24
    |
58  |             let ptr1 = slice.as_mut_ptr().offset(offset as isize);
    |                        ^^^^^^^^^^^^^^^^^^
help: <3898547> was later invalidated at offsets [0x0..0x60] by a Unique function-entry retag inside this call
   --> src/consolidation.rs:59:24
    |
59  |             let ptr2 = slice.as_mut_ptr().offset(index as isize);
    |                        ^^^^^^^^^^^^^^^^^^
    = note: BACKTRACE (of the first span):
    = note: inside `consolidation::consolidate_slice::<(&(i32, i32), timely::order::Product<u64, u64>), isize>` at src/consolidation.rs:61:16: 61:25
note: inside `consolidation::consolidate_from::<(&(i32, i32), timely::order::Product<u64, u64>), isize>`
   --> src/consolidation.rs:30:18
    |
30  |     let length = consolidate_slice(&mut vec[offset..]);
    |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `consolidation::consolidate::<(&(i32, i32), timely::order::Product<u64, u64>), isize>`
   --> src/consolidation.rs:21:5
    |
21  |     consolidate_from(vec, 0);
    |     ^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `operators::HistoryReplay::<'_, '_, (i32, i32), timely::order::Product<u64, u64>, isize>::advance_buffer_by`
   --> src/operators/mod.rs:197:9
    |
197 |         crate::consolidation::consolidate(&mut self.replay.buffer);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `<operators::reduce::history_replay::HistoryReplayer<'_, (i32, i32), (i32, i32), timely::order::Product<u64, u64>, isize, isize> as operators::reduce::PerKeyCompute<'_, (i32, i32), (i32, i32), timely::order::Product<u64, u64>, isize, isize>>::compute::<i32, trace::cursor::cursor_list::CursorList<trace::rc_blanket_impls::RcBatchCursor<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, trace::cursor::cursor_list::CursorList<trace::rc_blanket_impls::RcBatchCursor<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, trace::cursor::cursor_list::CursorList<trace::rc_blanket_impls::RcBatchCursor<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, [closure@src/operators/reduce.rs:282:44: 282:77]>`
   --> src/operators/reduce.rs:851:25
    |
851 |                         batch_replay.advance_buffer_by(&meet);
    |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside closure
   --> src/operators/reduce.rs:526:49
    |
526 |   ...                   let _counters = thinker.compute(
    |  _______________________________________^
527 | | ...                       &key,
528 | | ...                       (&mut source_cursor, source_storage),
529 | | ...                       (&mut output_cursor, output_storage),
...   |
535 | | ...                       &mut new_interesting_times,
536 | | ...                   );
    | |_______________________^
    = note: inside closure at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/dataflow/operators/generic/operator.rs:351:17: 351:61
    = note: inside closure at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/dataflow/operators/generic/builder_rc.rs:133:31: 133:46
    = note: inside closure at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/dataflow/operators/generic/builder_rc.rs:172:26: 172:51
    = note: inside `<timely::dataflow::operators::generic::builder_raw::OperatorCore<timely::order::Product<u64, u64>, [closure@timely::dataflow::operators::generic::builder_rc::OperatorBuilder<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>>::build_reschedule<[closure@timely::dataflow::operators::generic::builder_rc::OperatorBuilder<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>>::build<[closure@<timely::dataflow::StreamCore<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>> as timely::dataflow::operators::Operator<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>>>::unary_frontier<std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, [closure@src/operators/reduce.rs:346:56: 346:89], [closure@src/operators/reduce.rs:396:17: 396:37], timely::dataflow::channels::pact::Pipeline>::{closure#0}], [closure@<timely::dataflow::StreamCore<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>> as timely::dataflow::operators::Operator<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>>>::unary_frontier<std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, [closure@src/operators/reduce.rs:346:56: 346:89], [closure@src/operators/reduce.rs:396:17: 396:37], timely::dataflow::channels::pact::Pipeline>::{closure#0}::{closure#0}]>::{closure#0}], [closure@timely::dataflow::operators::generic::builder_rc::OperatorBuilder<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>>::build<[closure@<timely::dataflow::StreamCore<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>> as timely::dataflow::operators::Operator<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>>>::unary_frontier<std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, [closure@src/operators/reduce.rs:346:56: 346:89], [closure@src/operators/reduce.rs:396:17: 396:37], timely::dataflow::channels::pact::Pipeline>::{closure#0}], [closure@<timely::dataflow::StreamCore<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>> as timely::dataflow::operators::Operator<timely::dataflow::scopes::Child<'_, timely::dataflow::scopes::Child<'_, timely::worker::Worker<timely::communication::allocator::Thread>, u64>, timely::order::Product<u64, u64>>, std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>>>::unary_frontier<std::vec::Vec<std::rc::Rc<trace::implementations::ord::OrdValBatch<i32, (i32, i32), timely::order::Product<u64, u64>, isize>>>, [closure@src/operators/reduce.rs:346:56: 346:89], [closure@src/operators/reduce.rs:396:17: 396:37], timely::dataflow::channels::pact::Pipeline>::{closure#0}::{closure#0}]>::{closure#0}::{closure#0}]>::{closure#0}]> as timely::scheduling::Schedule>::schedule` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/dataflow/operators/generic/builder_raw.rs:204:9: 204:38
    = note: inside `timely::progress::subgraph::PerOperatorState::<timely::order::Product<u64, u64>>::schedule` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/progress/subgraph.rs:699:30: 699:49
    = note: inside `timely::progress::Subgraph::<u64, timely::order::Product<u64, u64>>::activate_child` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/progress/subgraph.rs:346:26: 346:42
    = note: inside `<timely::progress::Subgraph<u64, timely::order::Product<u64, u64>> as timely::scheduling::Schedule>::schedule` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/progress/subgraph.rs:312:17: 312:43
    = note: inside `timely::progress::subgraph::PerOperatorState::<u64>::schedule` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/progress/subgraph.rs:699:30: 699:49
    = note: inside `timely::progress::Subgraph::<(), u64>::activate_child` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/progress/subgraph.rs:346:26: 346:42
    = note: inside `<timely::progress::Subgraph<(), u64> as timely::scheduling::Schedule>::schedule` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/progress/subgraph.rs:312:17: 312:43
    = note: inside closure at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/worker.rs:754:57: 754:70
    = note: inside `std::option::Option::<&mut std::boxed::Box<dyn timely::scheduling::Schedule>>::map::<bool, [closure@timely::worker::Wrapper::step::{closure#0}]>` at /home/deen/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:1075:29: 1075:33
    = note: inside `timely::worker::Wrapper::step` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/worker.rs:754:26: 754:71
    = note: inside `timely::worker::Worker::<timely::communication::allocator::Thread>::step_or_park` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/worker.rs:396:38: 396:60
    = note: inside `timely::execute_directly::<(), [closure@timely::example<(), [closure@src/algorithms/identifiers.rs:106:27: 106:34]>::{closure#0}]>` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/execute.rs:159:9: 159:34
    = note: inside `timely::example::<(), [closure@src/algorithms/identifiers.rs:106:27: 106:34]>` at /home/deen/.cargo/git/checkouts/timely-dataflow-4c0cc365061cd263/b990fab/timely/src/execute.rs:126:5: 126:84
note: inside `algorithms::identifiers::tests::are_unique`
   --> src/algorithms/identifiers.rs:106:9
    |
106 | /         ::timely::example(|scope| {
107 | |
108 | |             let input = scope.new_collection_from(1 .. 4).1;
109 | |
...   |
140 | |                 .assert_empty();
141 | |         });
    | |__________^
note: inside closure
   --> src/algorithms/identifiers.rs:95:21
    |
94  |     #[test]
    |     ------- in this procedural macro expansion
95  |     fn are_unique() {
    |                     ^
teskje commented 1 year ago

Would you be willing to bundle in the same changes to consolidate_slice which I believe has the same potential defect?

Of course! It's fixed now.

antiguru commented 1 year ago

Alternatively, we should explore what the performance cost of not using unsafe in this context would be.