TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow on Rust.
MIT License

Getting Started Guide for Newcomers Doesn't Work #483

Closed jon-whit closed 2 months ago

jon-whit commented 2 months ago

I'm following along with the "Getting Started" guide and am not able to get the first program to work.

Steps to Reproduce:

  1. Install the latest release of Rust.

    brew install rust

  2. Create a new Rust project.

    cargo new differential-dataflow-example

  3. Copy/paste the source snippet from "Step 1: Write a Program" into src/main.rs.

  4. Run the program.

    RUST_BACKTRACE=1 cargo run -- 10

I get the error:

thread 'worker thread 0' panicked at library/core/src/panicking.rs:156:5:
unsafe precondition(s) violated: ptr::swap_nonoverlapping requires that both pointer arguments are aligned and non-null and the specified memory ranges do not overlap
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread caused non-unwinding panic. aborting.
[1]    64515 abort      cargo run -- 10
➜  differential-dataflow-example git:(main) ✗ vim src/main.rs
➜  differential-dataflow-example git:(main) ✗ RUST_BACKTRACE=1 cargo run -- 10
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/differential-dataflow-example 10`
thread 'worker thread 0' panicked at library/core/src/panicking.rs:156:5:
unsafe precondition(s) violated: ptr::swap_nonoverlapping requires that both pointer arguments are aligned and non-null and the specified memory ranges do not overlap
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_nounwind_fmt
   2: core::panicking::panic_nounwind
   3: core::ptr::swap_nonoverlapping::precondition_check
             at /private/tmp/rust-20240503-6621-3k19qi/rustc-1.78.0-src/library/core/src/intrinsics.rs:2799:21
   4: core::ptr::swap_nonoverlapping
             at /private/tmp/rust-20240503-6621-3k19qi/rustc-1.78.0-src/library/core/src/ptr/mod.rs:1022:5
   5: core::mem::swap
             at /private/tmp/rust-20240503-6621-3k19qi/rustc-1.78.0-src/library/core/src/mem/mod.rs:742:29
   6: differential_dataflow::consolidation::consolidate_updates_slice
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/consolidation.rs:124:21
   7: differential_dataflow::consolidation::consolidate_updates_from
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/consolidation.rs:86:18
   8: differential_dataflow::consolidation::consolidate_updates
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/consolidation.rs:77:5
   9: differential_dataflow::trace::implementations::merge_batcher::MergeSorter<D,T,R>::push
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/trace/implementations/merge_batcher.rs:209:13
  10: <differential_dataflow::trace::implementations::merge_batcher::MergeBatcher<K,V,T,R,B> as differential_dataflow::trace::Batcher<K,V,T,R,B>>::push_batch
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/trace/implementations/merge_batcher.rs:37:9
  11: <differential_dataflow::trace::rc_blanket_impls::RcBatcher<K,V,T,R,B> as differential_dataflow::trace::Batcher<K,V,T,R,alloc::rc::Rc<B>>>::push_batch
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/trace/mod.rs:362:63
  12: <differential_dataflow::collection::Collection<G,(K,V),R> as differential_dataflow::operators::arrange::arrangement::Arrange<G,K,V,R>>::arrange_core::{{closure}}::{{closure}}::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/operators/arrange/arrangement.rs:570:25
  13: timely::dataflow::operators::generic::handles::InputHandle<T,D,P>::for_each
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/dataflow/operators/generic/handles.rs:92:13
  14: timely::dataflow::operators::generic::handles::FrontieredInputHandle<T,D,P>::for_each
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/dataflow/operators/generic/handles.rs:136:9
  15: <differential_dataflow::collection::Collection<G,(K,V),R> as differential_dataflow::operators::arrange::arrangement::Arrange<G,K,V,R>>::arrange_core::{{closure}}::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/differential-dataflow-0.11.0/src/operators/arrange/arrangement.rs:567:21
  16: <timely::dataflow::stream::Stream<G,D1> as timely::dataflow::operators::generic::operator::Operator<G,D1>>::unary_frontier::{{closure}}::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/dataflow/operators/generic/operator.rs:352:17
  17: timely::dataflow::operators::generic::builder_rc::OperatorBuilder<G>::build::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/dataflow/operators/generic/builder_rc.rs:132:13
  18: <timely::dataflow::operators::generic::builder_raw::OperatorCore<T,L> as timely::scheduling::Schedule>::schedule
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/dataflow/operators/generic/builder_raw.rs:203:9
  19: timely::progress::subgraph::PerOperatorState<T>::schedule
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/progress/subgraph.rs:646:30
  20: timely::progress::subgraph::Subgraph<TOuter,TInner>::activate_child
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/progress/subgraph.rs:329:26
  21: <timely::progress::subgraph::Subgraph<TOuter,TInner> as timely::scheduling::Schedule>::schedule
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/progress/subgraph.rs:295:17
  22: timely::worker::Wrapper::step::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/worker.rs:496:57
  23: core::option::Option<T>::map
             at /private/tmp/rust-20240503-6621-3k19qi/rustc-1.78.0-src/library/core/src/option.rs:1073:29
  24: timely::worker::Wrapper::step
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/worker.rs:496:26
  25: timely::worker::Worker<A>::step_or_park
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/worker.rs:234:38
  26: timely::execute::execute::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely-0.11.1/src/execute.rs:206:15
  27: timely_communication::initialize::initialize_from::{{closure}}
             at /Users/jonwhit/.cargo/registry/src/index.crates.io-6f17d22bba15001f/timely_communication-0.11.1/src/initialize.rs:269:33
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread caused non-unwinding panic. aborting.

It seems the timely-dataflow library violates some memory-safety preconditions that newer versions of Rust enforce but that the versions available when the guide was written did not 🤷? Which version of Rust should these examples run on, and do you have any recommendations for how to resolve this?

I'm generally quite new to Rust, so that's part of this experience as well, and it may well be an issue here 😄

frankmcsherry commented 2 months ago

Yeah, it's hard to know. Rust doesn't document (afaik) what the preconditions of unsafe methods are, or .. they change the documentation. The good news is that as of two days ago (https://github.com/TimelyDataflow/differential-dataflow/pull/481) the unsafe code in consolidate_updates was removed, so you shouldn't see this.

Though! Looking at the stack trace, it seems like you might be using a very old version (0.11) rather than only an old version (0.12). In 0.12, the code uses std::ptr::swap, rather than the std::mem::swap that shows up on your stack trace, and ptr::swap is documented as allowing overlap.
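To illustrate the distinction drawn above, here is a minimal sketch of swapping two slice elements through raw pointers with std::ptr::swap, including the case where both pointers refer to the same element. The helper name `swap_at` is hypothetical and is not code from differential-dataflow; it only shows that ptr::swap tolerates overlapping arguments, whereas the backtrace in the report shows core::mem::swap reaching ptr::swap_nonoverlapping, whose precondition check rejects overlap (as far as I know, Rust 1.78, the version in the backtrace, began asserting such preconditions in debug builds).

```rust
use std::ptr;

/// Swaps `values[i]` and `values[j]` through raw pointers.
/// Hypothetical helper for illustration only; not taken from differential-dataflow.
fn swap_at(values: &mut [i32], i: usize, j: usize) {
    assert!(i < values.len() && j < values.len(), "index out of bounds");
    let base = values.as_mut_ptr();
    // SAFETY: both indices are in bounds, so both pointers are valid and aligned.
    // `ptr::swap` is documented to permit overlapping regions, so `i == j` is fine.
    unsafe {
        ptr::swap(base.add(i), base.add(j));
    }
}

fn main() {
    let mut values = [1, 2, 3];
    swap_at(&mut values, 0, 2); // distinct elements
    swap_at(&mut values, 1, 1); // same element: fully overlapping, still allowed
    assert_eq!(values, [3, 2, 1]);
    println!("{:?}", values);
}
```

In safe code, `values.swap(i, j)` does the same job without any unsafe block, which is usually the simpler choice.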

I'll try out the quick start and see if I can tease out what is pointing you at the very old version of the code, and try and get this fixed. Thank you for the report!

frankmcsherry commented 2 months ago

So, the glitch is that the quick start tells you to put this in your Cargo.toml:

[package]
name = "my_project"
version = "0.1.0"
authors = ["Your Name <your_name@you.ch>"]

[dependencies]
timely = "0.11.1"
differential-dataflow = "0.11.0"

and .. that's where you are picking up the several-years-old version of the code. :D If you change both to 0.12.0, it should probably all work. Better yet (imo), pointing at the GitHub repo might be the way to go (0.12 is itself several years old, though crates.io doesn't have anything newer). I'll ponder today what the "right" thing to do is, and get on that.
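As a rough sketch of the two options mentioned above, the [dependencies] section could look like the following. The 0.12.0 versions come from this thread. The git form is an assumption about how one might point Cargo at the repositories: the two git crates would need to be mutually compatible, and the API on the main branch may differ from the snippet in the guide.

```toml
[dependencies]
# Option 1: the newest releases on crates.io at the time of this thread.
timely = "0.12.0"
differential-dataflow = "0.12.0"

# Option 2 (assumption, use instead of Option 1): track the GitHub repositories.
# timely = { git = "https://github.com/TimelyDataflow/timely-dataflow" }
# differential-dataflow = { git = "https://github.com/TimelyDataflow/differential-dataflow" }
```

After editing Cargo.toml, running `cargo update` followed by `cargo run -- 10` should rebuild against the newly selected versions.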

Thanks again for the report!

frankmcsherry commented 2 months ago

Should be fixed up in the linked PR. I also fixed a few other minor glitches that might have caused you to get stuck.