TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow on Rust.
MIT License
2.51k stars 182 forks source link

Minimize `OwnedItem` bounds #499

Closed frankmcsherry closed 1 month ago

frankmcsherry commented 1 month ago

The PR attempts to minimize opinions on owned items, then references to owned items, and then a few more bounds. Roughly, almost no one cared about the specifics of BatchContainer::OwnedItem, which is great news because it is a source of annoyance. Once no one cares about this, also no one cares that BatchContainer::ReadItem implement IntoOwned, which is also great.

One casualty is option_container.rs, which ... is software architected around using the C::OwnedItem to drive its opinions on things, whereas otherwise idiomatic code has an owned type in mind but also the good graces to name it specifically rather than ask the batch container to be the one to remember. We could probably rewrite it to work, but it was only demo code in the first place, and the Huffman container should keep us honest about GATs.

antiguru commented 1 month ago

We're getting close to a blanket implementation for FlatStack:

index 6a0fe72c..46f08275 100644
--- a/src/trace/implementations/mod.rs
+++ b/src/trace/implementations/mod.rs
@@ -444,6 +444,7 @@ pub mod containers {
     use timely::container::PushInto;

     use std::borrow::ToOwned;
+    use timely::container::flatcontainer::{FlatStack, Push, Region};

     /// A general-purpose container resembling `Vec<T>`.
     pub trait BatchContainer: 'static {
@@ -584,6 +585,34 @@ pub mod containers {
         }
     }

+    impl<R> BatchContainer for FlatStack<R>
+    where
+        for<'a> R: Region + Push<<R as Region>::ReadItem<'a>> + 'static,
+        for<'a, 'b> R::ReadItem<'a>: Copy + Ord + PartialOrd<R::ReadItem<'b>>
+    {
+        type ReadItem<'a> = R::ReadItem<'a>;
+
+        fn copy(&mut self, item: Self::ReadItem<'_>) {
+            self.copy(item);
+        }
+
+        fn with_capacity(size: usize) -> Self {
+            Self::with_capacity(size)
+        }
+
+        fn merge_capacity(cont1: &Self, cont2: &Self) -> Self {
+            Self::merge_capacity(std::iter::once(cont1).chain(std::iter::once(cont2)))
+        }
+
+        fn index(&self, index: usize) -> Self::ReadItem<'_> {
+            self.get(index)
+        }
+
+        fn len(&self) -> usize {
+            self.len()
+        }
+    }
+
     /// A container that accepts slices `[B::Item]`.
     pub struct SliceContainer<B> {
         /// Offsets that bound each contained slice.

But that fails with:

➜  differential-dataflow git:(minimize_owneditem_bounds) ✗ cargo check
    Checking differential-dataflow v0.12.0 (/home/moritz/dev/repos/differential-dataflow)
error[E0283]: type annotations needed: cannot satisfy `<R as timely::container::flatcontainer::Region>::ReadItem<'a>: PartialOrd`
   --> src/trace/implementations/mod.rs:593:29
    |
593 |         type ReadItem<'a> = R::ReadItem<'a>;
    |                             ^^^^^^^^^^^^^^^
    |
note: multiple `impl`s or `where` clauses satisfying `<R as timely::container::flatcontainer::Region>::ReadItem<'a>: PartialOrd` found
   --> src/trace/implementations/mod.rs:591:45
    |
591 |         for<'a, 'b> R::ReadItem<'a>: Copy + Ord + PartialOrd<R::ReadItem<'b>>
    |                                             ^^^   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: required for `<FlatStack<R> as BatchContainer>::ReadItem<'a>` to implement `Ord`
note: required by a bound in `BatchContainer::ReadItem`
   --> src/trace/implementations/mod.rs:452:35
    |
452 |         type ReadItem<'a>: Copy + Ord + for<'b> PartialOrd<Self::ReadItem<'b>>;
    |                                   ^^^ required by this bound in `BatchContainer::ReadItem`

For more information about this error, try `rustc --explain E0283`.
error: could not compile `differential-dataflow` (lib) due to 1 previous error

I'm not sure