Open dgkf opened 1 year ago
And just to give a short minimal (partial) implementation, this is effectively the interface that I'm trying to implement. The goal is that vectors can be iterated over, providing references to their elements.
use std::ops::Add;
pub trait Atomic: Add + std::fmt::Debug + Clone + Sized {}
impl Atomic for i32 {}
pub enum RVec<'i, T: Atomic + 'i> {
Iter(Box<dyn Iterator<Item = &'i T>>)
}
impl Add<RVec<'r, RT>> for RVec<'l, LT> {
type Output;
fn add(self, rhs: RVec<'r, RT>) -> Self::Output {
}
}
impl<'i, T: Atomic, I> From<I> for RVec<'i, T> where I: IntoIterator<Item = T> {
fn from(value: I) -> Self {
}
}
fn main() {
let x = RVec::from(vec![1, 2, 3]);
let y = RVec::from(vec![4, 5, 6]);
let z = x + y;
}
Still hitting my head against this problem, so I've been going back to the drawing board and researching any patterns that might help steer me in the right direction. I turned up one that seemed immediately relevant. As I discover more I'll add them to the list.
I think I've settled on a good way to represent this, but I'm struggling to get it all to work. So far my attempt boils down to trying to solve this lifetimes problem (playground):
One solution is to create a new trait for types that can be used to repeatedly make an iterator. Here's a quick and (very) dirty mockup based on your example. I ended up removing the Rc
s and RefCell
s, not sure if you actually need interior mutability and reference counting.
Do you think something like this would work?
Thanks for taking a look @JoJoJet! Although I don't think the link you provided is capturing your changes. Can you generate a new permalink using the share button at the top of your playground page?
Oops 😆
Here's the updated link https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8e3cf309fae9f34ae8e47f91b95541f9.
Awesome! Thanks for getting something working. It will take me more time to really test this implementation and make sure it will meet all the design goals I’m going for - hopefully tonight I can poke away at it.
Even without digging through the code in detail, it’s already refreshing to see a different framing of the problem that seems to model the behaviors better than what I had.
I think there's still something I'm looking to achieve that isn't covered by this solution, but I think that's on me for over-simplifying my little example.
The idea behind using the closures in the playground attempt was to lazily produce iterators over the source values. What's more, iterators can be lazily created from iterators. What we end up with is a lazy evaluation scheme.
The interface I'm ultimately looking for looks something like this in pseudocode for vectorized addition:
x = [1, 2, 3] # materialized vector inputs
y = [5, 6, 7]
z = &x + &y # z is now an iterator of .map(|(xi, yi)| xi + yi)
w = &z + &y # w is now an iterator of .map(|(zi, yi)| zi + yi)
w.collect() # w is materialized into a vector
# [11, 14, 17]
This is where the lifetimes got a bit complicated, because w
requires that x
and y
outlive it, because their values haven't yet been iterated over until we materialize w
. Doing it from within an Add
implementation
I still think you're onto something with structuring the data a bit more instead of just using a mess of dyn
types.
Ah I see, I suspected it was more complicated than it appeared. I might play with this further to see if I can come up with something. Having a concrete example like that to emulate will be useful.
Yeah, sorry for the over-simplified example. I was trying to minimally reproduce the error that I needed to solve instead of putting the target interface first. You were totally right to restructure the problem in a more sensible way, I just mislead you by asking the wrong question.
I've still taken some inspiration from your approach and tried to be more mindful about structuring things a bit more. Even so, my investigation at every turn runs into stack overflow posts that tell me this isn't possible, so I think I must be thinking about the problem wrong.
My current thinking is that it might be better to "invert" the problem as described in this post, but then I think I need to have a method per -arity of the operations. As long as everything boils down to a unary or binary operator, maybe this is acceptable.
Since this repo seems to have gathered a bit of visibility, I just want to file an issue for what I've been experimenting with recently.
One of the impressive enhancements that R has received is the idea of internal "altreps" for vectors. This avoids allocating huge vectors for things like ranges, which can be completely described by a start, end and increment (
lobstr::obj_size(1:2) == lobstr::obj_size(1:1e12)
).Internally within rust, I'd like to have a few altreps:
Vector
:IntoIterator
:seq
were built as primitives.Subset
:x[[1]][[1]][[1]] <- 3
) to avoid excessive allocations as each layer of indexing reallocates a new vector.To sort of map out an ideal workflow in my eyes, this is how I'd like to see the evaluation of vectors:
I've been writing a series of small experiments to try to build this in a way that covers these use cases. The
Vector
andSubset
use cases are well defined, but I'm finding theIntoIterator
case to be a bit challenging.I think this is an important internal concept to nail down early, so I'm taking my time to do lots of experiments.