Amanieu / intrusive-rs

Intrusive collections for Rust
Apache License 2.0
412 stars 48 forks source link

Support for Generic Reference Types #46

Closed amari closed 4 years ago

amari commented 4 years ago

All changes currently live in the custom_links module.

At a glance:

TODO:

Solves #27

Amanieu commented 4 years ago

Is it really necessary to expose NodeRef as a configurable type? Can't we just use a raw pointer instead? I would prefer keeping NodeRef as an internal detail.

amari commented 4 years ago

It's only necessary if a Link implementation cannot use raw pointers, which only happens when trying to minimize the size of Link types.

For example, if all objects are backed by an arena/slab, and the cost of converting NodeRef into *const A::Link is negligible for the application (e.g. shift and add), a u8 or u16 can be used instead, saving 18 to 21 bytes (75% to 87%) per RBTreeLink.

I already planned to move the methods marked #[doc(hidden)] out of NodeRef, would that be satisfactory?

Amanieu commented 4 years ago

How about something like this:

trait Link {
    type LinkPtr;
    fn next(&self) -> LinkPtr;
    fn set_next(&self, ptr: LinkPtr);
}

The Adapter can take care of converting between LinkPtr and *const Link. The default RawLink just has type LinkPtr = *const Link.

The NodeRef (previously called NodePtr) should stay a private type.

amari commented 4 years ago

Shouldn't set_next be unsafe? As per your above comments, it should be.

trait Link {
    type LinkPtr;
    fn next(&self) -> Self::LinkPtr;
    unsafe set_next(&self, next: Self::LinkPtr);
}

Also, struct NodePtr was refactored into trait NodeRef and struct RawLinkRef. Did you want me to revert this?

Amanieu commented 4 years ago

Actually, I'd like to take a few hours to study what the best design for this is. I will get back to you later today.

Amanieu commented 4 years ago

OK, so here's the design I came up with:

// adapter.rs

pub unsafe trait Adapter {
    /// Collection-specific link type which allows an object to be inserted in
    /// an intrusive collection.
    type LinkOps: LinkOps;

    /// Object type which is inserted in an intrusive collection.
    type Value: ?Sized;

    /// Pointer type which owns an instance of a value.
    type Pointer: IntrusivePointer<Self::Value>;

    /// Gets a reference to an object from a reference to a link in that object.
    unsafe fn get_value(&self, link: <Self::LinkOps as LinkOps>::LinkPtr) -> *const Self::Value;

    /// Gets a reference to the link for the given object.
    unsafe fn get_link(&self, value: *const Self::Value) -> <Self::LinkOps as LinkOps>::LinkPtr;

    fn link_ops(&self) -> &Self::LinkOps;

    fn link_ops_mut(&mut self) -> &mut Self::LinkOps;
}

// Base trait for link operations
pub trait LinkOps {
    type LinkPtr: Copy;
    fn is_linked(&self, ptr: Self::LinkPtr) -> bool;
    unsafe fn mark_unlinked(&mut self, ptr: Self::LinkPtr);
}

// singly_linked_list.rs

// Generic LinkOps interface for SinglyLinkedList
pub unsafe trait SinglyLinkedListOps: super::LinkOps {
    fn next(&self, ptr: Self::LinkPtr) -> Option<Self::LinkPtr>;
    unsafe fn set_next(&mut self, ptr: Self::LinkPtr, next: Option<Self::LinkPtr>);
}

pub struct Link {
    next: Cell<*const Link>,
}

// Default LinkOps implementation for SinglyLinkedList, which uses Link.
pub struct LinkOps;
impl super::LinkOps for LinkOps {
    type LinkPtr = *const Link;
    fn is_linked(&self, ptr: Self::LinkPtr) -> bool { ... }
    unsafe fn mark_unlinked(&mut self, ptr: Self::LinkPtr) { ... }
}
impl SinglyLinkedListOps for LinkOps {
    fn next(&self, ptr: Self::LinkPtr) -> Option<Self::LinkPtr> { ... }
    unsafe fn set_next(&mut self, ptr: Self::LinkPtr, next: Option<Self::LinkPtr>) { ... }
}

This design is heavily inspired by Boost.Intrusive's support for custom value traits.

Making LinkOps a separate trait from Adapter has several advantages:

amari commented 4 years ago

Then is NodePtr even necessary?

pub struct Link {
    // Why not `Cell<*const Self>`?
    next: Cell<NodePtr>,
}

If its non-accessor methods are refactored into generic free functions over LinkOps, and LinkOps implements the accessors, what should NodePtr contain?

Amanieu commented 4 years ago

You are right, NodePtr isn't needed any more. Some of the larger functions (particularly in RBTree) can be replaced with free functions, others can just be inlined directly.

By the way, I've made some minor changes to the design above:

Amanieu commented 4 years ago

I guess we could move the from_raw and into_raw of IntrusivePointer into Adapter, which would allow stateful conversions. Pointer is supposed to represent and owned value which is inserted into or taken out of a collection.

So this leaves us with 3 types:

amari commented 4 years ago

Also LinkOps needs a ptr_eq method because RBTree and LinkedList need pointer equality.

pub trait LinkOps {
    type LinkPtr: Copy;
    //...
    unsafe fn ptr_eq(&self, this: Self::LinkPtr, other: Self::LinkPtr) -> bool;
}

This follows the pattern of Rc and Arc, where the definition of pointer equality is independent of whether or not LinkPtr implements Deref and Deref::Target implements Eq.

Amanieu commented 4 years ago

Would it be enough to just have LinkPtr: Copy + Eq? Or can you think of a situation where this wouldn't be enough?

amari commented 4 years ago

I fixed my above comment. ptr_eq should be preferred to Copy only if we allow types that implement Deref to be LinkPtr, which is why I referenced Rc and Arc.

I believe only a rather fervent "safety purist" would ever attempt to implement LinkOps<LinkPtr = Pin<Arc<MyLink>>> instead of using the default LinkOps and from_raw/into_raw.

Every other case I can think of is covered already:

Amanieu commented 4 years ago

I see. I think that if you really wanted to use Rc or Arc then you could just make a wrapper around them which implements Eq using ptr_eq. I don't expect anyone to actually do this in practice though.

amari commented 4 years ago

Ok.

SinglyLinkedList is done except for intrusive_adapter!.

I think that from_raw and into_raw should not be moved into the Adapter trait, and instead should be members of a PointerOps<Pointer> trait.

Advantages of not including from_raw and into_raw in Adapter include:

// adapter.rs
pub unsafe trait Adapter {
    /// Collection-specific link type which allows an object to be inserted in
    /// an intrusive collection.
    type LinkOps: LinkOps;

    /// Object type which is inserted in an intrusive collection.
    type Value: ?Sized;

    /// Pointer type which owns an instance of a value.
    type Pointer;

    /// Collection-specific pointer conversions which allow an object to
    /// be inserted in an intrusive collection.
    type PointerOps: PointerOps<Self::Pointer, Value = Self::Value>;

    /// Gets a reference to an object from a reference to a link in that object.
    unsafe fn get_value(&self, link: <Self::LinkOps as LinkOps>::LinkPtr) -> *const Self::Value;

    /// Gets a reference to the link for the given object.
    unsafe fn get_link(&self, value: *const Self::Value) -> <Self::LinkOps as LinkOps>::LinkPtr;

    fn link_ops(&self) -> &Self::LinkOps;

    fn link_ops_mut(&mut self) -> &mut Self::LinkOps;

    fn pointer_ops(&self) -> &Self::PointerOps;
}

// pointer_ops.rs

pub unsafe trait PointerOps<Pointer> {
    type Value: ?Sized;

    /// Constructs an owned pointer from a raw pointer.
    /// 
    /// # Safety
    /// The raw pointer must have been previously returned by `into_raw`.
    unsafe fn from_raw(&self, value: *const Self::Value) -> Pointer;

    /// Consumes the owned pointer and returns a raw pointer to the owned object.
    fn into_raw(&self, ptr: Pointer) -> *const Self::Value;
}

/// The default implementation that is used when `PointerOps`
/// is not specified in the invocation of `intrusive_adapter!`.
pub struct IntrusivePointerOps;
unsafe impl<'a, T: ?Sized> PointerOps<&'a T> for IntrusivePointerOps {
    type Value = T;

    #[inline]
    unsafe fn from_raw(&self, raw: *const Self::Value) -> &'a T { ... }

    #[inline]
    fn into_raw(&self, ptr: &'a T) -> *const Self::Value { ... }
}
unsafe impl<T: ?Sized> PointerOps<UnsafeRef<T>> for IntrusivePointerOps {
    type Value = T;

    #[inline]
    unsafe fn from_raw(&self, raw: *const Self::Value) -> UnsafeRef<T> { ... }

    #[inline]
    fn into_raw(&self, ptr: UnsafeRef<T>) -> *const Self::Value { ... }
}
#[cfg(feature = "alloc")]
unsafe impl<T: ?Sized> PointerOps<Box<T>> for IntrusivePointerOps {
    type Value = T;

    #[inline]
    unsafe fn from_raw(&self, raw: *const Self::Value) -> Box<T> { ... }

    #[inline]
    fn into_raw(&self, ptr: Box<T>) -> *const Self::Value { ... }
}
#[cfg(feature = "alloc")]
unsafe impl<T: ?Sized> PointerOps<Rc<T>> for IntrusivePointerOps {
    type Value = T;

    #[inline]
    unsafe fn from_raw(&self, raw: *const Self::Value) -> Rc<T> { ... }

    #[inline]
    fn into_raw(&self, ptr: Rc<T>) -> *const Self::Value { ... }
}
#[cfg(feature = "alloc")]
unsafe impl<T: ?Sized> PointerOps<Arc<T>> for IntrusivePointerOps {
    type Value = T;

    #[inline]
    unsafe fn from_raw(&self, raw: *const Self::Value) -> Arc<T> { ... }

    #[inline]
    fn into_raw(&self, ptr: Arc<T>) -> *const Self::Value { ... }
}
Amanieu commented 4 years ago

I generally like the idea of PointerOps, but would make a few small changes:

// adapter.rs
pub unsafe trait Adapter {
    /// Collection-specific link type which allows an object to be inserted in
    /// an intrusive collection.
    type LinkOps: LinkOps;

    /// Collection-specific pointer conversions which allow an object to
    /// be inserted in an intrusive collection.
    type PointerOps: PointerOps;

    /// Gets a reference to an object from a reference to a link in that object.
    unsafe fn get_value(&self, link: <Self::LinkOps as LinkOps>::LinkPtr) -> *const <Self::PointerOps as PointerOps>::Value;

    /// Gets a reference to the link for the given object.
    unsafe fn get_link(&self, value: *const <Self::PointerOps as PointerOps>::Value) -> <Self::LinkOps as LinkOps>::LinkPtr;

    fn link_ops(&self) -> &Self::LinkOps;

    fn link_ops_mut(&mut self) -> &mut Self::LinkOps;

    fn pointer_ops(&self) -> &Self::PointerOps;
}

// pointer_ops.rs

pub unsafe trait PointerOps {
    type Value: ?Sized;
    type Pointer;

    /// Constructs an owned pointer from a raw pointer.
    /// 
    /// # Safety
    /// The raw pointer must have been previously returned by `into_raw`.
    unsafe fn from_raw(&self, value: *const Self::Value) -> Self::Pointer;

    /// Consumes the owned pointer and returns a raw pointer to the owned object.
    fn into_raw(&self, ptr: Self::Pointer) -> *const Self::Value;
}

/// The default implementation that is used when `PointerOps`
/// is not specified in the invocation of `intrusive_adapter!`.
pub struct DefaultPointerOps<Pointer>;
unsafe impl<'a, T: ?Sized> PointerOps for DefaultPointerOps<&'a T> {
    type Value = T;
    type Pointer = &'a T;

    #[inline]
    unsafe fn from_raw(&self, raw: *const T) -> &'a T { ... }

    #[inline]
    fn into_raw(&self, ptr: &'a T) -> *const T { ... }
}

Note that the current specification of intrusive_adapter! doesn't have to change. We just need to add Ops as an associated type to Link which the macro will automatically pick up.

intrusive_adapter!(MyAdapter = Box<Test>: Test { link: LinkedListLink })

impl LinkedListLink {
    pub type Ops = LinkOps;
}
amari commented 4 years ago

Apparently, associated types are not supported in inherent impls (see rust-lang/rust#8995).

A DefaultLinkOps trait would suffice:

// link_ops.rs

pub trait DefaultLinkOps {
    type Ops: LinkOps + Default;
}

// linked_list.rs

impl DefaultLinkOps for Link {
    type Ops = LinkOps;
}
Amanieu commented 4 years ago

Sounds good. Can you start pushing your changes so that I can start reviewing the implementation?

Amanieu commented 4 years ago

Thanks a lot for your hard work getting this feature implemented @amari!