Open glaebhoerl opened 9 years ago
Its already kinda possible to create unmovable types by borrowing self, see this as an example: https://gist.github.com/Kimundi/d846f1da490ef4e9b0d9
Or here a bit more integrated solution: https://gist.github.com/Kimundi/81b9944a896052b5407f
I think what is desired here is not just the ability to mark types as not moveable, but to actually control the process of moving. I was thinking that this could be accomplished with @Kimundi's scheme if we require &'a mut self
for the move, since that would lock the data forever, but then you can't move if you have updated, since the update borrow also lasts forever, &c.
@Kimundi
Sure you can have non-movable types by not giving the user &mut
references, but then you lose the benefits of ownership. What would help is user types that behave like &mut T
- i.e. can be reborrowed, but don't allow moving the inner type.
I don't think a compiler-supported but user-defined Relocate
trait would help – non-trivial move constructors are terrible. A Sized?
-style explicit-waive Move
could work, however.
@arielb1 You could just have types which do non-trivial moves such as a Vec resize do a needs_move::<T>()
check, which the compiler could optimize out after monomorphization.
I imagine we would introduce a new rule: types are either Copy or Move, and there is a blanket impl of Move for all non-Copy types which is just memcpy.
@arielb1, @reem Relocate
wouldn't be compiler-supported (at least not in the sense of being implicitly invoked, which I'm assuming you meant), just a normal trait like Clone
. The critical part is having nonmoveable types, and the Relocate
trait is just something we could add once we do. (This is what I personally meant by what I wrote - of course other people may have different ideas.)
What I meant by Move
was implicit movability, a la Copy
. In other words Relocate
is to Clone
as Move
is to Copy
. But requiring explicit Move
bounds might be onerous.
@Kimundi That's interesting and clever... I wonder if it's adequate for intrusive structures like IntListNode
? But even if it currently is, I suspect that it won't be once we have new destructor semantics. I'm pretty sure doing this trick on types with destructors (as IntListNode
would be) is precisely what would be forbidden - things with lifetimes you store in it must have lifetime strictly greater than the thing itself. (But maybe you could still do it with unsafe code?)
@reem
Non-trivial move is a nightmare to work with, because it inserts user-defined operations to rather arbitrary places. A Move
bound that works like Sized
would work pretty well.
It strikes me that for an intrusive doubly-linked list type like IntListNode
above to work well with the type system (i.e. not lead to undefined behavior), &mut
borrows of it likely also have to be ruled out. The presentation above was simplified for exposition, but suppose we had value: Cell<int>
rather than value: int
, and allowed traversing the list and getting &Cell<int>
references to the contained values. But this is not enough for safety, because we cannot rule out the possibility that another node has been &mut
borrowed, and having &mut
and &
references to the same thing at the same time is undefined.
This is eerily similar to the restrictions on global static
s: no moving, no &mut
borrowing. It's probably not a coincidence: in both cases we lack the ability to statically keep track of potential references.
These restrictions also happen to correspond to the restrictions on things which are &
-borrowed... which again makes sense: we essentially have the presumption that they are always shared.
Some more thoughts:
First, the fact that we seem to want essentially the same restrictions as which an &
borrow has means that Kimundi's technique is a better fit than I thought at first. But is there any way to "lock" the variable from the get-go, or to work around the fact that it can be moved beforehand, without some kind of ugly runtime hacks?
This also reveals a deeper conundrum. Shared borrow restrictions prevent moving, even via relocate()
. But this is for a reason! When is it safe to relocate()
? If there can exist shared references to a given IntListNode
at any time without you knowing about it, by traversing from other nodes, then it is never (provably) safe.
But it seems like being able to explicitly move is a capability you would want sometimes?
I don't have a great deal of experience with intrusive data structures. Can someone who does perhaps shed some light on how this sort of thing is usually reasoned about in other languages?
What would help is user types that behave like &mut T - i.e. can be reborrowed, but don't allow moving the inner type.
Like this? https://gist.github.com/pythonesque/d002384c29f83f9efd09
Pretty sure NoMoveGuard is unmovable in safe code.
_Edit:_ https://gist.github.com/pythonesque/d002384c29f83f9efd09#file-intlist-rs proof of concept.
@pythonesque Just a note, but line 15 is undefined behavior due to undefined struct layout. You should construct it explicitly then transmute the lifetime only.
@reem: The NoMoveGuard is zero sized and markers have no alignment / size requirements. Is the behavior not defined in this context?
(Also, the lifetime doesn't actually change).
I'm almost 100% sure that struct layout is always undefined, i.e. even going from T -> NewType(T) through transmute is UB.
Okay, updated to remove UB: https://gist.github.com/pythonesque/d002384c29f83f9efd09.
_Edit_ I'm not quite sure what about this requires completely immovable types: https://gist.github.com/pythonesque/d002384c29f83f9efd09#file-intlist2-rs is basically identical and doesn't use any unsafe code. So maybe I'm approaching this incorrectly.
For what it's worth, I've done a lot of work with movable intrusive types. Generally in these cases, I use the intrusive structure to manage ownership of its container. That is, moving the intrusive structure moves the container at the same time. For example, if I use a linked-list to pass ownership of a structure from one thread to another, then I can use an intrusive link on the list to represent ownership of the containing structure. I'm not sure if I expressed that idea clearly, but it's a hope of mine that Rust will support code like that safely.
I went with the IntListNode
example here because I thought it was simpler, but it's possible that this wasn't actually such a hot idea, and that this example can actually be expressed with lifetimes as they exist, as @pythonesque shows.
The use case which actually made me start thinking about intrusive data structures and unmoveable types was "a weak reference for any type" (not just Rc
): you'd have a WeakEnabled<T>
, from which you can form WeakRef<T>
s, which has a method something like fn get(self: &WeakRef<T>) -> Option<&T>
. And it's implemented (potentially) by threading a doubly-linked list through the WeakEnabled
and all of its WeakRef
s; when a WeakRef
is moved or dropped (or cloned), the list is patched up around it in the obvious way; each WeakRef
additionally contains a nullable pointer back to the WeakEnabled
; and when the WeakEnabled
is dropped, its destructor traverses the list and sets all of the back-pointers in the WeakRef
s to null (after which get()
will return None
). Which sounds like a splendid plan right up until you realize that both the WeakEnabled
and the WeakRef
s can be moved at any time, invalidating the links. Perhaps this example can serve as better motivation than the previous one did?
@aidancully Unfortunately I don't understand what you're trying to say, even though I'd like to. What do you mean by "the intrusive structure" and "its container"? What does it mean to use a linked list to "pass ownership"?
@glaebhoerl I'm going to try to express the idea in C, because C has well-understood semantics in this domain... Consider the following C code:
typedef struct _Link Link;
struct _Link {
Link *Next;
};
typedef struct _List List;
struct _List {
Link *Head;
Link *Tail;
};
In traditional usage, the Link
type is completely useless. You might be able to use it to represent Church numerals, but no other applications immediately occur to me. On the other hand, if you consider Link
to be an intrusive structure inside of a container structure, like so:
typedef struct _Container Container;
struct _Container {
/* some fields belonging to the container */
int Field1;
int Field2;
/* allow this structure to be placed on a linked-list */
Link MLink;
};
Then it becomes possible to create linked-lists of Container
using the single implementation defined for Link
. (Going back to your questions, MLink
is the intrusive structure in this example, and Container
is its container.)
What I mean by "passing ownership" is, consider the case that one thread reads from List
while another thread writes to List
. If both threads have agreed that elements of List
are MLink
fields from Container
, then it's possible for the sending thread to give ownership of a Container
object to a receiving thread using code like the following:
/* on "sender" side: */
void send_to_thread2(List *ContainerList, Container *C) {
append_to_list(ContainerList, &C->Link);
}
/* on "receiver" side: */
Container *receive_from_thread1(List *ContainerList) {
Link *Head = pop_head_from_list(ContainerList);
if(Head == 0)
return 0;
return containerof(Container, MLink, Head);
}
In Rust terms, ownership of the Container
structure moved from thread 1 by appending the Link
structure (directly embedded within the Container
) onto a List
. On the other hand, ownership of the Container
is moved into thread 2 by popping this embedded Link
structure off of the List
. Ownership of a Container
is communicated between threads through passing ownership of Mlink
.
Obviously I've left out a lot of details for ensuring safety when using this sort of pattern, but I think this should express the idea more clearly than the single paragraph you referenced earlier. I also think this might be a more traditional use of intrusive structures than what you're describing. AFAICT, the Rust ownership system doesn't handle this sort of usage (representing ownership of a container via ownership of a contained field) at all, right now... I'm working on an RFC around the idea, would be happy to talk about it elsewhere. (To give a preview: build intrusive structure support off a "singleton type" construct, allowing a safe "containerof"-type operation. Still working on many details.)
Thanks
@glaebhoerl Would boxing the WeakRef
allow what you want?
@aidancully Possibly, but that's a workaround with a performance cost, and (AFAICT) will also stop working if/when we get something like DerefMove
(which, independently, I believe we do want to eventually gain).
(Still digesting your previous comment.)
I think the simplest meaningful example of a type where you really want something like non-moveability and &out
references might be a type of relatively-addressed (position-independent) references: neither Copy
nor even bitwise moves are valid, because the value necessarily has to depend on its position in memory, and should only be cloned using code which explicitly preserves the invariant. This example is also edifyingly free of any entanglement with the idea of references-into-self.
You'd want an API something like this:
pub struct Relative<'a, T> { ptrdiff: isize }
impl<'a, T> !Move for Relative<'a, T> { }
impl<'a, T> Relative<'a, T> {
pub fn make(from: &'a T, into: &out Relative<'a, T>) { ... }
pub fn deref(&'a self) -> &'a T { ... } // we can't actually impl Deref :\
}
impl<'a, T> Clone for Relative<'a, T> {
fn clone_into(&self, into: &out Self) { ... }
}
With the Clone
trait being modified to something like this:
pub trait Clone {
fn clone(&self) -> Self where Self: Move;
fn clone_from(&mut self, source: &Self) { ... }
fn clone_into(&self, into: &out Self) { ... }
}
(For RelativeMut
, you'd also want &move
and trait Relocate
.)
Is there any other way to support exposing a safe interface for a type like this?
I agree with @arielb1 that a "bound" analogous to ?Sized
(which I try to refer to as "generalization markers", though maybe I should shorten it to "generalizers") seems like it could be a good fit here.
E.g. something like a lang-item marker trait Immobile { }
and trait Mobile
that all types that don't implement Immobile
get by default. (Or we could allow the latter to be expressed via negative bounds -- that detail isn't important to me.)
But I have not yet attempted to encode the examples given the comment thread via such an approach; the idea hasn't gone beyond the level of "thought experiment" for me yet.
Related to this issue: rust-tenacious has been retired so there's now no available lint for flagging that you've misused an unmovable type.
@aidanhs It was pretty bad at its job though, there's no way to know what happens without looking at the MIR, preferably after MIR optimizations (although they shouldn't introduce more moves).
However, doing it on the MIR may be as simple as writing a MIR visitor that checks the type of every consumed lvalue and errors if it's unmovable.
Since rust-tenacious was discontinued, access to MIR has been streamlined and nowadays it should be possible to write such a lint.
We now have pinning. Does that mean we can close this issue?
It seems at least it should be updated to say what is still missing.
I think we're still missing &out
references for safely moving !Unpin
things around while respecting their invariants?
I don't think we need &out
for a safe interface to intrusive data structures.
I think what is confusing me is that the issue conflates two very different concerns: Intrusive data structures (possible with Pin
and without &out
), and unmoveable types (may need more than that, but we don't even have a compelling motivation justifying their complexity). The issue makes it sound like these two are the same problem, but they are not.
safely moving !Unpin things around while respecting their invariants?
There is no such thing. Their invariant explicitly says "though shall not move me around".
@RalfJung The reason it conflates them is that back in 2014 the issue was not well understood (at least by me).
There is no such thing. Their invariant explicitly says "though shall not move me around".
Possibly what @Ekleog meant to express is "transfer ownership without physically moving (and without incurring heap allocation)"?
(Side note &out
is unfortunately widely used to mean both one thing and its direct opposite, making it worse than useless for purposes of discussion (in particular I can only guess which of them was meant now) -- less ambiguous names might be &own
resp. &uninit
/&needsinit
.)
The reason it conflates them is that back in 2014 the issue was not well understood (at least by me).
Agreed. That's why I was suggesting in the light of what we know now, the issue title + description needs updating, or the issue should be closed (and potentially a new one opened for what is still deemed missing).
(For the record the reason I didn't express any opinion or preference on that is because I don't have one.)
safely moving !Unpin things around while respecting their invariants?
There is no such thing. Their invariant explicitly says "though shall not move me around".
Well, if the !Unpin thing does implement a ReallyMove trait that has a
function like (pseudo-syntax and just for the idea, I haven't thought
about the exact type esp. of out
):
fn really_move(self: Pin<&mut Self>, out: Pin<&needsinit Self>)
Then the thing could be safely moved.
I can imagine this be useful for eg. stack-linked-lists, where you would
want to move a node, but then you need to re-attach the now-dangling
links. The really_move
function would allow to do this
moving-and-reattaching behaviour.
Moving and reattaching might be better done via a method that unlinks the node and un-Pin
s it, and then another method that inserts it.
I'm not sure, though, how widely applicable that pattern would be among "pinned objects that want to move," or even if it's allowed by Pin
's guarantees (anything that sees the Pin
can assume it will be dropped before being freed/moved, but I suspect privacy is enough to make this sound).
Rust currently has very poor support for intrusive data structures: it is extremely difficult or impossible to define a safe interface for them. This is because all types in Rust are moveable, and moving invalidates any pointers into the given value. Rust's current assumption is that values only have pointers pointing out from them, and any pointers into them are known statically and have their validity enforced at compile time by borrow checker restrictions. But this fails to accomodate the case where the validity of pointers into values is maintained by runtime code, as is the case for intrusive data structures.
We would like to be able to have a type such as:
and allow client code to deal in instances of
IntListNode
directly and store them wherever, such as on the stack, with e.g. theDrop
impl written to patch upprev.next
andnext.prev
. The implementation would almost certainly requireunsafe
code, but we would like to at least expose a safe interface. This requires thatIntListNode
not be implicitly movable.We could also have a trait such as:
for explicit moves of normally nonmoveable types, using
&move
and&out
references to avoid the paradox of passing nonmoveable types by value. But we need nonmoveable types first. (relocate()
should be equivalent to aclone()
followed by adrop()
on the original, except likely more efficient, andclone()
itself may not be available.)An "interesting" question arising from nonmoveable types is how to acquaint them with the generics system. (Do we have a
Move
bound, a laCopy
? Do we require it to be explicitly specified? Or explicitly waived?)