
Tracking issue for Pin APIs (RFC 2349) #49150

Closed: withoutboats closed this issue 5 years ago

withoutboats commented 6 years ago

Tracking issue for rust-lang/rfcs#2349

Blocking stabilization:

Unresolved questions:

Edit: Summary comment: https://github.com/rust-lang/rust/issues/49150#issuecomment-409939585 (in the hidden-by-default part)

Nemo157 commented 6 years ago

I now notice that stack pinning is not part of the RFC. @withoutboats, are you planning on releasing a crate for this, or should I just copy the example code into my crate that needs it?

withoutboats commented 6 years ago

@Nemo157 You should copy it and report your experience!

The unresolved question about leaking Unpin data relates to this. That API is unsound if we say you cannot overwrite Unpin data in a Pin unless the destructor runs, as @cramertj requested. There are other, less ergonomic, stack pinning APIs that do work for this. It's unclear what the right choice is here: is the ergonomic stack pinning API more useful, or is the extra guarantee about leaking more useful?

One thing I'll note is that stack pinning was not sufficient for things like Future::poll inside the await! macro, because it didn't allow us to poll in a loop. I'd be interested to hear whether you run into those issues, and how you solve them if you do.

Nemo157 commented 6 years ago

My current usage is pretty trivial, a single-threaded executor running a single StableFuture without spawning support. Switching to an API like @cramertj suggests would work fine with this. I have wondered how to extend this to allow spawning multiple StableFutures, but at least with my current project that's not necessary.

valff commented 6 years ago

Just tried to experiment with the API. It looks like the following definition of Future (suggested by the RFC) is no longer object-safe?

trait Future {
    type Item;
    type Error;

    fn poll(self: Pin<Self>, cx: &mut task::Context) -> Poll<Self::Item, Self::Error>;
}
valff commented 6 years ago

Nevermind. Found a note about a plan to make arbitrary_self_types object-safe.

RalfJung commented 6 years ago

@withoutboats

One thing I'll note is that the stack pinning was not sufficient for things like Future::poll inside the await! macro, because it didn't allow us to poll in a loop.

Could you elaborate on that?

cramertj commented 6 years ago

@RalfJung You need Pin to support reborrows as Pin, which it currently does not.

RalfJung commented 6 years ago

@cramertj That sounds like a restriction of the Pin API, not of the stack pinning API?

cramertj commented 6 years ago

@RalfJung Yes, that's correct. However, PinBox can be reborrowed as Pin, while the stack-pinned type cannot (one borrow as Pin creates a borrow for the entire lifetime of the stack type).

RalfJung commented 6 years ago

Given a Pin, I can borrow it as &mut Pin and then use Pin::borrow -- that's a form of reborrowing. I take it that's not the kind of reborrowing you are talking about?

cramertj commented 6 years ago

@RalfJung No -- methods like Future::poll were planned to take self: Pin<Self>, rather than self: &mut Pin<Self> (which isn't a valid self type, since it isn't Deref<Target = Self> -- it's Deref<Target = Pin<Self>>).

withoutboats commented 6 years ago

It might be the case that we could get this to work with Pin::borrow actually. I'm not sure.

RalfJung commented 6 years ago

@cramertj I did not suggest to call poll on x: &mut Pin<Self>; I thought of x.borrow().poll().

cramertj commented 6 years ago

@RalfJung Oh, I see. Yes, using a method to manually reborrow could work.

cramertj commented 6 years ago

Here's an example of this working, though it's a bit awkward.
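For readers without the link: a minimal sketch of the manual-reborrow pattern under discussion, against the then-nightly API (Pin<'a, T> as a pinned mutable reference, futures-0.2-style Poll<T, E> = Result<Async<T>, E>); this is an illustration, not cramertj's linked example:

fn poll_to_completion<F>(mut fut: Pin<F>, cx: &mut task::Context) -> Result<F::Item, F::Error>
where
    F: Future, // the RFC's Future, with fn poll(self: Pin<Self>, ...)
{
    loop {
        // Reborrow for just this call; without the reborrow, calling
        // poll would consume the Pin and the loop could not poll again.
        match Pin::borrow(&mut fut).poll(cx)? {
            Async::Ready(v) => return Ok(v),
            Async::Pending => continue, // a real executor would block here
        }
    }
}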

Nemo157 commented 6 years ago

I'll try and remember to post an example of some of the stuff I'm doing with Pins next week; as far as I can tell, reborrowing works perfectly. I have a pinned version of the futures::io::AsyncRead trait, along with working adaptors like fn read_exact<'a, 'b, R: PinRead + 'a>(read: Pin<'a, R>, buf: &'b [u8]) -> impl StableFuture + 'a + 'b, and I'm able to work this into a relatively complex StableFuture that's just stack pinned at the top level.

Nemo157 commented 6 years ago

Here's the full example of what I'm using for reading:

pub trait Read {
    type Error;

    fn poll_read(
        self: Pin<Self>,
        cx: &mut task::Context,
        buf: &mut [u8],
    ) -> Poll<usize, Self::Error>;
}

pub fn read_exact<'a, 'b: 'a, R: Read + 'a>(
    mut this: Pin<'a, R>,
    buf: &'b mut [u8],
) -> impl StableFuture<Item = (), Error = Error<R::Error>>
         + Captures<'a>
         + Captures<'b> {
    async_block_pinned! {
        let mut position = 0;
        while position < buf.len() {
            let amount = await!(poll_fn(|cx| {
                Pin::borrow(&mut this).poll_read(cx, &mut buf[position..])
            }))?;
            position += amount;
            if amount == 0 {
                Err(Error::UnexpectedEof)?;
            }
        }
        Ok(())
    }
}

This is slightly annoying, as you have to pass instances around as Pin everywhere and use Pin::borrow whenever you call functions on them.

#[async]
fn foo<'a, R>(mut source: Pin<'a, R>) -> Result<!, Error> where R: Read + 'a {
    loop {
        let mut buffer = [0; 8];
        await!(read_exact(Pin::borrow(&mut source), &mut buffer[..]))?;
        // do something with buffer
    }
}
Nemo157 commented 6 years ago

I just had a thought: I could impl<'a, R> Read for Pin<'a, R> where R: Read + 'a to work around having to pass values as Pin<'a, R> everywhere; instead I could use fn foo<R>(source: R) where R: Read + Unpin. Unfortunately that fails because Pin<'a, R>: !Unpin. I think it's safe to add an unsafe impl<'a, T> Unpin for Pin<'a, T> {}, since the pin itself is just a reference and the data behind it is still pinned.
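Concretely, the proposed impl and the usage it would enable (a sketch against the then-nightly API; the Unpin impl shown is the proposal, not something that existed):

// Proposed: the Pin reference itself may always be moved freely;
// only the data behind it stays pinned.
unsafe impl<'a, T: ?Sized> Unpin for Pin<'a, T> {}

// With that, generic code could take values by Read + Unpin, and callers
// would pass a Pin<'a, R> instead of threading pins through manually:
fn foo<R: Read + Unpin>(source: R) { /* ... */ }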

RalfJung commented 6 years ago

Concern: It seems likely that we want most types in libstd to implement Unpin unconditionally, even if their type parameters are not Unpin. Examples are Vec, VecDeque, Box, Cell, RefCell, Mutex, RwLock, Rc, Arc. I expect most crates will not think about pinning at all, and hence their types will only be Unpin if all their fields are Unpin. That's a sound choice, but it leads to unnecessarily weak interfaces.

Will this solve itself if we make sure to implement Unpin for all libstd pointer types (maybe even including raw pointers) and UnsafeCell? Is that something we will want to do?
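Concretely, the kind of unconditional impls being discussed would look something like this (a sketch, not what libstd contained at the time; written with unsafe impls, following the thread's assumption that Unpin is unsafe to implement):

// Pointer-like types never store their pointee inline, so moving the
// pointer never moves the (possibly pinned) pointee, and the impls can
// be unconditional in T.
unsafe impl<T: ?Sized> Unpin for Box<T> {}
unsafe impl<T> Unpin for Vec<T> {}
unsafe impl<T: ?Sized> Unpin for Rc<T> {}
unsafe impl<T: ?Sized> Unpin for Arc<T> {}
unsafe impl<T: ?Sized> Unpin for *const T {}
unsafe impl<T: ?Sized> Unpin for *mut T {}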

withoutboats commented 6 years ago

Will this solve itself if we make sure to implement Unpin for all libstd pointer types (maybe even including raw pointers) and UnsafeCell? Is that something we will want to do?

Yes, it seems like the same situation as Send to me.

RalfJung commented 6 years ago

A new question just occurred to me: When are Pin and PinBox Send? Right now, the auto trait mechanism makes them Send whenever T is Send. There is no a priori reason to do that; just like types in the shared typestate have their own marker trait for sendability (called Sync), we could make a marker trait saying when Pin<T> is Send, e.g. PinSend. In principle, it is possible to write types that are Send but not PinSend and vice versa.

withoutboats commented 6 years ago

@RalfJung Pin is Send when &mut T is Send. PinBox is Send when Box<T> is Send. I see no reason for them to be different.

RalfJung commented 6 years ago

Well, just like some types are Send but not Sync, you could have a type relying on "Once this method is called with Pin<Self>, I can rely on never being moved to another thread". For example, this could give rise to futures that can be sent around before being started for the first time, but then have to remain in one thread (much like they can be moved around before being started, but then have to remain pinned). I'm not sure if I can come up with a convincing example, maybe something about a future that uses thread-local storage?
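A sketch of the distinction, with PinSend as a hypothetical marker trait (no such trait exists):

// Hypothetical: "a pinned T may be sent to another thread". This plays
// the same role for the pinned typestate that Sync plays for the shared
// one: sendability of each typestate is its own property.
unsafe trait PinSend {}

// A future that stashes pointers into thread-local storage once polled
// could be Send (movable between threads before it starts) but not
// PinSend (after being pinned and polled, it must stay on its thread).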

Nemo157 commented 6 years ago

I just hit the lifetime issue mentioned by @Diggsey. I believe Pin<Option<T>> -> Option<Pin<T>> should be a safe operation, but it doesn't appear possible to implement even using the current unsafe APIs, let alone what sort of API would be needed to make this safe code:

trait OptionAsPin<T> {
    fn as_pin<'a>(self: Pin<'a, Self>) -> Option<Pin<'a, T>>;
}

impl<T> OptionAsPin<T> for Option<T> {
    fn as_pin<'a>(mut self: Pin<'a, Self>) -> Option<Pin<'a, T>> {
        // Fails to compile: `item` borrows from the short reborrow of
        // `self`, so it cannot be returned with the full lifetime 'a.
        match *unsafe { Pin::get_mut(&mut self) } {
            Some(ref mut item) => Some(unsafe { Pin::new_unchecked(item) }),
            None => None,
        }
    }
}

(It's possible to work around this by using transmute to force the lifetimes, but that makes me feel far too icky.)

RalfJung commented 6 years ago

I'd like to add an unresolved question: Should we add back a shared pinned reference type? I think the answer is yes. See this post with discussion for further details.

RalfJung commented 6 years ago

I just read that futures 0.2 is not as final as I thought it was, so maybe it is still possible after all to rename Pin back to PinMut and add a shared version.

withoutboats commented 6 years ago

@RalfJung I read your blog post again more thoroughly to understand the changes you propose.

I think you've found a potentially compelling use case for having an immutable Pin variant, but I don't understand your comments about Deref and &Pin<T> <=> &&T. Even if Pin<T> can be cast down to &T safely, that doesn't make them equivalent, because &T cannot be cast to Pin<T>. I don't see a reason to make the safe conversion unsafe (by eliminating the safe Deref impl).

dylanede commented 6 years ago

Currently the map method on Pin has the signature

pub unsafe fn map<U, F>(this: &'b mut Pin<'a, T>, f: F) -> Pin<'b, U>

What is the reason that it isn't the following?

pub unsafe fn map<U, F>(this: Pin<'a, T>, f: F) -> Pin<'a, U>

As it stands, I can't transform a Pin of one type to a Pin of one of its fields without shortening the lifetime unnecessarily.

dylanede commented 6 years ago

Another problem with the map method is that it seems to be impossible to turn a Pin of a struct into two Pins, each of a different field of the struct. What is the correct way to achieve that?

Nemo157 commented 6 years ago

I've been using this macro for that:

macro_rules! pin_fields {
    ($pin:expr, ($($field:ident),+ $(,)?)) => {
        unsafe {
            // Unsafety contract: the caller must never move out of the
            // projected fields, and the struct's Drop must not either.
            let s = Pin::get_mut(&mut $pin);
            ($(Pin::new_unchecked(&mut s.$field),)+)
        }
    };
}
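Usage looks something like this (illustrative types, assuming the then-nightly Pin<'a, T>):

struct Parts {
    a: u32,
    b: u32,
}

fn split<'a>(mut parts: Pin<'a, Parts>) {
    // Expands to one unsafe block yielding (Pin<u32>, Pin<u32>):
    // pins of two disjoint fields of the same struct.
    let (a, b) = pin_fields!(parts, (a, b));
    let _ = (a, b);
}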
MicahChalmer commented 6 years ago

In a lot of the discussion around Pin there seems to be an assumption that "projecting" a pin onto a private field should be thought of as safe. I don't think that's quite true. For instance, the doc comment on map currently reads:

You must guarantee that the data you return will not move so long as the argument value does not move (for example, because it is one of the fields of that value), and also that you do not move out of the argument you receive to the interior function.

The guarantee that "the data you return will not move so long as the argument value does not move" is the correct description of the contract that a caller of map has to uphold. The parenthetical "(for example, because it is one of the fields of that value)" seems to imply that as long as you're just returning a reference to your own private field you're guaranteed to be safe. But that's not true if you implement Drop. A Drop impl is going to see a &mut self, even when other code has already seen a Pin<Self>. It can proceed to use mem::replace or mem::swap to move out of its fields, violating the promise made by an earlier "correct" use of map.

In other words: using a "correct pin projection" call to Pin::map (the call looks like unsafe { Pin::map(&mut self, |x| &mut x.p) }), with no other invocations of unsafe, one can produce unsound/undefined behavior. Here's a playground link demonstrating this.
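The linked playground is not reproduced here, but the counterexample has roughly the following shape (a reconstruction, not the original code; Inner is a stand-in for some !Unpin type that relies on never moving once pinned):

struct Inner; // imagine Inner: !Unpin, e.g. caching its own address once pinned

struct Shenanigans {
    p: Inner,
}

impl Shenanigans {
    fn do_something_that_needs_pinning(mut self: Pin<Self>) {
        // A "correct" projection per the doc comment: p is our own field.
        let _p = unsafe { Pin::map(&mut self, |x| &mut x.p) };
        // ... Inner may now rely on never moving again ...
    }
}

impl Drop for Shenanigans {
    fn drop(&mut self) {
        // Drop receives &mut self even though self was pinned earlier,
        // so fully safe code can move the previously pinned field out:
        let _moved = std::mem::replace(&mut self.p, Inner);
    }
}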

This doesn't imply that anything is wrong with the current API. It demonstrates that Pin::map should be marked as unsafe, which it already is. Nor do I think there's much of a danger of people accidentally tripping over this when implementing futures or the like--you really have to go out of your way to move out of your own fields in a Drop impl, and I doubt there are many practical reasons to want to do it.

But I do think the doc comment for map might want to mention this, and I also think it invalidates some ideas for extensions that I see in the RFC and surrounding discussions:

RalfJung commented 6 years ago

@withoutboats

Even if Pin can be cast down to &T safely, that doesn't make them equivalent, because &T cannot be cast to Pin.

Indeed, that is not a sufficient condition. They are equal in my model because of the following definition we made in my earlier post:

Definition 5: PinBox<T>.shr. A pointer ptr and lifetime 'a satisfy the shared typestate of PinBox<T> if ptr is a read-only pointer to another pointer inner_ptr such that T.shr('a, inner_ptr)

(And, kind-of implicitly, the corresponding definition for Pin<'a, T>.shr.) Notice how PinBox<T>.shr depends on T.shr and nothing else. This makes PinBox<T>.shr exactly the same invariant as Box<T>.shr, which implies that &Box<T> = &PinBox<T>. Similar reasoning shows that &&T = &Pin<T>.

So, this is not a consequence of the API or contract written down in the RFC. It is a consequence of the model that just has three typestates: Owned, shared, pinned. If you want to argue against &&T = &Pin<T>, you have to argue for introducing a fourth typestate, "shared pinned".

RalfJung commented 6 years ago

@MicahChalmer

In a lot of the discussion around Pin there seems to be an assumption that "projecting" a pin onto a private field should be thought of as safe. I don't think that's quite true.

That's a very good point! Just to be clear, there is no issue with making the p field of Shenanigans public, is there? At that point, any client could write do_something_that_needs_pinning, and the intention of the RFC is to make that safe. (I don't know why the RFC specifically mentions private fields, my interpretation was always that it was supposed to work with all fields.)

It is interesting to see how this interacts with the interpretation of drop in my model. There I wrote that drop(ptr) has a precondition of T.pin(ptr). With this interpretation, the actual code at fault in your example would be the drop implementation! (Now I wonder why I did not notice this already when writing the post...) I think we do want to allow safe projections to fields eventually, and we really shouldn't provide drop with an &mut if the type does pinning. That's clearly bogus.

Is there any way we could either (a) make impl Drop unsafe if the type is !Unpin, or (b) change its signature to drop(self: Pin<Self>) if T: !Unpin? Both sound extremely far-fetched, but on the other hand both would solve @MicahChalmer's point while preserving the safety of field projections. (If only this were still pre-1.0 and we could change Drop::drop to just always take Pin<Self> ;) But of course, at this point this is not a library-only solution any more. The sad part is that if we stabilize as-is, we can never have safe field projections.

withoutboats commented 6 years ago

@RalfJung So I am more interested in the practical questions (as you'd probably expect :wink:). I think there are two:

I don't see a reason to answer the first question negatively. I think you have re-opened the second question; I'm inclined to go back to two different kinds of pins (but not completely convinced). If we ever upgrade this to a language-level reference type, we'd have &pin T and &pin mut T.

It seems like no matter what we do your model probably needs a fourth type-state to accurately reflect the API invariants. I do not think converting an &&T to an &Pin<T> should be safe.

RalfJung commented 6 years ago

So I am more interested in the practical questions (as you'd probably expect :wink:).

Fair enough. ;) However, I think it is important that we have at least a basic understanding of what is and is not guaranteed around Pin once unsafe code enters the picture. Basic Rust had several years of stabilization to figure this out, now we are repeating this for Pin in a few months. While Pin is a library-only addition as far as syntax is concerned, it is a significant addition as far as the model is concerned. I think it is prudent that we document as exhaustively as possible what unsafe code is and is not allowed to assume or do around Pin.

While these concerns are theoretical right now, they will suddenly be very practical once we have the first unsoundness due to unsafe code making incompatible assumptions.


Concerning your second question:

It seems like no matter what we do your model probably needs a fourth type-state to accurately reflect the API invariants. I do not think converting an &&T to an &Pin should be safe.

That is what I expected.

If we want to be conservative, we could rename Pin to PinMut but not add PinShr (likely to be called Pin but I am trying to disambiguate here) for now, and declare that unsafe code can assume that a &PinMut<T> actually points to something pinned. Then we would have the option of adding Pin as a shared pinned reference later.

A practical reason for having PinShr has been brought up by @comex: it should be possible to provide a getter that goes from PinShr<'a, Struct> to PinShr<'a, Field>; if we use &PinMut for shared pins that doesn't work. I don't know how much such getters will be needed.


For your first question, there seem to be some strong arguments in favor: (a) the current plans for object-safe arbitrary self types, and (b) being able to safely use the vast amount of existing APIs on shared references when holding a PinShr (or PinMut).

It is unfortunate that we don't seem to have an easy way to similarly provide operations that can work both on &mut and PinMut; after all plenty of code working on &mut has no intention of using mem::swap. (This, as I see it, would be the main advantage of a !DynSized-based solution or something comparable: &mut would turn into a type for references that may or may not be pinned. We could have this as yet another library type in the Pin API, but that's rather pointless given that we already have all these &mut methods out there.)

There is one light argument against, which is types that want to do "interesting" things both for &T and PinMut<T>. That will be hard to do, not everything is possible and one has to be very careful. But I don't think that trumps the good arguments in favor.

In this case (i.e., with this impl Deref), PinShr<'a, T> should come with a safe method that turns it into a &'a T (preserving the lifetime).
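In code, such a lifetime-preserving conversion would look like this (PinShr is hypothetical; it is shown as a wrapper around &'a T purely for illustration):

struct PinShr<'a, T: ?Sized> {
    inner: &'a T,
}

impl<'a, T: ?Sized> PinShr<'a, T> {
    // Consumes the PinShr by value, so the result keeps the full
    // lifetime 'a -- unlike Deref, which only borrows from a &PinShr.
    fn as_ref(this: PinShr<'a, T>) -> &'a T {
        this.inner
    }
}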


And there is another concern I think we have to resolve: The rules for Pin::map and/or drop, in the light of what @MicahChalmer noticed above. We have two choices:

The only possible argument for the second option that I can see is that it may be the only one we can actually implement. ;) I think it is inferior in every possible way -- it makes &pin rather strange and unergonomic even if it becomes built-in one day, it is a footgun, it hinders compositionality.

There may be a way to achieve the first option, but it's not trivial and I have no idea how to make it backwards-compatibly: We could add an Unpin bound to Drop, and add a DropPinned that does not have the bound and where drop takes Pin<Self>. Maybe the Unpin bound on Drop could be enforced in a strange way where you can write the impl Drop for S, but this adds an implicit bound to S saying that it has to be Unpin. Probably not realistic. :/ (I guess this is also a point where the !DynSized-based approach works better -- it turns &mut T into "may or may not be pinned", keeping drop sound.)
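For concreteness, a sketch of the shape the first option could take (everything here is hypothetical: neither the Unpin bound on Drop nor DropPinned exists, and the backwards-compatibility problem just described is unsolved):

// Hypothetical: Drop would apply only to movable types ...
// trait Drop where Self: Unpin { fn drop(&mut self); }

// ... and pinnable types would implement this instead, with the
// compiler invoking it on a pinned self:
trait DropPinned {
    fn drop(self: Pin<Self>);
}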

withoutboats commented 6 years ago

@RalfJung @MicahChalmer I think it's better to just document that if you move out of the field in the Drop impl, projecting to a Pin of that field elsewhere is unsound.

Indeed, it's already the case today that (using unsafe code) you could move out of a field of a !Unpin type, and this is safe and well-defined as long as you never project to a pin of that field elsewhere. The only difference with Drop is that the moving-out part only contains safe code. It seems to me that the notes on Pin::map need to change to note that it is not safe if you ever move out of that field, regardless of the Drop impl.

It must be safe to move out of the fields of a !Unpin type in some cases, because the generators will very likely move out of one of their fields when they return.

RalfJung commented 6 years ago

I think its better to just document that if you move out of the field in the Drop impl, projecting to a Pin of that field elsewhere is unsound.

This is the second option then, the one that makes &pin to a field permanently an unsafe operation. I think this is not a small change. This fundamentally changes what pub on a field means. When using a library type, I cannot know what it does in its drop impl, so I fundamentally have no way to obtain a pinned reference to that field.

For example, I wouldn't even be allowed to go from Pin<Option<T>> to Option<Pin<T>> unless Option explicitly states that it will never have a Drop doing anything "funny". The compiler cannot understand that statement, so while Option could provide an appropriate method to do this, doing the same with a match has to remain an unsafe operation.

The only difference with Drop is that the moving-out part only contains safe code.

But that's a huge difference, isn't it? We can put arbitrary rules on what unsafe code may or may not do, but not so for safe code.

It must be safe to move out of the fields of a !Unpin type in some cases, because the generators will very likely move out of one of their fields when they return.

I suppose in this case the field will be Unpin? So we could likely have a rule saying that Pin::map for a public field of a foreign struct is fine if that field has type Unpin. Not sure how useful this is, but it's probably better than nothing.

cramertj commented 6 years ago

I want to quickly restate my confusion about &Pin<T> not providing more guarantees than &&T. &, &mut, and &pin respectively provide "shared access", "unique access", and "unique access to a value that won't be moved". Understanding &&pin as "shared access to a unique access to a type that won't be moved" tells you that the memory is shared (the uniqueness guarantee of &pin is cancelled out by the sharing of &), but you still retain the property that the type won't be moved, no?

RalfJung commented 6 years ago

I am not sure what you are asking or saying. Are you confused why I think "shared pinned" is a fundamental mode/typestate on its own?

The point is, "shared access" is not a thing I know how to define on its own. There are many different ways to share and coordinate the sharing, as is witnessed by the very different ways in which, e.g., Cell, RefCell, and Mutex share.

You can't just say you are sharing something ("cancelling the uniqueness guarantee" of something) you own and expect that statement to make sense. You have to say how you are sharing, and how you ensure that this cannot cause havoc. You can "share by making it read-only", or "share by only giving atomic access through synchronized loads/stores", or "share [within just one thread] by having this borrow flag coordinate which kind of access to hand out". One of the key points in RustBelt was realizing the importance of letting every type define for itself what happens when it gets shared.

I cannot think of a way to make "shared pinning" arise as the orthogonal composition of sharing and pinning. Maybe there is a way to define the notion of a "sharing mechanism" that could then be applied to either the owned or the pinned invariant to give rise to "(normal) shared" and "pinned shared", but I seriously doubt it. Also, as we have seen that falls flat for RefCell -- if RefCell does for the shared pinned invariant something similar to what it does for the just shared invariant, we certainly cannot justify that from & &pin RefCell<T> via &RefCell<T> (using Deref) via borrow_mut we can obtain a &mut reference that says that no pinning happened.

MicahChalmer commented 6 years ago

@RalfJung

We could add an Unpin bound to Drop, and add a DropPinned that does not have the bound and where drop takes Pin<Self>.

Is the definition of Drop really the problem here? Another way to think about it is to place the blame on mem::swap and mem::replace. These are the operations that let you move something you don't own. Suppose a T: Unpin bound were added to them?

For starters, that would fix the drop hole that I pointed out - my Shenanigans would fail to compile, and I don't think I could make it violate the pin promises without another unsafe. But it would enable more than that! If it has become safe to get a &mut reference to a previously pinned value in drop, why limit it to just drop?

Unless I'm missing something, I think this would make it safe to borrow a &mut reference from a PinMut<T> any time you want. (I'm using PinMut to refer to what is called Pin in current nightly, to avoid confusion with the discussion about shared pins.) PinMut<T> could implement DerefMut unconditionally, rather than just for T: Unpin.

It is unfortunate that we don't seem to have an easy way to similarly provide operations that can work both on &mut and PinMut; after all plenty of code working on &mut has no intention of using mem::swap.

A DerefMut impl on PinMut would fix that, right? Code that cares about pinning, and requires PinMut, could call code that works on &mut safely and easily. The burden would be placed instead on generic functions that do want to use mem::swap--those would have to have an Unpin bound added to them, or use unsafe and take care not to violate the pin conditions.

Adding such bounds to swap and replace now would break backwards compatibility going back to the very first stable release. I don't see a realistic way to get there from here. But am I missing some other hole in thinking that it would have been the thing to do, if only this were known in the pre-1.0 days?
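For concreteness, the signatures in question would read roughly as follows (hypothetical; the real std functions carry no such bound):

// NOT the real std API -- the hypothetically bounded versions. With
// these, safe code could never move a !Unpin value out from behind &mut.
pub fn swap<T: Unpin>(x: &mut T, y: &mut T) {
    unsafe { std::ptr::swap(x, y) } // same implementation, new bound
}

pub fn replace<T: Unpin>(dest: &mut T, src: T) -> T {
    let old = unsafe { std::ptr::read(dest) };
    unsafe { std::ptr::write(dest, src) };
    old
}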

As it is, I don't see any better solution than what @withoutboats said - keep map unsafe, and put a message in its docs warning people not to move out of any field that was previously pinned in a drop impl.

We can put arbitrary rules on what unsafe code may or may not do, but not so for safe code.

Using unsafe always imposes rules on the surrounding safe code. The good news here is that as far as we know, if a pinnable struct has a method to project a pin to a private field, only its own drop impl can use that to violate the contract in safe code. So it's still possible to add such a projection and present users of that struct with a fully safe API.

RalfJung commented 6 years ago

Is the definition of Drop really the problem here?

Assigning blame is somewhat arbitrary, there are different things one could change to plug the soundness hole. But do we have agreement that changing Drop as I suggested would fix the issue?

Another way to think about it is to place the blame on mem::swap and mem::replace. These are the operations that let you move something you don't own. Suppose a T: Unpin bound were added to them?

Well, that would make &mut T generally safe to use for !Unpin types. As you observed, we wouldn't even need PinMut any more. PinMut<'a, T> in your proposal could be defined as &'a mut T, right? This is essentially the ?Move proposal that has been discarded previously due to backwards compatibility and language complexity issues.

Using unsafe always imposes rules on the surrounding safe code.

I am not sure what you mean. Beyond the privacy boundary, this must not be the case; unsafe code cannot impose anything on its clients.

The good news here is that as far as we know, if a pinnable struct has a method to project a pin to a private field, only its own drop impl can use that to violate the contract in safe code. So it's still possible to add such a projection and present users of that struct with a fully safe API.

Yes, types can opt-in to declare the projection safe. But e.g. the borrow checker will not understand that this is a field access, so given a PinMut<Struct> you cannot use such methods to obtain PinMut to two different fields at the same time.

MicahChalmer commented 6 years ago

But do we have agreement that changing Drop as I suggested would fix the issue?

I agree, that would fix it.

We wouldn't even need PinMut any more. PinMut<'a, T> in your proposal could be defined as &'a mut T, right?

No, PinMut<'a, T> is still required to promise that the referent will never move again. With &'a mut T you could only trust it not to move for lifetime 'a. This would still be allowed, as it is today:

struct X;
impl !Unpin for X {} // negative impls require a nightly feature

fn takes_a_mut_ref(_: &mut X) {}

fn borrow_and_move_and_borrow_again() {
    let mut x = X;
    takes_a_mut_ref(&mut x);  // borrow of x ends here...
    let mut b = Box::new(x);  // ...so the value may be moved to the heap,
    takes_a_mut_ref(&mut *b); // then borrowed again at its new address
}

It would be safe to go from PinMut<'a, T> to &'a mut T but not vice versa - PinMut::new_unchecked would still exist, and still be unsafe.

This is essentially the ?Move proposal that has been discarded previously due to backwards compatibility and language complexity issues.

As I understand it, the ?Move proposals were trying to get all the way to not needing PinMut, by changing the fundamental language rules to prohibit the code snippet above (by making Unpin be a Move trait.) I'm not proposing anything like that - my proposal is to start at exactly the way it is in nightly now, plus:

That's all - no fundamental language changes to how moves work, or anything like that. My claim is: yes, this would be a breaking change, but it would have less breaking impact than changing Drop (which in turn is still less drastic than the ?Move proposals), while retaining many of the benefits. In particular, it would allow safe pin projection (at least for private fields, and I think even for public? not quite sure) and prevent situations where parallel code has to be written to work with PinMut and &mut.

RalfJung commented 6 years ago

No, PinMut<'a, T> is still required to promise that the referent will never move again. With &'a mut T you could only trust it not to move for lifetime 'a.

I see. Makes sense.

As I understand it, the ?Move proposals were trying to get all the way to not needing PinMut, by changing the fundamental language rules to prohibit the code snippet above (by making Unpin be a Move trait.)

Understood.

My claim is: yes, this would be a breaking change, but it would have less breaking impact than changing Drop

Which change to drop do you mean? Adding an Unpin bound? You may be right, but I have no idea how widely used mem::swap and mem::replace are.

In particular, it would allow safe pin projection (at least for private fields, and I think even for public? not quite sure)

I don't see how private vs public can even make a difference here. How would a public field allow less than a private field?

But yeah, this seems to hold together overall. Future would still take a PinMut because it has to rely on things never moving, but it'd have a wider range of methods available for use.

However, the compatibility aspect is a big one. I don't think this is realistic, it'd break all generic code that calls mem::swap/mem::replace. Plus, right now unsafe code is free to implement these methods itself using ptr::read/ptr::write; this could lead to silent breakage of existing unsafe code. That's a no-go, I think.

kjetilkjeka commented 6 years ago

While we're on the topic of introducing an Unpin bound on mem::swap and mem::replace (and not being concerned about breakage): if we assume the "compiler built-in" route is taken, would it be possible to also introduce the same bound on mem::forget, to guarantee that destructors run for stack-pinned variables? That would make thread::scoped sound and avoid "pre-pooping your pants" in certain cases.

RalfJung commented 6 years ago

Notice that mem::forget on a PinBox is still allowed. The new proposed guarantee related to drop is NOT saying "things don't leak". It says "things don't get deallocated without drop being called first". That's a very different statement. This guarantee does not help thread::scoped.
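Spelled out as a sketch (PinBox per the then-nightly API):

fn leak_is_fine<T>(value: T) {
    let pinned = PinBox::new(value);
    // The heap allocation is never freed, so "no deallocation without
    // drop" holds vacuously. The destructor never runs, which is exactly
    // why this guarantee cannot make thread::scoped sound.
    std::mem::forget(pinned);
}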

carllerche commented 6 years ago

To add context: moving data out of a struct that implements Future is something that I do commonly. It comes up quite often that I need to perform cleanup work if a future was never polled to completion (i.e., it was dropped before polling completed).

So, I most certainly would have hit this land mine when porting existing code to futures 0.3 even with documentation added to map.

RalfJung commented 6 years ago

@carllerche the map function says quite clearly that you must not use it to move anything. We cannot and do not want to protect against people deliberately moving out of a Pin<T>, but you have to go out of your way (using unsafe code) to do this. I would not call that a landmine.

So, which landmine are you referring to?

pythonesque commented 6 years ago

@RalfJung I have been trying to figure out the bounds on mapping pinned references myself, and I do think this is going to be a huge footgun if it is not resolved soon. I think the first solution is preferable, despite its complexity; not being able to safely project to pinned fields makes it virtually impossible for consumers to actually use APIs that rely on pinning without writing unsafe code.

If this can't be done, I think in practice most usable APIs that use pinning will have to use PinShare. This might not be such a big handicap, I guess, but I'm still not clear on the relationship with Unpin in that case. Specifically: let's say I grab a pinshared reference and get a reference to a field on the type (for a certain lifetime). Can I really rely on it not moving once that lifetime is over? I probably can if the field is !Unpin so maybe it's fine, as long as Pin can't project out--I'm mostly worried about enums. Unless you are saying that even shared pinning cannot be safe without fixing drop--in that case, I think fixing drop to work with pinning essentially has to happen; otherwise it becomes a niche library feature that can't really be used safely, and doesn't (IMO) deserve any pride of place in the core language, even if it does happen to be very useful for Futures.

I should also mention that the only practical API I have for intrusive collections so far (I still need to work out the kinks) needs even stronger guarantees than that; it needs to be able to guarantee that drop isn't called as long as there is any borrow into the collection. I can do this using a GhostCell-style technique, but it feels very awkward and requires the user to do manual memory management (since we have to leak if the backing memory for something in the collection is dropped without being provided the token). So I'm a bit worried that automatic drop itself seems hard to make work with types that use pinning in interesting ways.

Out of curiosity: what are the arguments against adding an Unpin bound to Drop? You cited backwards compatibility, with the alternative being that you'd need to somehow automatically bound the thing that was being dropped, but Drop already has weird type-system-level restrictions that don't exist for other traits--why is this one so different? It's certainly not as elegant as just making drop take a Pin<T>, but we can't actually make that change at this point. Is the issue that we don't actually know what to do if you call drop on a type whose Drop implementation assumes Unpin, when the type itself is !Unpin? I'd guess panicking in the drop implementation in that case might be the correct approach, since anyone who relies on drop running for generic types already needs to handle the panic case. That would mean it would be very hard to use !Unpin types in practice without a bunch of people updating their code to use the new Drop trait (so the ecosystem would be forced to move everything to the new version), but I think I would be okay with that, since it would still preserve soundness and not break code that didn't use !Unpin at all. Plus, the "your code panics if the library doesn't upgrade" thing would really incentivize people to move on!

pythonesque commented 6 years ago

In fact, here's my proposed design:

Extend the Drop trait with a second method that takes Pin, as you proposed. Make the default implementation panic. Make a specialized default implementation where T: Unpin call drop (I assume this would pass the current specialization rules? But even if it wouldn't, Drop can always be special-cased; plus, Drop functions are usually autogenerated). Then, you could magically switch the drop function to call this new drop_pinned (so the drop function as currently written can't be called directly). Now you have exactly the same behavior I proposed above, with no backwards compatibility issues and no attempt to awkwardly auto-derive a bound. It does make the Drop trait excessively strange (since you have a function named drop that you can't call directly and a function not named drop that you call whenever you write drop), but Drop is already so hardcoded and strange that I don't see this as a very big deal compared to the alternative (!Unpin types being essentially impossible to use [not just implement] without writing unsafe code).
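A sketch of that proposal's shape (entirely hypothetical: drop_pinned does not exist, and whether the specialized default passes the specialization rules is, as the comment itself notes, an open question):

// Hypothetical redefinition of Drop, per the proposal above.
trait Drop {
    // Kept so existing impls still compile, but under the proposal it
    // could no longer be invoked directly.
    fn drop(&mut self);

    // The entry point the compiler would actually invoke.
    fn drop_pinned(self: Pin<Self>) {
        // Default for !Unpin types whose Drop impl was never upgraded.
        panic!("!Unpin type dropped via a Drop impl written for &mut self");
    }
    // Plus a specialized default where Self: Unpin that forwards to
    // drop(&mut self) instead of panicking.
}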

It does have the problem that third party libraries will have to upgrade for them to be practically useful with !Unpin types, but like I said this is arguably a good thing. And for probably close to 100% of existing Drop implementations the change would just be a change to the function signature--well, I guess maybe not at the moment because you can't safely project a pinned field, but maybe a macro could be provided for that purpose? In any case, I have no idea how many drop implementations actually mutate their fields in ways that require &mut--most of the ones I can think of that do anything interesting either use unsafe or use interior mutability, but I'm probably not thinking of some common usecases.

The main thing about this approach is that were it to be taken, it'd have to be taken before Pin was stabilized. If you waited until after it was stable, it wouldn't be backwards compatible anymore to make drop implementations start throwing exceptions for generators. This is one more reason I'm really hoping Pin is not rushed in. I don't think we've explored the design consequences sufficiently.

(I do see one more potential issue: ManuallyDrop and unions in general mean that someone could probably have written a destructor over a generic type that did assume that the drop implementation within it couldn't panic, simply because it was never able to run (it also wouldn't be allowed to call any other &mut functions on the generic T, except for ones implemented on unsafe traits that promised not to panic). Since pinning a type already must guarantee that its drop implementation runs before its memory is destroyed [according to the new semantics], I believe the !Unpin type within a ManuallyDrop used for that purpose in existing code could not be considered pinned in the first place, so the existing code should still be correct if it's making that assumption; certainly, you shouldn't be able to safely project a pin into a ManuallyDrop, since it can only be safe if you guarantee its destructor is called prior to the drop. I just don't know how one would communicate this case to the compiler [since whether the !Unpin panic can be propagated without violating semantic expectations depends on whether you decided to call &mut functions other than drop or those on the aforementioned unsafe traits on the interior T, and how is the compiler supposed to know that? Can this take advantage of the "eyepatch" stuff at all, as it seems like it's intended for a similar purpose?]. Maybe you can just say all unions are Unpin by default since it's not generally safe to project a pin into their contents, so the Pin state can't enforce different guarantees from the &mut one; I'm not really sure that flies semantically, but it probably works for the purposes of the Drop implementation for existing code. For new code, hopefully a ManuallyDrop that you could project a Pin into would need to use unsafe code and [semantically] thus need to be marked !Unpin, so you'd still be okay there.

On the subject of eyepatch, though... I'm still not certain of the exact formal definition it has, but if the general idea is that no interesting generic functions are called on T, maybe one could exploit that further? Eyepatch-respecting generics could be known [through the aforementioned requirement that actually pinning a type means drop has to be called before the memory is deallocated] to always call the drop implementation of any !Unpin type that they own, and nothing else. That means that if the drop_pinned implementation for the !Unpin type was implemented, the eyepatch-respecting container would behave correctly. So this would obviate the need for !Unpin to be propagated on generic bounds in these cases. It seems possible to me that one could then produce a compile-time failure for !Unpin types that implement Drop, don't implement drop_pinned, and don't eyepatch their generic parameters, in the same way we do when you use non-eyepatch containers with self-referential lifetimes.

Existentials pose a serious backwards-compatibility risk with any compile time strategy though; with self-referential lifetimes, I believe we have a "strictly outlives" requirement that [combined with the fact that existentials carry lifetimes] can enforce drop acyclicity even if you don't know anything about the underlying type, but that's definitely not the case for Unpin, and not allowing drop implementations to compile for Box<Trait> would be a huge breaking change since it would have a lot of false positives. That's why I think a solution that fails at runtime is more realistic. Another alternative would be to require a ?Unpin bound on trait objects that might not implement Unpin; this wouldn't have runtime failure and wouldn't have false positives. However, I get the sense that the Rust team is pretty opposed to new traits of this sort).

Edit: Actually, scratch all the above: we only really need to worry about accessible fields in destructors, not &mut access generally, correct? Because we're only worried about implicit field projections from &mut self.

I may just be reinventing the !DynSized proposal, but essentially: generic containers must already themselves be !Unpin if they expose any methods that care about the pin typestate (I realize this sounds suspiciously like the incorrect parametricity argument, but hear me out!). Existentials like Box<Trait> and &mut Trait don't necessarily reflect their Unpin status, but (at least from staring at the current APIs?) I don't think Pin<Box<T>> and Pin<&mut T> necessarily have to be coercible to Pin<T> where T: !Unpin; not implementing that would mean pinned references to these types won't safely provide pinned access to their interior (note that there is some precedent for this with the relationship of &mut and &: &mut &T is not coercible to &mut T, only &T, and Box<&mut T> is not coercible to Box<T>, only &mut T; in general, when different typestates collide they don't have to be automatically propagated). I do recognize that usually &mut T, Box<T> and T are considered totally interchangeable from a typestate perspective, and this argument doesn't extend to hypothetical inline existentials, but maybe this is the substance of the DynSize proposal (don't allow safe swaps or mem::replaces for trait object values, I guess? Actually, it's already disallowed for them... but I'm assuming there's some reason why this might change in the future). That makes a compile time solution very straightforward: for any structure which (transitively) directly owns (no &mut, &, Box, or raw pointers--none of which would safely transitively propagate Pin access, except & for shared pin, but & can't be moved out of anyway--or if you decided to go with the "can't move out of trait objects" solution, you could also check fields transitively along &mut and Box I think) and has access (in a visibility sense) to a known !Unpin type (including itself), it would have to implement the second kind of drop. That seems to me to be totally fine and not be a backwards compatibility footgun at all, since no stable types are !Unpin--people might have to reimplement destructors after updating a library, but that's all, right? Moreover, internal implementation details would stay internal; only if a !Unpin field were exposed would there be any issues. Finally, all generic container types (not just Vec and standard library stuff, but essentially the entire crates.io ecosystem) would continue to work fine. What is the catastrophic thing I am missing about this solution?

(In fact, it seems to me that even if you couldn't enforce this at drop definition time, you should at least be able to do so at type instantiation time, like dropck does, since you only need to worry about fully instantiated types).

Rereading the DynSized proposal: I observe that an argument against it was requiring immovable types to always be DynSized even before they are pinned. I argue above that we really only need to worry about this in the case of trait objects; it might be possible (although hard to justify) to enforce that coercing a !Unpin type to a trait required explicitly bounding it with + ?DynSized (or whatever; it could be done automatically, I suppose). While there may well be many cases where the size must be known for !Unpin types (indeed, I have such a use case!), or they must be able to be swapped before they are pinned, I hope there are very few such cases for the interiors of trait objects made from immovable types (the only cases we have now are for stuff like the Box -> Rc conversion that we want to explicitly forbid, right? It's actually not even clear to me that size_of_val being exposed is really an issue here or not, since I still don't know whether you are expected to be able to turn Pin<'a, Box<Trait>> into Pin<'a, Trait>, but if you can't we can just rely on Sized of course). One really wants to be able to bound them with !Unpin in any case, but as I said, I think people want to avoid adding more negative traits than we already need (personally, I hope that !Unpin trait objects are going to be sufficiently rare and specialized that bounding trait objects with ?Unpin rather than Unpin would be totally reasonable and not infect the type system too much; most of the uses for !Unpin I can think of wouldn't have useful implementations for most existing traits, since they would generally want to perform actions on Pin<Self>. You'd also usually want to use them with PinBox or PinTypedArena or whatever, at which point ?Unpin bounds look pretty natural).

pythonesque commented 6 years ago

I have a new design, which I think isn't nearly as awful as the old one: https://github.com/pythonesque/pintrusive/blob/master/src/main.rs. Instead of trying to make pin projections work everywhere, this design asks how we can interoperate with "legacy" Rust code that doesn't know anything about pinning, from "modern" Rust code that always wants to support pinning. The obvious answer seems to be using the PinDrop trait @RalfJung proposed, but only for types that both have a custom drop and want to project fields.

Types explicitly opt into projecting fields (corresponding to deriving the PinFields trait), which is analogous to code written in "modern" Rust; however, it makes no additional requirements on code in "legacy" Rust, instead opting to only support projecting to depth 1 for any given derivation of PinFields. It also does not attempt to move Pin through references, which I believe is probably a good idea not to do anyway. It supports structures and enums, including any disjointness analysis Rust should be able to provide, by generating a structure with identical fields and variants, but with the types replaced with Pin'd references to the types (it should be trivial to extend this to Pin and PinMut when that change is made). Obviously this is not ideal (though hopefully the optimizer can get rid of most of it) but it has the advantage that it works with borrowck and NLL and easily works with enums (as opposed to generated accessors).

The safety argument works by ensuring that either Drop is not implemented for the structure at all, or ensuring that if Drop is implemented for it, it is a trivial implementation that only calls PinDrop (a version of Drop that takes Pin). I believe this rules out all the soundness issues with projecting pinned fields, with one question mark: my main concern is finding a good argument for why disjoint fields on the exact same container (that is, fields at depth 1) that could invalidate each other's pins in their destructors, would already be invalid without pin projections. I think I can justify this if we can show that you could also do the same thing if they were held in separate PinBoxes, which implies that where they live is part of their safety contract; that means their destructors aren't safe in isolation and constructing them outside the module would require unsafe code. In other words, their correctness hinges upon the implementation of the container type, which means that it should be okay to ask more of its destructor than we would for arbitrary safe code.
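The shape of the generated code, reconstructed from the description above (names illustrative; see the linked repository for the real macro output):

// A user type deriving PinFields:
struct State; // stand-in: imagine State: !Unpin

struct Parser {
    state: State,
    buffer: Vec<u8>,
}

// The derive emits a mirror type with identical fields, each one a
// pinned reference, so match, borrowck, and NLL disjointness all apply:
struct ParserPin<'a> {
    state: Pin<'a, State>,
    buffer: Pin<'a, Vec<u8>>,
}

// Safety then hinges on Parser having either no Drop impl at all, or
// only a trivial one that delegates to PinDrop (drop via Pin<Self>).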