rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.61k stars 12.74k forks source link

Backwards compatibility hazard with `Box` + `&mut` behaviour that cannot be replicated by a pure-library type #14270

Closed huonw closed 9 years ago

huonw commented 10 years ago

With Box built deeply into the compiler, one can currently write

let mut y = 1;
let x: Box<&mut int> = box &mut y;
**x = 2;

That is, one can mutate through a &mut stored in the Box, even if the Box itself is not in a mutable slot. This is (presumably) because the compiler currently understands a lot about Boxs behaviour, in particular, that it can act like an immutable unaliased pointer (&uniq).

There's currently no way to get this behaviour for library types, e.g. Vec<&mut int> is theoretically identical to Box in terms of ownership/aliasability of its contents, but, as it is a library type, the following cannot be made to work:

let mut y = 1;
let x: Vec<&mut int> = vec![&mut y];
// error: can't mutate a `& &mut int`
**x.get(0) = 2;
// error: can't borrow x as mutable
**x.get_mut(0) = 2;

We would need an immutable &uniq pointer for this behaviour to be possible for general simply-owned types like an in-library Box, Vec, HashMap etc.

Having this behaviour for Box represents a possible backwards compatibility hazard if we ever decide to move Box more into libraries. Specifically, moving Box into a library and preserving its current behaviour perfectly would require (at least) adding this new &uniq pointer type.

(Thanks to /u/SteveMcQwark on reddit for prompting this.)

(Nominating.)

alexcrichton commented 10 years ago

Learn something new every day!

pcwalton commented 10 years ago

I never considered it a priority to be able to move Box fully into the library for 1.0, and it's clear that more machinery will be necessary. So I'm not sure this is a hazard.

pnkfelix commented 10 years ago

Assigning P-low. And maybe we should actually close this.

cc: @nikomatsakis

nikomatsakis commented 10 years ago

The real question is whether we will ever support other "unique" pointer types in the library. It is not entirely clear to me what the best resolution is here.

lilyball commented 10 years ago

The issue with Vec<&mut int> is one I hadn't considered before, and is rather unfortunate. &uniq could be leveraged in some fashion to try and fix this, but of course we were hoping to remove &uniq with by-value closures.

Theoretically, borrowck could try and track whether a & reference is actually non-aliased, where a non-aliased & reference is one derived from something else known to be non-aliased (such as a non-borrowed local variable), or derived from an existing non-aliased & reference (including being returned from a function that declares it as having the same lifetime as the non-aliased & parameter). It would have to be able to "downgrade" a non-aliased & to a regular & as well, if the local variable or any "parent" reference is accessed while the non-aliased & is still alive. This would allow *v.get(0) = 2 to work without any changes to the libraries or user code, and similarly should allow Box<&mut T> to work, but is rather complex.

A more limited suggestion would be to have borrowck merely consider the results of Deref<T> to be non-aliased if the dereferenced value is known to be non-aliased. deref() returns a &-reference but that reference is hidden from the client code. A separate attribute can be used to declare that a smart pointer is considered unique, and the two combined would allow Box<&mut T> to be implemented fully in the library. But this approach doesn't fix Vec<&mut T>.

lilyball commented 10 years ago

Actually, thinking about it some more, let x: Box<&mut int> = ..; let y = *x; needs to move the &mut int out of the box and into y, and then the Box destructor needs to know that its value is moved (which right now can be done with zeroing, but AIUI we're eventually going to move to precise tracking). I don't know how a library-implemented Box would know about the movement.

huonw commented 10 years ago

Theoretically it can be reborrowed, i.e. it could be equivalent to let y = &mut **x;. (Also, &mut doesn't have a destructor and so doesn't need zeroing: the relationship between moves and zeroing is just to stop spurious/incorrect destruction of invalid values.)

(Frankly I don't think we need to preserve this behaviour if we do decide to make compiler types and library types equal in this respect, it is almost certainly a very rare edge case and can easily be fixed by adding a mut.)

lilyball commented 10 years ago

@huonw mut doesn't replicate current behavior:

fn main() {
    let mut a = 3i;
    let b = box &mut a;
    let c = &b;
    **b = 4i;
    println!("{}", c);
}

Note that I can still assign to **b despite having an outstanding borrow on b.

More importantly, you can't say let m = &mut **b; due to the outstanding immutable borrow of b, which means that taking a &mut reference is not equivalent to the ability to modify the box.

The question is, is the current behavior worth keeping, or can all clients of the current behavior be made to work with mut and &mut just as well?


Also I'm mildly concerned about the fact that I can mutate the boxed &mut T despite having an outstanding immutable reference to the box. Is there any aliasing issues here? I'm guessing no, because the reference I have points to the unique pointer, and the unique pointer is what I mutated through, so reading the value through the reference does ultimately read through the same pointer used to mutate, but I don't have a complete understanding of LLVM's data dependency aliasing rules.

huonw commented 10 years ago

That just seems like a bug (filed as https://github.com/mozilla/rust/issues/14498 ):

fn main() {
    let mut a = 3i;
    let b = box &mut a;
    let c = &b;
    let d = &***c;
    **b = 4i;
    println!("{} {}", c, d);
}
lilyball commented 10 years ago

Ooh, you can get a &int and still mutate through the box? That's no good. Note that the simpler form is correctly identified by borrowck as an error:

fn main() {
    let mut a = 3i;
    let b = box &mut a;
    let c = &**b;
    **b = 4i;
    println!("{}", c);
}
nikomatsakis commented 10 years ago

@kballard this is #12624 . I am increasingly of the opinion that we should just remove all special treatment of Box<T> from mem-categorizer. This implies that (e.g.) moves out of *x would not be legal. So long as we are not going to permit DSTs to be passed by value, as seems to be the case, then this is fine, we can just write x.unwrap().

nikomatsakis commented 10 years ago

Oh, my mistake, I see it was not #12624. Anyway, looks like it was filed, fixed, and reviewed before I had a chance to get to this e-mail. :)

lilyball commented 10 years ago

@nikomatsakis I'd still like to see the ability to move out of &mut T as long as you move something back in later, although I don't know how that will interact with failure. This could conceivably be used to allow from moving from Box<T> via *x if there's some way to tell the Box not to destruct its content.

nikomatsakis commented 10 years ago

@kballard yes to all of that.

nikomatsakis commented 10 years ago

@kballard except the last part about telling box not to destruct its content. I'm not sure of the point of that. But if we did have that capability to move/replace from &mut, you could do a swap like:

let b = box ...;
let x = &mut *box;
let y = *x; // move out of *x
*x = ...; // replace

(This would basically be the equivalent of swap today)

If we did at some point add the theorized &move pointer, that would be a way to support moves out of boxes with uniform support for DST/non-DST. But I am not a big fan of this pointer type as of yet. I'd like to have a bigger idea how much, if at all, we'll care about this use case.

lilyball commented 10 years ago

@nikomatsakis Yeah if you replace the value it's fine, but if you could move out without replacing and just tell Box not to drop its value, then that would restore the current behavior where you can simply move out of a box.

bstrie commented 9 years ago

I know that this has already had a nomination removed once before, but I'm renominating because I think the reasoning given was hasty. It is still a backcompat hazard to some degree, because if we ever want to move Box into a library, which everyone seems to desire, it will then tie our hands and require us to first implement certain features solely for the sake of maintaining backcompat.

In the meantime, we could free ourselves of that obligation by restricting the behavior of Box to only what the existing language semantics permit for user-defined types. The functionality that exists today could be added backwards-compatibly in the future if we decide to implement the needed features.

It's fine if you want to immediately remove this nomination, just please acknowledge that it effectively amounts to a binding promise regarding future features that will be added to the language.

ftxqxd commented 9 years ago

I’d like to bring up RFC 130, which removes some special casing of Box in the borrow checker for backward-compatibility reasons. That RFC was accepted, and aims to solve the same issue that this one does. I would not expect fixing this issue to break much code at all—Box<&mut int> is probably not used very often, and it’s easy to fix, anyway. So I think it makes sense to resolve this issue before 1.0: it’s a fairly minor change, and it’s part of an existing attempt to make it backwards-compatible to move Box out of the language post-1.0. The way I see it, we have three options:

  1. Keep this behaviour as it is right now, and add some new mechanism (probably &uniq) that might be a major addition to the language post-1.0 to make moving Box out of the library backwards-compatible;
  2. Make a relatively minor change to the language now (disallow mutating the T in any Box<&mut T> when the Box is in an immutable slot), and possibly add a generalised way of doing the same thing for library types post-1.0;
  3. Never move Box out of the library until the next breaking release, which could be years away. I hope this doesn’t happen.
brson commented 9 years ago

1.0 P-backcompat-lang

zwarich commented 9 years ago

I started working on this. Since the mutability checks are in terms of mem categorizations rather than loan paths, this is a bit trickier than I expected, but I think I know how to do it.

aturon commented 9 years ago

Re-nominating. This needs to be triaged again as part of our box story for beta.

pnkfelix commented 9 years ago

Assigning 1.0-beta milestone.

pnkfelix commented 9 years ago

nominating: someone needs to own this issue for the beta.

pnkfelix commented 9 years ago

assigning to self. (We may or may not resolve this by the beta, but we must address whether we are fixing it by then.)

theemathas commented 9 years ago

What is the plan to fix this? DerefMove, Box::into_inner(), or something else?

pnkfelix commented 9 years ago

We plan to change things to require the mut annotation on the binding in the example.

For now its a bug that it is not required. We plan to fix the bug as part of 1.0 polish.

pnkfelix commented 9 years ago

i have a potential patch for this. (It is a kludge/hack that revises the cmt::freely_aliasable function (which is only called in spots that want to check if the input cmt is a precise path that we can thus uniquely write through) so that its result is more fine-grained, i.e. able to report that the cmt in question is not freely-aliasable but is nonetheless not writable (since it is e.g. going through an immutable container like the scenario described in this ticket).

nivkner commented 6 years ago

@pnkfelix the issue isn't fixed, the following code works on rust 1.28

let mut y = 1;
let x: Box<&mut i8> = Box::new(&mut y);
**x = 2;
pnkfelix commented 6 years ago

@nivkner Oh at this point I think its fair to say that we have punted, at least in the NLL work, on trying to avoid special casing of Box

See for example PR #52782

I do think there is talk of eventually allowing users to express similar features in their own library data types.

But I’m not sure that warrants reopening this particular ticket

pnkfelix commented 6 years ago

Though I am curious ... I added a test when I “fixed” this originally ... did the test not actually cover this case, or has it regressed ...

pnkfelix commented 6 years ago

Wow apparently the test I put in that commit didn’t actually include @huonw’s original basic example ... it jumped straight into a lot of other scenarios (that are all admitedly important to test) boy oh boy ...

pnkfelix commented 6 years ago

Wait no, it did include the new diagnostic I added; I Just didn’t see it at first

It was on the fn borrow_in_var_from_var_via_imm_box()

But still, that case has an intervening borrow

pnkfelix commented 6 years ago

Ah okay PR #40841 got rid of the diagnostic, and I didn’t notice (or chose not to mention it) during review of that PR

(Probably should have had a legit regression test focused solely on huon’s example as written.)

Okay my curiosity is satisfied. Still not planning to reopen this ticket. :)