Open RalfJung opened 1 year ago
We could attempt to allow code like the above by somehow having a notion of "end of scope" for a mutable reference, and having its no-alias requirements end completely at that point.
I think this is what people naively expect. And if we had this property I think it would ease some of the scenarios where we fall back to advising that people not mix references and raw pointers; this is undoubtedly good advice for avoiding UB but it is unpleasant to give up references.
Is that worth it? And how exactly should that work? Each reference remembers the function it was created in, and when that function ends, it ceases to impose aliasing requirements?
That last sentence sounds tantalizing, but I'll admit it is really just because I would like the as_mut_ptr
example to be permitted. Maybe some day I'll dig up examples of people running into this in the wild.
The bigger concern is that this makes the function boundary very special
Yeah, this issue. Should we have a whole separate issue for this? I don't know how much this can be its own discussion; the aliasing optimizations implemented in C/C++ compilers seem deeply tied to function boundaries and I don't know if that is fundamental or legacy from C.
I am generally concerned about tying UB too much to function boundaries; I think it would be deeply surprising if during refactoring I found I eliminated all but one caller to a helper function and decided to manually inline it into its caller so that I could apply some simplification... then found that I had added UB. But on the other hand, I think the very existence of protectors indicates we already have the opposite refactoring hazard; adding a helper function can introduce UB. That seems more dangerous, and yet I have only a handful of examples of deallocating against a protector: https://miri.saethlin.dev/logs/alloc-stdlib/0.2.2 https://miri.saethlin.dev/logs/tinyset/0.4.15 https://miri.saethlin.dev/logs/usync/0.2.1 https://miri.saethlin.dev/logs/hamt-rs/0.3.0
Each reference remembers the function it was created in, and when that function ends, it ceases to impose aliasing requirements?
Ah right, under that variant naive inlining becomes unsound. That's certainly not tenable. We would at least need some way to still have the "end of scope" happen after inlining.
That last sentence sounds tantalizing, but I'll admit it is really just because I would like the as_mut_ptr example to be permitted. Maybe some day I'll dig up examples of people running into this in the wild.
We're getting the first concrete requests for optimization that even come with concrete LLVM proposals that are incompatible with as_mut_ptr
. So maybe the right answer is to start finding a way to separate functions like as_mut_ptr
from the majority that is completely fine with "full" mutable reference uniqueness.
the aliasing optimizations implemented in C/C++ compilers seem deeply tied to function boundaries and I don't know if that is fundamental or legacy from C.
C has restrict
, which is indeed tied to function boundaries. I don't think C++ has anything like this.
LLVM has noalias infrastructure that I haven't understood in detail yet. But I think it boils down to statically marking each load/store as being within some "noalias scope". The regular function-argument-level noalias is then simply equivalent to putting all loads/stores inside the function into its scope, but smaller scopes can be had within a function. In Rust this would correspond, for each &mut
, to statically determining which loads/stores are "in the scope" of that borrow, and mark them appropriately. What SB and TB are doing is to dynamically determine that scope to last until "the last time that reference (or pointers derived from it) is used". But a noalias scope in LLVM cannot leave the function it starts in (I think).
However LLVM is also getting new noalias infrastructure and I know even less about how that works.^^
But no other language I know permanently remembers the full tree of how pointers get derived even as the control flow moves back up the call stack. When it's all safe code I think it makes perfect sense to keep that tree, but for raw pointers... well really it's just functions like as_mut_ptr
where this makes me uncomfortable and really those functions should just tell the compiler "don't create fresh references here".
well really it's just functions like as_mut_ptr where this makes me uncomfortable and really those functions should just tell the compiler "don't create fresh references here".
How much would be solved by as_mut_ptr
being autoref-compatibly defined as fn(*mut self) -> *mut T
instead of fn(&mut self) -> *mut T
, with autoref doing an addr_of_mut!
?
If we can solve most of the unexpected UB issues with that sort of change to the language and stdlib, I think that's preferable to weakening &mut
as posited in the OP.
I think it would have to still be technically &mut
, for backwards compatibility reasons. But perhaps it could be a weaker &mut
somehow. I have long suspected there are not too many functions that would need this property, aside from slice as_ptr/as_mut_ptr/len
How much would be solved by as_mut_ptr being autoref-compatibly defined as fn(mut self) -> mut T instead of fn(&mut self) -> *mut T, with autoref doing an addr_of_mut!?
AFAIK those kind of functions are the main motivation for not having spurious writes on &mut
.
However the function might look more like
fn as_mut_data_ptr(&mut self) -> *mut Data { addr_of_mut!(self.data) }
and annoyingly the version of that that takes a raw pointer would have to be unsafe.
And there is the question whether we want to force users to write such "raw pointer getters" in this particular style -- that does sound like quite the footgun.
I think it would have to still be technically &mut, for backwards compatibility reasons.
Yeah I guess let f: fn(&mut _) -> _ = <[i32]>::as_mut_ptr
still has to work.
Another way to phrase this would be: under SB and TB, a pointer P is "live" as long as any pointer derived from it is live -- even if the function that created P has long since returned.
To me that seemed like the most obvious way to operationalize what the borrow checker is doing. But it is indeed not how noalias
works in LLVM. OTOH, this definition is much less tied to having some idea of scope or function boundary, and thus may help us when dealing with inlining. So, it's not clear to me that going for a weaker model with some (statically determined) notion of "scope" is a good idea.
We could attempt to allow code like the above by somehow having a notion of "end of scope" for a mutable reference, and having its no-alias requirements end completely at that point.
I think this is what people naively expect.
Here's an example of an optimization that was just brought up in a discussion as "example of an optimization Rust should be able to do one day".
I am fairly sure that in a function like fn nocapture(&mut T)
, when that function returns we want to consider that reference dead. Or, if we can't use the function boundary (it's unclear how to use that in general), then we should use other clues that the lifetime is over, such as a write to the original referent.
Of course, unlike as_mut_ptr
, this function doesn't return anything, so it "obviously" should not extend the lifetime of the pointer. But how obvious is that? If we say that fn(&mut T) -> *mut T
returns a pointer that allows arbitrary aliasing with the original referent, then what about fn(&mut T, &mut T) -> *mut T
-- are now both pointers potentially aliased, or only the one that actually gets returned? But what does "returned" mean? What if it is returned in a larger compound value, maybe even behind a pointer indirection? Maybe someone returns a pointer via destination-passing style (out: *mut Ret
arguments), so the function actually has return type ()
. Is that where we draw the line? This is a rabbit hole that I am not sure has a bottom...
OTOH, the current model is very simple: we always track the ancestry of all pointers, and use that to determine conflicts. &mut
creates child nodes, but raw pointers are "clones"/aliases of the references they come from. The only thing that makes this not simple is all the implicit references Rust creates... but I worry that fixing that with bespoke rules for when a function is nocapture
vs when it is as_mut_ptr
is overall going to be more complicated and harder to predict than dealing with these implicit references.
Here's the intuitive model that I held until yesterday when I started reading about stacked borrow semantics. Every reference to a T
owns a lock of sorts on the T
's memory, and the lock is released when the reference is dropped. Any time you read or write a through a pointer to a location that a mutable reference has a lock on, or write through a pointer to a location that a shared reference has a lock on, that's potential UB. The only exception is when you use the pointer in exactly some way that you could have just used a reference and are following all the same rules that the borrow checker enforces on references. Basically, any memory that's either owned on the stack or accessible through a reference, the compiler has expectations about, and it's responsibility of any access performed through a raw pointer not to violate those expectations. But, when memory is only reachable through (non-Unique
) pointers, it's legal to use those pointers in any way that would be legal in C. A *const
pointer can get invalidated if you write to a *mut
pointer which aliases it, but any *mut
pointer which was valid at birth remains valid for as long as the memory which it points to remains allocated.
The stacked/treed borrow rules screw up this intuitive model by making it possible for a *mut
to get invalidated by writing through a &mut
, or even by writing through another *mut
which was derived from a &mut
. I think any ruleset which has this property is far too dangerous, and whatever optimizations it might enable are not remotely worth the UB that they're going to cause.
The stacked/treed borrow rules screw up this intuitive model by making it possible for a *mut to get invalidated by writing through a &mut
That's unfortunately fundamentally necessary, so I'm afraid your mental model is a bit too optimistic here.
or even by writing through another *mut which was derived from a &mut
That, too, is fairly fundamental -- avoiding this means we can basically entirely ditch the idea of reference-based optimizations entirely. Consider:
fn foo(x: &mut i32, y: &mut i32) {
let xraw = x as *mut i32;
let yraw = y as *mut i32;
// If the two pointers alias, this is UB: the write to xraw invalidates yraw.
xraw.write(0);
yraw.write(0);
}
This issue is about the fact that Stacked Borrows and Tree Borrows are more restrictive than LLVM noalias
, but the points you rise apply even to LLVM noalias
(and to C restrict
). It is almost certain that code like the above will be UB.
I don't think I understand why the example you've written poses a problem. In C, it's UB if arestrict
pointer is aliased by another restrict
pointer, but it's legal for a non-restrict
pointer to alias a restrict
pointer. Rust has no need for a restrict
keyword, because mutable references are in effect always restrict
: two mutable references with overlapping lifetimes can never alias each other. So this code snippet is the moral equivalent of
void foo(int *restrict x, int *restrict y) {
int *xraw = x;
int *yraw = y;
*xraw = 0;
*yraw = 0;
}
Here if x
aliases y
, that's instant UB, even if the function body is empty. The write to xraw
isn't what's invalidating yraw
, because it was already invalid from birth. C programmers don't need consider the heredity of xraw
and yraw
to reason about a function like this. If yraw
was valid when it was last assigned, it's still valid later, unless something deallocated the memory it points to.
C's rules about restrict
pointers seem sufficient to permit plenty of optimization, even when they have non-restrict pointers aliasing them. For example, I wrote
void foo(int *restrict x, int *restrict y) {
int *xraw = x;
int *yraw = y;
*xraw += 1;
*yraw += 1;
*xraw += 1;
}
and looked at how this is assembled on aarch64. I got just the result I expected:
ldr w8, [x1]
ldr w9, [x0]
add w8, w8, #1
add w9, w9, #2
str w8, [x1]
str w9, [x0]
Clearly clang or LLVM was able to deduce that xraw
doesn't alias yraw
even though they aren't directly declared as restrict
, because otherwise the optimization you see was performed here wouldn't be sound. So I'm not clear on why Rust would need to have any more UB than C does in order to be able to exploit noalias
.
On the other hand, here's a case where Rust could have more UB and I wouldn't find it perverse:
fn foo(x: &mut i32) -> i32 {
let xraw = x as *mut i32;
unsafe { xraw.write(0); }
*x
}
The equivalent in C with int *restrict x
would be fine, and indeed the restrict
would be meaningless since there are no other restrict
arguments. But I'm okay with this Rust program being UB, because it's writing to a pointer which aliases a mut reference while that reference is still alive, and in a way that the borrow checker wouldn't permit if the pointer were replaced by another mut reference. I expect UB out of that.
I don't think I understand why the example you've written poses a problem. In C, it's UB if arestrict pointer is aliased by another restrict pointer, but it's legal for a non-restrict pointer to alias a restrict pointer. Rust has no need for a restrict keyword, because mutable references are in effect always restrict: two mutable references with overlapping lifetimes can never alias each other. So this code snippet is the moral equivalent of
This is an incorrect assessment of the semantics of restrict
. restrict
pointers exclude all other accesses to the objects its used to access (except insofar as read accesses do not exclude other pointers from performing read accesses).
Per N3301 (latest draft for C23),
In respect to defined behaviour, TB mutable references are actually weaker in many (but not all) ways than restrict. restrict
applies until the end of the lexical scope of the pointer it is applied to, mutable references aliasing rules apply until last use (except in a function parameter, where protectors get involved)
Here if x aliases y, that's instant UB, even if the function body is empty
No, that's not correct. restrict
is about the accesses performed through the relevant pointers (and all pointers derived from it). The pointers may be equal as long as the accesses performed through them are disjoint. Furthermore, he accesses may even overlap if they are all read accesses.
I suggest you read the relevant parts of the C standard; it doesn't say what you seem to think it says.
it's legal for a non-restrict pointer to alias a restrict pointer.
This is wrong, too. This example has UB in C if the pointers alias:
void foo(int *restrict x, int *y) {
*x = 1;
*y = 1;
}
I don't think I understand why the example you've written poses a problem.
"problem" in which sense? You don't understand why it is UB, or you don't understand what it has to do with your argument? Your argument was that it is bad that "it possible for a *mut
to get invalidated by writing through [...] another *mut
which was derived from a &mut
". My example shows some code where exactly that happens and yet we unquestionably want that code to be UB. Therefore, it shows that your intuition doesn't hold up even for the most basic aliasing rules.
If you have further questions about why certain examples are UB, I'd ask you to take that to a new thread, so that we can in this thread collect arguments for and against the question stated in the OP, without filling it with misunderstandings about the basic concepts being discussed here. We have a Zulip stream where we're always happy to answer questions, or you can open a new issue here with a concrete question if you prefer Github.
But I'm okay with this Rust program being UB
There is no world in which it is okay for this program to be UB. You can write basically the exact same thing in safe code:
fn foo(x: &mut i32) -> i32 {
let x_notraw = &mut *x;
*x_notraw = 0;
*x
}
Obviously this code is fine, and therefore obviously the code you wrote should also be fine.
I am not sure what your intuition is based on, but it doesn't seem to be anywhere close to how Rust references work, so it is not surprising that your intuition would then clash with Stacked Borrows and Tree Borrows that are closely built on the way Rust references work.
That last sentence sounds tantalizing, but I'll admit it is really just because I would like the as_mut_ptr example to be permitted. Maybe some day I'll dig up examples of people running into this in the wild.
I have now come across a few such examples. For example, this code here uses it and trips up Tree Borrows (although not Stacked Borrows). Another one is this here (since fixed). Unfortunately, these issues in the wild are a bit more complicated, with there being several newtype wrappers and the proper way of fixing it being that in addition to exposing everything as a &mut [T]
everywhere, you also need to preserve ways of getting to the raw pointer. (I guess such code is quite rare, but I just spend several days looking at people breaking TB rules in seemingly reasonable ways that are fixable, but the fixes make the code more ugly, which is a very disappointing state of affairs).
If we only look at slice::as_mut_ptr
, then people learning Rust might look at the method, and see that it is simply a wrapper around a rather ugly cast. Unfortunately, this then introduces aliasing constraints into their code, which can be solved by removing the fancy abstraction and doing the raw casts manually. Unfortunately, that state of affairs is a bit unfortunate; the aliasing model and other restrictions of Rust force you to either make your code be nice and maintainable or make it correct, but not both, and that's not optimal.
Even further, I fear that in general, slice::as_mut_ptr
is a massive footgun because it seems to simply "convert" to a pointer, and because it seems to suggest that one can build useful APIs based on slices. But this is not what it does, and you in general can not use slices anywhere raw pointers are used, because as_mut_ptr
and the act of returning them imposes extra aliasing constraints. I feel like this should at least be mentioned somewhere in the documentation of that method, like so:
Note that the returned pointer is not allowed to alias with the mutable reference to the slice.
As long as the pointer is in use, any other reference not derived from the *result* of
as_mut_ptr should not be used.
Unfortunately, this then introduces aliasing constraints into their code, which can be solved by removing the fancy abstraction and doing the raw casts manually. Unfortunately, that state of affairs is rather silly. I don't want to go around evangelizing people to use a language where the aliasing model combined with other language restrictions makes it impossible to write clean code since you're not allowed to factor things into methods.
I agree with your assessment. I've had the same thoughts for a while, but I think this ship already sailed long long ago. The idea that function parameters are magic and have extra UB is fundamental to deferenceable
and noalias
; if they can insert speculative reads/writes upon function entry, factoring out any code into a helper function can add UB.
What's so frustrating about slice::as_mut_ptr
is that you would really want this method to be written like this:
pub fn as_mut_ptr(*mut self) -> *mut T [ //take a raw pointer
self as *mut [T] as *mut T
}
But sadly, Rust does not let you do it without some unstable feature (and it would probably break some existing code). And in general, in other abstractions that want to give out raw pointers without unnecessarily modifying the provenance, this would not work as simply. Consider
struct DataWithHeader<T: ?Sized>{
header: Foo,
data: T
}
impl <T: ?Sized> DataWithHeader<T> {
pub fn as_mut_ptr(&mut self) -> *mut T {
&mut self.data
}
}
That function is impossible to write when it takes a raw pointer, except if one makes the function unsafe
, or if one uses ptr::wrapping_add
(which loses optimizations..?).
A hacky solution to this is inventing an attribute #[no_retag]
(which, I am sure, people had come up with before), that allows such a functions to be written in safe code, but without treating the references as references in the aliasing model, and without adding noalias
and friends in LLVM. And then you can hopefully add this in enough places (like the Deref
trait when also used from unsafe code, or most as_mut_ptr
functions) that things just work.™ Arguably, for half the methods that currently cause headaches, this is the type people had in mind when they designed them.
So in the end, it all comes back to raw pointer ergonomics. If we want references to have very strict rules, the language should not box (no pun intended) you into using them all the time. What further causes problems is that many crates out there don't add enough "unsafe accessors" to give you raw pointers, instead all you get is a slice
and then it becomes impossible / very tedious to use that dependency in a sound way for your unsafe code. What you would need is a raw pointer, but obtaining that is impossible / very fragile even if you know what you are doing. Which again comes back to raw pointer ergonomics, if there was a AsRawPtr
trait widely used in Rust, this would remind people they need to add something like that.
The following code is fine according to LLVM
noalias
, but rejected by both Stacked Borrows and Tree Borrows:The reason (in Tree Borrows terms) is that when
ptr2
is created, it is considered as a "sibling" pointer toptr1
, and even though theirnoalias
scope is over, the model still remembers that these pointers are not identical and actions on one can disable the other. Put differently,ptr1
andptr2
are derived from mutable references that were created to callas_mut_ptr
, and those references are considered to be live as long as any pointer derived from them is live.This is probably going to be surprising. Is it a problem? Tree Borrows is deliberately stricter than LLVM noalias; we want to model Rust concepts, not just the LLVM attribute, and we are hoping to get benefits even inside functions where LLVM's function-scoped
noalias
cannot be used.We could attempt to allow code like the above by somehow having a notion of "end of scope" for a mutable reference, and having its no-alias requirements end completely at that point. Is that worth it? And how exactly should that work? Each reference remembers the function it was created in, and when that function ends, it ceases to impose aliasing requirements? For functions with a signature like
fn(&mut T) -> &mut U
, we certainly want to keep the no-alias requirement for the return reference around even after the function returned -- but maybe that can rely entirely on the caller doing a retag. The bigger concern is that this makes the function boundary very special (even more so than protectors), and the following code would still be UB:And... that seems okay? In fact I think we want that code to be UB even if
(ptr1, ptr2)
is returned out of the function and used in the caller (example below). That code creates two overlapping&mut
, I see no good reason to allow this code. But if we reject this code, then why would we accept the variant that usesas_mut_ptr
? Is it only because the&mut
is now implicit, an auto-ref? Should auto-refs have weaker semantics than explicit&mut
? That is insufficient, we wantnoalias dereferenceable
for&mut self
arguments. A combination of "auto-ref is somehow very weak" and "function-entry retags have an 'end of scope'" would be sufficient, though the details on what auto-refs do are fuzzy. Maybe they just don't retag at all. (Function-entry retags also don't correspond to an&mut
in the source, so it could be justified that they are somehow weaker.) However I think there are plenty of cases where we want aliasing assumptions even when there was no explicit&mut
, like on the references returned byget_mut
-style functions. Or should we devise something specific to methods likeas_mut_ptr
that suppresses the implicit reborrows (and also suppresses thenoalias dereferenceable
, which we really don't need for these methods)? That seems rather ad-hoc though.I wonder what others think, in particular in terms of which of these examples should be allowed (if any) and which not.