Closed nikomatsakis closed 2 years ago
That makes me very uncomfortable. You have an object of the correct size, and you're accessing it through a pointer which is safely derived.
Hm; that implies either a much stricter handling of pointers in unsafe {}
than the C model, or that this assumption must be disabled at function scope for functions containing unsafe {}
(perhaps even transitively).
In particular, in C, it's defined so long as the (offset) pointer is still part of the same "object" (read: allocation), while this model basically introduces a subtyping relation on "objects" (in the C sense).
One pattern this would break hard is container_of()
- a pattern I suspect is used in implementing intrusive collections in Rust, not just C.
container_of()
not only makes use of offset-pointer-inside-same-allocation being valid, it does so wiith a negative offset.
I use a variant of container_of
in intrusive-collections.
This code which does not cross function boundaries does a very similar thing:
let mut s = Struct { x: 0, y: 1, z: 2 };
let p: *mut i32 = &mut s.x;
*p.offset(1) += 1;
And yet I feel differently at fn boundaries. I think this is because I think about functions in terms of permissions. Having a safe function like this:
pub fn foo(p: &mut i32) {
let q: *mut i32 = p;
*q.offset(1) += 1;
}
seems completely and obviously wrong. It exceeds the permissions that was given to it (which applied to *p
, not *(p+1)
, and hence (in my mind) is an illegal function.
I think before we are going to be able to settle any of these kinds of questions, we have to lay out the higher-level views that we are coming from and try to reach some agreement there.
I don't think function boundaries should do anything (beyond create an extra scope), for UB purposes. If people are pulling things out into functions, their code must not become undefined, IMHO.
I imagine this concerns only repr(Rust)
given this repository and that it'd break C stuff.
Could these container_of
style applications be dealt with by borrowing &x
along with &x.b
? I'm thinking along the lines of owning_ref
or maybe some phantom &x
?
I'll note that I want this to be enabled - I'm just not sure how viable it is given existing code.
@Amanieu: In the specific case of intrusive containers, how viable would it be to, instead of having Adaptor<Link>
use an associated type for Container
, instead be implemented on Self = Container
? In that case, get_container
simply becomes self
. As far as I can tell, that looks workable (I've only looked at Cursor
though).
@eternaleye That doesn't work since a Container
may contain an arbitrary number of links Link
s. This is the case when a single object is a member of multiple intrusive collections.
After looking through my code a bit, I don't think it will be affected by this since it uses *const Link
everywhere instead of &Link
. However it is treading a very fine line.
I would argue this is yet another instance of how closely we "trust types", just with opposite "polarity". In the blog post, Niko discussed whether inside the body of a function that takes v: &usize
as argument, that function can trust the target of v
not to be mutated even by unknown code. Here, we discuss whether the caller of such a function that just takes a v: &mut usize
as argument can rely on the function not to touch any memory except for globally reachable memory and the memory range indicated by value and type of its argument. In some sense, these are two sides of the same medal. (In fact, if we translate the function type to a separation logic Hoare triple, the answer to both questions would be "Yes, you can trust": A separation logic triple would give the function the permission to access v
and rely on no conflicting accesses to happen, while also promising to the context that nothing else will be needed or touched by the function. This is called the "frame rule".)
A function that takes a v: &mut usize
and accesses more than just the 4/8 bytes covered by the usize
itself, is very much like a function that takes a raw pointer and turns it into a Box<T>
: The function actually relies on more than is guaranteed by the type system (i.e., by the general contract that covers all safe interfaces). The function is just inherently unsafe to call, and should be marked as such.
It seems to me that whether the compiler is allowed to rely on this assumption ("functions that need more than is given by their arguments are unsafe") for optimizations again depends on whether this is an "unsafe context" or not -- generally, we want such optimizations as there is some potential for producing faster code, but for people writing unsafe code this may be a very dangerous prospect.
I am pretty sure that this is not allowed in C -- see e.g. this formalization of the C object memory model, which clearly forbids the original code.
In Rust, Stacked Borrows proposes to forbid this. Some related issues:
I will close this issue since the above seem to cover the relevant aspects.
It is a very powerful optimization for us to assume that the function
foo
here cannot accessa
orc
:@pcwalton reports that by default in C or C++ -- or at least in LLVM -- it would be legal to have a function like this:
and then to invoke it on
foo(&x.b)
in a similar scenario.Telling LLVM to assume that Rust functions don't play such games appears likely to have a significant impact on our ability to optimize. I'll report some numbers when @pcwalton is more sure of them. :)