rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0

What about: zero-sized dangling accesses/inbounds-offsets? #93

Open RalfJung opened 5 years ago

RalfJung commented 5 years ago

Is the following code UB or not?

fn main() {
    let mut b = Box::new((((),), 4));
    let x: *mut ((),) = &mut b.0;
    drop(b);
    unsafe {
        // getelementptr inbounds with offset 0 of a dangling pointer
        let x_inner: *mut () = &mut (*x).0;
        // 0-sized access of a dangling pointer
        let _val = *x;
    }
}

On the one hand, x is dangling. On the other hand, doing the same thing with let x: *mut ((),) = 1usize as *mut ((),) would definitely be allowed. Does it make sense to treat dangling pointers as "more dangerous" than integer pointers?
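For concreteness, a minimal sketch of that integer-pointer variant, spelled out purely for illustration:

```rust
fn main() {
    // Same shape as the example above, but the pointer comes from an integer
    // rather than from a deallocated Box.
    let x: *mut ((),) = 1usize as *mut ((),);
    unsafe {
        // zero-offset projection on an integer pointer
        let _x_inner: *mut () = &mut (*x).0;
        // zero-sized access through an integer pointer
        let _val = *x;
    }
}
```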

AFAIK the actual accesses do not get translated to LLVM IR, so we are not constrained by LLVM. But we do emit a getelementptr inbounds, and the rules for that with offset 0 are rather unclear, I would say. @rkruppe do you know anything about this?

Sub-threads / points:

strega-nil commented 5 years ago

I would personally prefer that accesses of a zero-sized type through raw pointers always be valid, and that references only need to be aligned.
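For illustration, a sketch of that distinction, using `[u64; 0]` as a zero-sized type with alignment 8:

```rust
fn main() {
    // `[u64; 0]` is zero-sized but requires alignment 8, so address 1 is
    // both dangling and misaligned for it.
    let p = 1usize as *mut [u64; 0];
    unsafe {
        // Raw-pointer access of a ZST: always valid under this proposal.
        let _val = *p;
        // Taking a reference: under this proposal it would additionally
        // require alignment, so the following line would be UB here.
        // let _r: &[u64; 0] = &*p;
    }
}
```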

RalfJung commented 5 years ago

We don't distinguish between raw-pointer and reference accesses in terms of alignment for non-zero-sized accesses, so why would we for zero-sized accesses?

hanna-kruppe commented 5 years ago

From the LLVM side, it seems rather unambiguous to me that the current GEPi specification does not make an exception for offset 0, and so any GEPi on a dangling pointer results in poison. I don't really see a way to change that without hampering analyses based on GEPi: for sanity, such a rule should apply to runtime offsets as well as compile-time offsets, but then any analysis that wants to draw conclusions from the presence of a GEPi would have to prove that the offset is nonzero at runtime.

One could possibly argue that any pointer is inbounds for ZSTs since it's one past the end of a zero-sized allocation, but I do not believe such a concept exists in LLVM today (and it's unclear to me whether that would be desirable).

How about a different solution? For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

RalfJung commented 5 years ago

I just realized we recently had a related discussion in https://github.com/rust-lang/rust/issues/54857.

so any GEPi on a dangling pointer results in poison

Just to be clear, does "dangling pointer" include 4usize as *mut _, or just pointers that were actually generated by allocation functions known to LLVM and then deallocated?

Cc @eddyb who argued previously that we can rely on such "zero-sized allocations".

For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

This still leaves empty slices, where the offset is only known at run-time.

hanna-kruppe commented 5 years ago

so any GEPi on a dangling pointer results in poison

Just to be clear, does "dangling pointer" include 4usize as *mut _, or just pointers that were actually generated by allocation functions known to LLVM and then deallocated?

Interesting question, I don't know. More precisely the LangRef says the result is poison when "the base pointer is not an in bounds address of an allocated object", so this too runs into the question of zero sized allocations. At the same time, I would be wary of spec-lawyering too much in this context.

For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

This still leaves empty slices, where the offset is only known at run-time.

Ugh, yeah, what a pain.

RalfJung commented 5 years ago

I brought this up on the LLVM-dev list and had a bit of an exchange with one developer over there. They concluded that

I'd argue, after all this discussion at least, use non-inbounds if you do not know you have a valid object (and want to avoid undef and all what it entails).

eddyb commented 5 years ago

How about a different solution? For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

We already do this! It's needed because there are more things with offset 0 than struct-likes: https://github.com/rust-lang/rust/blob/258e3b3a75a0da006cd492307fc46ef605e774ad/src/librustc_codegen_ssa/mir/place.rs#L111-L114

(ugh, why does that kind of linking not work cross-crate, @GitHub?!)

```rust
            // Unions and newtypes only use an offset of 0.
            let llval = if offset.bytes() == 0 {
                self.llval
            } else /*...*/;
            PlaceRef {
                // HACK(eddyb) have to bitcast pointers until LLVM removes pointee types.
                llval: bx.pointercast(llval, bx.cx().type_ptr_to(bx.cx().backend_type(field))),
                // ...
```

RalfJung commented 5 years ago

We can't always do this for all 0-offsets though, like for &v[i] where i == 0.
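A small sketch of the case being referred to (the function and names here are made up for illustration):

```rust
// With a non-zero-sized element type, `&v[i]` goes through a run-time-indexed
// place computation even when `i` happens to be 0, so the projection cannot be
// replaced by a mere pointee-type change the way a statically-zero field
// offset can.
fn first(v: &[i32], i: usize) -> &i32 {
    &v[i]
}
```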

eddyb commented 5 years ago

@RalfJung Right, unless i doesn't matter (size_of::<typeof v[i]>() == 0).

RalfJung commented 5 years ago

Sure. However, (&*NonNull::<[i32; 0]>::dangling().as_ptr() as &[i32])[i] will do a zero-GEPi on a dangling pointer in a way that is not statically detectable.

HeroicKatora commented 5 years ago

Right, unless i doesn't matter (size_of::<typeof v[i]>() == 0).

Would that mean it is safe to conjure a slice of a ZST with any length out of thin air? Would it make this code UB-free? It seems that this has at least some impact on privacy, as it creates references to T from no references to T and could bypass the checks a usual constructor would have for a non-Copy ZST T.

fn arbitrary<'a, T>(length: usize) -> &'a [T] {
    assert!(core::mem::size_of::<T>() == 0);
    unsafe {
        core::slice::from_raw_parts(core::ptr::NonNull::dangling().as_ptr(), length)
    }
}
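For example, a hypothetical use of that function (the type and length here are made up):

```rust
// Calls the `arbitrary` function above to conjure a slice of ten unit values.
let units: &[()] = arbitrary::<()>(10);
assert_eq!(units.len(), 10);
```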
hanna-kruppe commented 5 years ago

Just because something does not immediately cause UB doesn't mean that it is safe in the sense that it composes with other code. A lot of code (even purely safe code, depending on whether we e.g. consider &! to be uninhabited or possible to construct) requires additional invariants to hold for memory safety to be ensured. There are many other examples like this too.

RalfJung commented 5 years ago

@HeroicKatora Beyond what @rkruppe said, see this blog post that explains the difference between the kind of "wrong code" that causes immediate UB, vs the kind of "wrong code" that violates a high-level semantic contract and is therefore guilty of making other code have UB. This even older post is also relevant.

HeroicKatora commented 5 years ago

So yes, it is memory safe in that it does not violate the validity invariant of a [ZST]. But what is slightly interesting is that a zero-sized type cannot expect its safety invariant to be upheld purely by privacy and abstraction, unlike non-zero-sized types, where conjuring instances is not memory safe because already creating guaranteed-valid instances involves privacy. That's slightly odd; or am I mistaken, and there is a way to create valid (not safe) instances of arbitrary types in code that does not exhibit UB?

RalfJung commented 5 years ago

! is a ZST. So if references require the pointee to be valid, using your function with ! (and length > 0) is insta-UB.
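A sketch of that insta-UB case, substituting the stable uninhabited (and zero-sized) type std::convert::Infallible for !:

```rust
use std::convert::Infallible;

// Calls the `arbitrary` function defined above. `Infallible` is uninhabited
// and zero-sized, so the assert passes, yet the non-empty slice claims to
// contain a value of an uninhabited type.
let bad: &[Infallible] = arbitrary::<Infallible>(1);
```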

HeroicKatora commented 5 years ago

Ah, good point. I'm asking due to the str-concat crate which we're thinking about expanding to arbitrary slices. In that case, the input would already be a &[T]. The actual code would be slightly different:

fn arbitrary_blowup<'a, T>(slice: &'a [T], length: usize) -> &'a [T] {
    assert!(core::mem::size_of::<T>() == 0);
    assert!(!slice.is_empty());
    unsafe {
        core::slice::from_raw_parts(slice.as_ptr(), length)
    }
}

In the spirit of this code, but maybe part of another issue, is &[!] also uninhabited? The slice of length 0 would not contain any instance of ! and could be inhabited (your comment suggests this but is not explicit).

RalfJung commented 5 years ago

is &[!] also uninhabited

No. It is inhabited by &[] in safe code, as you indicated.

arbitrary_blowup will indeed never cause insta-UB. It exploits that there are only two possible validity invariants for ZST, as there is no data the invariant could depend on: either the invariant is trivially satisfied (such as the one of ()) or it is trivially unsatisfied (such as the one of !).

However, as discussed above, arbitrary_blowup is not a safe function as it can be used to copy non-Copy ZST. (Note that ! is Copy!)

HeroicKatora commented 5 years ago

However, as discussed above, arbitrary_blowup is not a safe function as it can be used to copy non-Copy ZST. (Note that ! is Copy!)

Why would Copy be involved? arbitrary_blowup does not create any instances, and &T is Copy. It would also never lead to additional Drop::drop calls, for the same reason. (But for an &'a mut [T] variant I might see a point in it being much less safe, since the references involved are not Copy.)

RalfJung commented 5 years ago

Let me put it the other way: arbitrary_blowup is legal if you add a T: Copy bound.

There is no such thing as "creating an instance" of a ZST, but increasing the size of a ZST slice from 1 to 2 is effectively "duplicating" the first element -- aka doing a copy.
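Spelled out, the bounded version being suggested might look like this (a sketch; the name arbitrary_blowup_copy is made up):

```rust
fn arbitrary_blowup_copy<'a, T: Copy>(slice: &'a [T], length: usize) -> &'a [T] {
    assert!(core::mem::size_of::<T>() == 0);
    assert!(!slice.is_empty());
    // With `T: Copy`, "duplicating" elements by growing the length is
    // claimed above to be legal.
    unsafe { core::slice::from_raw_parts(slice.as_ptr(), length) }
}
```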

HeroicKatora commented 5 years ago

Now I'm confused. All references obtained from the slice are indistinguishable. Non-mutable references are Copy. I'm not duplicating elements but only references to elements, since it does not involve a value of the slice type but only a reference to such a slice. cc @joshlf since zerocopy also creates references to non-Copy types from thin air.

oberien commented 5 years ago

@HeroicKatora A ZST could have a Drop implementation. By creating new elements out of thin air, you add additional Drop implementations. For example a ZST could be used to safely cleanup a static on drop (for whatever reason). By creating a second instance, that code could become UB.

Edit: Adding a T: Copy bound also ensures that the type doesn't have a Drop impl. Edit2: See the comment below, this is wrong.

HeroicKatora commented 5 years ago

It doesn't lead to any new Drop::drop calls; dropping the reference to the created slice is a pure no-op. No ownership of any Ts is involved, and drop requires the T to be owned (rather, you must safely guarantee to be semantically able to create a value of T, but for the sake of ptr::drop_in_place and owning DST types you need not actually do this). If you were to read one of the Ts from the slice, sure, you create a new instance, but the read itself requires Copy, since you cannot move out of the reference.

Edit: sorry, this was a mess at first. I deleted the comment and added a new for anyone following by mail.

Lokathor commented 5 years ago

I also agree that I don't understand the need for T:Copy.

If this were an array then yes, you'd need to not make them up out of nowhere, but since this is a shared slice, adding more length to the slice doesn't do anything, because when the slice drops it doesn't cause the elements to drop.

Edit: Clarification, I mean in the case of concat for shared slices. Obviously you can't go making up any length that you want for any type that you want, but if you're concatting slices then the only way that you'd get a non-zero length input for one of the slices being concat is if someone gave you an inhabited type or already did UB themselves.

RalfJung commented 5 years ago

Hm I see, I think you are right. A shared slice is just many shared references and having one more makes no difference. I should try to prove that in lambda-rust one day.

For &mut this would not work, because of mem::swap.

But anyway, this is all off-topic for a discussion that is solely about offsetting. I should probably tell GitHub to hide all the posts here...

@HeroicKatora next time you have a new question triggered by discussion inside a thread that is not directly related to the thread's topic, please open a new issue!

gnzlbg commented 5 years ago

I’m not sure I follow: arbitrary_blowup gets a reference to a slice and increases the size of the slice, returning a reference to it. AFAICT this creates a reference to values that do not exist.

If reference validity is not transitive, then it doesn’t matter whether what the reference points to exists or not. But if reference validity is transitive, then those new ZSTs at the end of the slice are required to exist somewhere. I don’t see how creating a reference to a ZST creates the ZST itself.

Lokathor commented 5 years ago

Forked to its own issue: https://github.com/rust-lang/unsafe-code-guidelines/issues/168

RalfJung commented 4 years ago

As of recently, the semantics in Miri are as follows: let ptr be the input to the offset operation and ptr_off the output. Then the operation is considered "inbounds" if the memory range ptr..ptr_off (for non-negative offsets) or ptr_off..ptr (for negative offsets) is "valid for memory access".

This reduces the "offset" question to "is this memory range valid for access". For empty ranges of integer pointers, the answer to that currently is "only if the ptr is non-NULL". For dangling non-integer pointers (i.e., for pointers with proper provenance that indicates that the allocation has been deallocated), the answer should always be "no".
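A sketch of what those rules mean for the two kinds of dangling pointers (illustrative code, not an actual Miri test):

```rust
fn main() {
    unsafe {
        // Integer pointer, non-null: the empty range 4..4 counts as valid
        // for access, so a zero-sized offset is accepted under these rules.
        let p = 4usize as *mut u8;
        let _q = p.offset(0);

        // Pointer whose provenance refers to a deallocated allocation: even
        // the empty range is not considered valid for access, so this zero
        // offset is UB under the rules described above.
        let b = Box::into_raw(Box::new(0u8));
        drop(Box::from_raw(b));
        let _r = b.offset(0);
    }
}
```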

RalfJung commented 2 years ago

See https://github.com/model-checking/kani/issues/1489 for a discussion of zero-sized accesses via memcmp, and some references to parts of the C standard that might or might not disallow them.

That discussion is a great argument for not relaxing our rules further, i.e., for still rejecting zero-sized accesses on pointers whose provenance demonstrates that they are dangling / out-of-bounds. We seem to already be teetering on the edge of what is allowed by the C spec for memcmp (and other, similar functions). That will make @chorman0773 happy. ;)

RalfJung commented 1 year ago

Actually I think we are essentially forced to allow zero-sized accesses via any pointer (with the one possible exception of null):

  1. We definitely want to allow them for NonNull::dangling(). These pointers have no provenance.
  2. We really want the general property that if some program is well-defined, then adding more provenance to a byte that doesn't have provenance yet maintains well-definedness.

It follows that NonNull::dangling() equipped with arbitrary provenance (including that of a dead allocation) must still be allowed.

Why do we want point 2? This is that "monotonicity property" that @digama0 and I talked a lot about last year. Basically we want it to be legal to remove the following code:

ptr.write(ptr.read());

Now imagine ptr is a *mut usize. Then ptr.write(ptr.read()) is not a NOP: this will load 8 bytes from memory, cause UB if any of them is uninit, strip their provenance, and then store the result back. Removing this roundtrip thus needs some justification.
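For reference, the round-trip in question written out as a function (a minimal sketch):

```rust
unsafe fn roundtrip(ptr: *mut usize) {
    // Not a NOP as-is: this loads 8 bytes (on a 64-bit target), is UB if any
    // byte is uninit, strips their provenance by reading them at integer
    // type, and stores the resulting integer back.
    ptr.write(ptr.read());
}
```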

chorman0773 commented 1 year ago

The solution to that is that NonNull::dangling() (and ptr::invalid()) doesn't return a pointer with no provenance, but a pointer with what I call null provenance: Valid for accesses of size 0.

RalfJung commented 1 year ago

That's just unnecessary complexity IMO. Also, something is very off when there are pointers that are even less valid than "invalid" pointers (namely those produced by transmuting an integer to a ptr).

RalfJung commented 1 year ago

Maybe I should have phrased this differently: I admit that there are other solutions for this than the one I proposed. What you said is also a solution. But why should we take the more complex solution when there is a simpler one?

I previously agreed with your statement that the current semantics (allowing accesses for no-provenance pointers but not with dangling provenance) means fewer special cases. I no longer agree with that. I realized that implementing the change I am proposing in Miri is trivial, and it just moves an existing special case to a new place, it doesn't create new special cases.

Miri diff

```diff
diff --git a/compiler/rustc_const_eval/src/interpret/memory.rs b/compiler/rustc_const_eval/src/interpret/memory.rs
index f4e03ad8c59..46490c26913 100644
--- a/compiler/rustc_const_eval/src/interpret/memory.rs
+++ b/compiler/rustc_const_eval/src/interpret/memory.rs
@@ -404,20 +404,19 @@ fn check_and_deref_ptr(
             M::ProvenanceExtra,
         ) -> InterpResult<'tcx, (Size, Align, T)>,
     ) -> InterpResult<'tcx, Option<T>> {
-        Ok(match self.ptr_try_get_alloc_id(ptr) {
-            Err(addr) => {
-                // We couldn't get a proper allocation. This is only okay if the access size is 0,
-                // and the address is not null.
-                if size.bytes() > 0 || addr == 0 {
-                    throw_ub!(DanglingIntPointer(addr, msg));
-                }
-                // Must be aligned.
-                if check.should_check() {
-                    self.check_offset_align(addr, align, check)?;
-                }
-                None
+        if size.bytes() == 0 {
+            let addr = ptr.addr().bytes();
+            // Zero-sized accesses are okay no matter the provenance. Only alignment and null still
+            // needs to be checked.
+            if addr == 0 {
+                throw_ub!(DanglingIntPointer(addr, msg));
+            }
+            if check.should_check() {
+                self.check_offset_align(addr, align, check)?;
             }
-            Ok((alloc_id, offset, prov)) => {
+            return Ok(None);
+        }
+        let (alloc_id, offset, prov) = self.ptr_get_alloc_id(ptr)?;
         let (alloc_size, alloc_align, ret_val) = alloc_size(alloc_id, offset, prov)?;
         // Test bounds. This also ensures non-null.
         // It is sufficient to check this for the end pointer. Also check for overflow!
@@ -449,11 +448,7 @@ fn check_and_deref_ptr(
             }
         }
-                // We can still be zero-sized in this branch, in which case we have to
-                // return `None`.
-                if size.bytes() == 0 { None } else { Some(ret_val) }
-            }
-        })
+        Ok(Some(ret_val))
     }

     fn check_offset_align(
```
chorman0773 commented 1 year ago

I think the main special cases are in lowering, particularly with .offset.

RalfJung commented 1 year ago

This entire idea presupposes that LLVM has a getelementptr nowrap operation. Then lowering offset is trivial, since offset(0) never wraps. (Alternatively if LLVM decides that getelementptr inbounds is always allowed for offset 0 -- generalizing the existing special case that allows adding 0 to null -- that would also be sufficient.)

gereeter commented 1 year ago

Also something is very off when there are pointers that are even less valid than "invalid" pointers

We do already have pointers less valid than invalid pointers, and the terminology is confusing, because core::ptr::invalid produces pointers that satisfy the pointer validity invariant. Uninitialized pointers are less valid than that. Making invalid return a null provenance instead of no provenance wouldn't complicate things much more than they already are, and there is a simplicity to requiring all pointers to have provenance and being able to do a form of type-based alias analysis around that.

To be clear, I'm not sure if making int-to-ptr transmutes insta-UB is at all on the table, and even if so, I don't think it is worth it at this point in Rust's development. It doesn't seem to buy us much (as long as LLVM gets GEP nowrap or loosens rules) and it does run the risk of making existing, well-reasoned code be UB.

saethlin commented 1 year ago

I think what you're pointing out is "just" a terminology issue, and therefore of a different kind than the no-provenance vs. provenance-for-0-bytes question. It's also a terminology issue I have complained about in the past, and I think we should solve it, but solving it wouldn't straighten out the question that this issue is about.

RalfJung commented 1 year ago

In my view uninitialized pointers are not a thing. The set of pointers is defined by the MiniRust Pointer type and none of them is uninitialized. Trying to load a pointer from uninit memory is UB, but that just shows that the domain of abstract byte sequences is larger than the domain of pointers. (Similarly, many byte sequences are not a valid boolean, but still the set of booleans has just 2 elements: true and false. There is no "uninit boolean".)

I agree if you consider the domain of byte sequences, then indeed the name ptr::invalid is somewhat confusing.

there is a simplicity to requiring all pointers to have provenance and being able to do a form of type-based alias analysis around that.

I don't think things are actually becoming any simpler. In MiniRust terms, we'd have

struct Pointer(Address, Provenance);

(instead of the current Option<Provenance>), but Provenance would now internally be defined as something like Option<AllocId> since it needs to contain a "null provenance".
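Spelling out the two representations being compared (a sketch with placeholder types, not actual MiniRust code):

```rust
// Placeholder types for the sketch (simplifying Provenance to just an AllocId).
type Address = u64;
type AllocId = u64;

// Current shape: provenance is optional on the pointer itself.
struct PtrCurrent {
    addr: Address,
    provenance: Option<AllocId>,
}

// The alternative discussed above: every pointer carries a provenance, but
// the provenance has its own "null" case, so the Option just moves inward.
enum ProvenanceAlt {
    Null,            // valid only for zero-sized accesses
    Alloc(AllocId),
}
struct PtrAlt {
    addr: Address,
    provenance: ProvenanceAlt,
}
```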

gereeter commented 1 year ago

I think what you're pointing out is "just" a terminology issue

I was more intending the terminology issue as a vehicle for saying that no matter which way you think of things, we don't have to add some intermediate pointer that is actually a pointer yet less valid than core::ptr::invalid; we can instead declare int-to-ptr transmutes UB. Sorry for the lack of clarity.

the domain of abstract byte sequences is larger than the domain of pointers.

This is definitely a nicer way of thinking about things.

I don't think things are actually becoming any simpler.

Representation-wise, you're definitely right, and "simpler" is perhaps not the right word. I more meant that being able to look at the state of memory and clearly identify all the pointers is a nice property to have, and might be convenient for some optimizations. (But then zero-sized provenance is mostly moot for alias analysis anyway, so maybe this doesn't matter.) And it isn't any more complicated, either.


(I don't necessarily support this change so much as worry it was dismissed too quickly, since in my view it at least isn't worse in terms of explanatory and implementation complexity.)

CAD97 commented 1 year ago

we don't have to add some intermediate pointer that is actually a pointer yet less valid than core::ptr::invalid; we can instead declare int-to-ptr transmutes UB. Sorry for the lack of clarity.

This isn't the contentious case, though. The contentious case is instead the deallocation of pointers, and a zero-sized access after deallocation, e.g.

let layout = Layout::new::<u64>();
let ptr = alloc(layout);
if ptr.is_null() { return; }
dealloc(ptr, layout);
let r: &() = &*(ptr as *mut ()); // or `ptr.byte_offset(0)` equivalent

It's this deallocated provenance which is potentially less valid than the no/null provenance you get from ptr::invalid, completely unrelated to transmutes; presumably this access is allowed for ptr::invalid.

So, actually, I'd say the actual alternative would be treating ptr::invalid the same as deallocated pointers, considering zero-sized offset/access UB for it as well. We'd then have some ptr::zst_alloc(1) which acts more like 1 as *mut (), allowing zero-sized offset/access.

For clarity, let me name a couple of test and production cases. The first two tests and the first production case are what the OP lays out, but acknowledging the other cases, I think, helps clarify what we're actually trying to model. [playground]

Test Cases

#### inbounds zero-byte offset

```rust
#[repr(transparent)]
struct S(());
let _x: *mut () = addr_of_mut!((*(ptr as *mut S)).0);
```

Consider also `intrinsics::offset`/`<*mut _>::offset` and array-index-place-computation.

#### inbounds zero-byte access

```rust
let _x: () = *(ptr as *mut ());
```

#### outbounds zero-byte offset

```rust
let ptr = ptr.wrapping_byte_offset(isize::MAX);
let _x: *mut () = addr_of_mut!((*(ptr as *mut S)).0);
```

#### outbounds zero-byte access

```rust
let ptr = ptr.wrapping_byte_offset(isize::MAX);
let _x: () = *(ptr as *mut ());
```
Production Cases

#### real deallocated

```rust
let layout = Layout::new::<u64>();
let ptr = alloc(layout);
dealloc(ptr, layout);
ptr as *mut ()
```

#### zero deallocated

```rust
let layout = Layout::new::<()>();
let Ok(ptr) = Global.allocate(layout) else { handle_alloc_error(layout) };
Global.deallocate(ptr.cast(), layout);
ptr as *mut ()
```

#### real/zero allocated

For completeness, consider the previous cases but without the deallocation.

#### `ptr::from_exposed_addr`

```rust
ptr::from_exposed_addr_mut(3)
```

Currently implemented as an int2ptr cast.

#### `ptr::invalid`

```rust
ptr::invalid_mut(3)
```

Currently implemented as an int2ptr transmute.

#### int2ptr cast

```rust
3 as *mut ()
```

#### int2ptr transmute

```rust
transmute(3usize)
```

#### assume alloc

```rust
ptr::assume_alloc(3)
```

Not a thing currently, but discussed as a way to purposely say "trust me, there's something here" for bare metal.

The matrices for the three different models that I understand as having been proposed are below. If you don't think this accurately reflects all of the discussed options, please do name a new option and give its matrix.

As implemented by Miri (2023-02-36 d962ea5)

real alloc real dealloc zero alloc zero dealloc ptr from exposed addr ptr invalid int2ptr cast int2ptr transmute assume alloc
inbounds offset ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ -
inbounds access ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ -
outbounds offset ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ -
outbounds access ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ -

Note: there is no MIR for pure place computation, which even causes unsafeck to overlook unused place computations; as such, let _ = ... will spuriously succeed, so the tests use let _x instead to force the computation to actually happen.

Zero is always acceptable

|  | real alloc | real dealloc | zero alloc | zero dealloc | ptr from exposed addr | ptr invalid | int2ptr cast | int2ptr transmute | assume alloc |
|---|---|---|---|---|---|---|---|---|---|
| inbounds offset | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| inbounds access | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| outbounds offset | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| outbounds access | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

Invalid means invalid

real alloc real dealloc zero alloc zero dealloc ptr from exposed addr ptr invalid int2ptr cast int2ptr transmute assume alloc
inbounds offset ✔️ ✔️ 〰️ 〰️ ✔️
inbounds access ✔️ ✔️ 〰️ 〰️ ✔️
outbounds offset
outbounds access

- ❔ depends; could act like alloc or like fake_alloc
- ❓ very confusing; "just access the hardware address" plus a non-hardware-present operation
- 〰️ assuming that valid provenance has been exposed here

I think making ZST allocations "actual" and thus access-after-dealloc UB could make sense for consistency under this model. If an Allocator tracks an actual memory block for ZSTs (not currently forbidden), then it has an actual memory block to track for that allocated object. On the other hand, this feels needlessly antagonistic (see also the linked wg-alloc issue). If the Allocator does treat zero-sized allocations as real allocations, then that maybe should be baked into the global allocation interface as well; it depends also on whether GlobalAlloc is intended to be the #[global_allocator] interface forever, or if Allocator is intended to supersede it at some point.

RalfJung commented 1 year ago

Note: there is currently no MIR for ZST accesses, and thus Miri doesn't have the capability to treat them as UB.

This is not accurate, Miri finds the UB in this example.

CAD97 commented 1 year ago

Ah, whoops, crossed some wires I guess, and since I was testing with a 1-ZST I didn't disprove my misremembering... and I forgot to actually make it a read instead of just a place inspection when building the matrix. Fixed the access entries for Miri for that mistake. (Out-of-bounds and deallocated ZST accesses are diagnosed as UB even while aligned; they were previously incorrectly marked as accepted.)

saethlin commented 1 year ago

We have a MIR optimization which deletes ZST assignments, and it's on in debug builds: https://github.com/rust-lang/rust/blob/5157d938c49af1248a21e7ed2fbc5c6f71963276/compiler/rustc_mir_transform/src/remove_zsts.rs#L11

The only way to see MIR that contains ZST assignments is to pass -Zmir-opt-level=0 (which Miri does internally) or to pass the flag explicitly, which you cannot do on the playground (stuff like this is why I use godbolt instead).

RalfJung commented 1 year ago

A potential counterpoint here is: what should the compiler be allowed to deduce from a *val? If val could have a zero-sized type, the proposed rules would say "nothing"; even if we know val points to some local we can't even deduce that the local must still be live. Though OTOH if we know the local we could just check whether that local is live...

(This came up in https://github.com/rust-lang/rust/pull/106285. I think that optimization has a better alternative, but maybe this comes up again in the future.)

chorman0773 commented 1 year ago

So, my opinion, from the perspective of a compiler writer?

Getting a couple of Rust-level properties out of the way first:

First: I believe that *val being well-defined must necessarily mean val.add(1) is well-defined. Slice references make this property pretty much necessary.

Second: I believe that val.add(n) and val.cast::<u8>().add(n * size_of::<T>()), when both are well-defined, should be equivalent expressions after appropriate conversions to make both types the same. That is, offsetting a pointer by n elements of type T and by n * size_of::<T>() bytes should have the same behaviour (see the sketch at the end of this comment). The current docs imply this is the case, and I think this makes sense from a consistency perspective (for a non-ZST T, this property already holds).

Third: val.offset(constant) and val.offset(nonconstant) should have the same behaviour, when nonconstant and constant compare equal (and nonconstant has no other side effects - think nonconstant is black_box(constant)). I think this property should be self-explanatory, but it also makes sense - you can rewrite this as val.cast::<[T;constant]>().offset(1) and core::ptr::slice_from_raw_parts(val,nonconstant).magic_slice_offset(1). IMO, both rewritten expressions should be equivalent (ignoring return types and metadata).

The property I'd like to have hold is that val.cast::<u8>().add(n) implies (null_or_)dereferenceable(n) (I have no issues with null specifically being allowed for n=0, for reasons I'll detail below). One thing I'd like the ability to do is to allow conditional speculation of a read that is n bytes long (maybe a copy_nonoverlapping, or a byte loop) into a possible null check (which may be easy to reduce to the equivalent of a cmov) and a variable-sized read. On traditional CPUs, this could allow n to become known later, or convert into an instruction that can do variable-sized reads. Throwing away the possibility of an invalid pointer or a pointer-to-function means that only a null check prevents more interesting inferences. If nonnull-ness becomes known some other way, this would activate those more interesting inferences.
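As a sketch of the second property above (the helper name offsets_agree is made up):

```rust
use std::mem::size_of;

/// Per property 2: when both offsets are well-defined, offsetting by `n`
/// elements and by `n * size_of::<T>()` bytes should yield the same pointer.
unsafe fn offsets_agree<T>(val: *const T, n: usize) -> bool {
    let by_elems = val.add(n);
    let by_bytes = val.cast::<u8>().add(n * size_of::<T>()).cast::<T>();
    by_elems == by_bytes
}
```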


RalfJung commented 1 year ago

First: I believe that *val being well-defined must necessarily mean val.add(1) is well-defined. Slice references make this property pretty much necessary.

I can't see any good reason for this principle. I would say *val being well-defined must mean that val.add(size_of_val_raw(val)) (EDIT: should be byte_add) must be well-defined -- that makes a lot of sense. But demanding at least an offset of 1 when we have zero-sized types is symmetry-breaking.

EDIT: Oh wait, you used add, not byte_add. Sorry I misunderstood.

In that case I do agree, but I think it doesn't help to resolve the question here. val.add(1) adds 0 if the pointee is zero-sized.

chorman0773 commented 1 year ago

offset (and add) in Rust is defined in terms of elements of T, not bytes. The property list I wrote was based on that.

val.add(size_of_val_raw(val)) is most definitely not implied to be well-defined. val.add_bytes(size_of_val_raw(val)) (where val.add_bytes(n) is defined as val.cast::<u8>().add(n)) is well-defined, by combination of properties 1 and 2.


RalfJung commented 1 year ago

(Yes I realized that after posting, see the edits in my prior post)

chorman0773 commented 1 year ago

Ah yes, I saw the comment by email and replied by email, so I didn't see the edit.

In that case I do agree, but I think it doesn't help to resolve the question here. val.add(1) adds 0 if the pointee is zero-sized.

That's what the "this is what I would like" part is there for, along with the third property. I'd like for val.byte_add(n) to allow me to infer null_or_dereferenceable(n), to permit the speculation I noted above. With the third property, val.byte_add(0) would necessarily permit inferring null_or_dereferenceable(0).

RalfJung commented 1 year ago

If we allow *val for any pointer for ZST, then that means that any pointer is dereferenceable(0), so the implication trivially holds.

chorman0773 commented 1 year ago

This seems like it might conflict with some of the more interesting inferences I'd like to make, though I'm trying to figure out what those inferences might be.

An obvious one would be pruning branches that lead to such an access to a deallocated pointer.

One thing I'm considering is replacing it with a fixed-size load in machine code emission when well-aligned, though I'm currently having a fun time justifying that on a ZST access. Along with actually using a pload instruction.