Open aturon opened 10 years ago
We also need better (or maybe just more prominent) documentation on what makes safe unsafe code, for instance what are the exact guarantees that rustc makes to LLVM that we have to uphold, what is UB, what is exception safe, etc.
@reem: http://doc.rust-lang.org/reference.html#behavior-considered-unsafe (the header should really say undefined)
It doesn't have a complete list of UB possible via intrinsics, but other than that it covers nearly everything.
Added possibly relevant: #556
@dtolnay you said in #44932 that we might want to support ptr::null::<Trait>()
at some point. What is your reasoning for that?
To argue the other way: without it, Rust can guarantee that a *const T where T: ?Sized
must have valid metadata — even if the pointer points to an invalid address. There are a some nice things that follow as a result of this guarantee.
One is that we can have object-safe, safe trait methods that take a raw pointer as the self
argument, allowing a sort of poor-man's dynamically-dispatched static methods1:
trait TypeName {
fn type_name(self: *const Self) -> &'static str;
}
impl TypeName for i32 {
fn type_name(self: *const Self) -> &'static str { "i32" }
}
println!((ptr::null::<()>() as *const TypeName).type_name()); // prints `i32`
Without this guarantee, the above wouldn't be possible because calling a method on a trait object with an invalid vtable would segfault (like in Go) or cause undefined behaviour.
Similarly, the size_of_val
and align_of_val
functions in std::mem
could be relaxed to take a raw pointer instead of a reference. Before that could happen, though, we would also need to decide that *const T where T: !Sized
must store the metadata in the pointer itself, and can't have a "thin" representation.
Another nice thing about guaranteeing that vtable pointers will always be valid is that it would enable the null-pointer optimization for Option<*const Trait>
.
1 If you want to test this out, you can check out my raw_pointer_self branch. Just make sure you git checkout c475d8d
, because the latest commit in that branch is broken.
Another nice thing about guaranteeing that vtable pointers will always be valid is that it would enable the null-pointer optimization for
Option<*const Trait>
.
We already mark the vtable as nonnull
, so it'd be UB for you to create a *const Trait
with a null vtable; furthermore, https://github.com/rust-lang/rust/pull/45225 actually does optimize Option<*const Trait>
.
As far as I'm concerned, we already guarantee it anyway, as nothing in the language distinguishes between safe vs raw pointers in terms of DST metadata (e.g. generic conversions between between different kinds of pointers/references with the same unsized "tail" are allowed, even if it's a type parameter, and we guarantee identical sizes and metadata types) and so that makes it the only path forward to custom DSTs to me.
Thanks for the explanation @mikeyhew! I'm glad it is that clear-cut.
I found it helps to understand an interface in Go as equivalent to Option<*const TypeName>
in Rust, where nil
is equivalent to None
and (*T)(nil)
is equivalent to Some(ptr::null::<i32>())
.
@dtonlay Yeah, that's probably a good way to think about it. You could go a bit farther, though: as far as I understand it, pointers in Go are guaranteed to point to a valid value, unless they are nil. So they are like Option<&T>
. (Or, you know, the garbage-collected version of &T
.) And then interface values have a vtable pointer that can also be nil, so to continue the analogy, they are like Option<(&Vtable, Option<&T>)
. Whereas a *const Trait
in Rust is more like a (&Vtable, usize)
because the vtable is always valid, but the data address can be be anything from null to valid to dangling to random::<usize>() as *const u32
. At least, that's what I gather from skimming through tour.golang.org tonight :)
I was trying to wrap my brain around why Go allows interface values to be nil outright. Then I remembered, you need some way to have a value that "isn't there", and since there's no Option
type, nil is the next best thing.
Right now, it can be a bit unwieldy to write unsafe code -- whether skipping bounds checks or working with raw pointers. To some degree this can and will be mitigated through API design, but new language features may help as well. For example, we could consider offering unsafe indexing notation, or allow indexing onto raw pointers, etc.
There are at least two closed RFCs on this topic presenting some good ideas that we were not ready to commit to for 1.0:
Raw reform Unsafe indexing