Open SoniEx2 opened 2 years ago
Your example seems to rely on not using the CStr
any more before the lifetime ends, and exploiting that for other code. The intention is that this is indeed allowed under some conditions, but those conditions are subtle and the precise aliasing model that defines them has not been decided yet. (Stacked Borrows is the current proposal for such a model, but not its final form.)
For example, the following is considered UB:
unsafe fn foo<'a>(ptr: *const c_char, bogus: PhantomData<&'a ()>) {
let first = unsafe { CStr::from_ptr::<'a>(ptr) }.to_str();
bar(first);
}
fn bar(s: &str) {
// we stop using `s` here
unsafe { mutate_ffi_string(); }
}
This is UB because of a special treatment of function arguments, which means their aliasing rules must be upheld until the end of the function, no matter whether they are used or not. Inlining bar
makes the UB go away.
So, until we have decided what exactly the rules and exceptions are for "ending borrows while their lifetime is still ongoing", the documentation cannot bless code that does that, since we might rule out such code later.
Sure, but we'd argue that clearly maintains the borrow - it's still alive in the parent function and whatnot, as first is Copy. (also to_str returns a Result but ok.)
But this is still a backwards compatibility hazard, and there's no reason FromLtPtr
should be considered unsound because of it.
(also, it*)
Depending on unspecified details of the aliasing model is a backwards compatibility hazard. The documentation is written in a way such that if you follow it, you avoid backwards compatibility hazards.
https://github.com/rust-lang/rust/pull/95948 made the situation more clear to avoid steering people into relying on unspecified parts of the model. The previous documentation was considered insufficiently precise by users, which is why we changed it.
we mean this is sound:
let mut x = &mut whatever;
let y = &x.foo;
drop(y);
x.mutate();
and the rest of the rules basically follow from there.
it's more like "while 'a is borrowed" rather than "the whole 'a". the difference is one of those aligns with safe code and the other one is overly restrictive even for 100% unsafe-free code.
e.g.
use std::ffi::CStr;
use std::os::raw::c_char;
fn main() {
fn read<'a>(ptr: *const c_char, bogus: &'a ()) -> String {
unsafe { CStr::from_ptr::<'a>(ptr) }.to_str().unwrap().to_owned()
}
fn write<'a>(ptr: *mut c_char, bogus: &'a mut ()) {
// whatever
}
fn foo<'a>(ptr: *mut c_char, bogus: &'a mut ()) {
let first = read(ptr, &*bogus);
write(ptr, bogus);
let second = read(ptr, &*bogus);
}
}
we mean this is sound:
The drop
does nothing here.
Also in that code the lifetime of the &x.foo
borrow does end when x
stops being used, so this is perfectly fine by the CStr docs.
And FWIW it's not just CStr
; slice::from_raw_parts
has been documented like this for quite a while. CStr
was just recently fixed to be consistent with everything else.
it's more like "while 'a is borrowed" rather than "the whole 'a". the difference is one of those aligns with safe code and the other one is overly restrictive even for 100% unsafe-free code.
"while 'a is borrowed" is not a defined term that we can use in documentation.
let first = read(ptr, &*bogus);
here you are reborrowing bogus
, so the lifetime passed to read
will actually be a short one.
well yes it needs to be reborrowed because it can't be coerced but reborrows are kinda weird. like, this is a reborrow:
fn foo<'a>(&'a self) -> &'a Foo {
&self.inner
}
so it makes sense that whatever these unsafe functions do is also a reborrow.
but how do you talk about reborrows in docs. we should be able to talk about reborrows in docs. the docs for CStr
and slice::from_raw_parts
should reflect the true borrow semantics of safe rust, instead of being more restrictive than them. (unsafe is literally never more restrictive than safe rust - like, that's the whole point of unsafe.)
so: how do we make this better? because what we have today just isn't good enough.
(also, again, it*)
but how do you talk about reborrows in docs. we should be able to talk about reborrows in docs.
Indeed. But we'll need to figure out the precise rules first, and we don't want to document anything that might end up not working with those precise rules.
the docs for CStr and slice::from_raw_parts should reflect the true borrow semantics of safe rust
They do. The borrowing in safe Rust is entirely defined by lifetimes. Anything you could do in safe code is allowed by these docs.
The tricky part is defining the precise borrow semantics in unsafe Rust, and that is work that has started but is not finished yet.
(also, again, it*)
Sorry, but I have no idea what you mean by this comment.
we mean we use it instead of you (2nd person neopronouns).
so uh, would you be willing to consider ltptr
as part of that work?
(I have only been using gender-neutral pronouns, and I am happy to stick to that.)
The example you gave here is something we hope to allow. I don't know which other patterns ltptr
is intended to support, but here are some important factors:
first
, is not used again after mutate_ffi_string
is calledfirst
is passed as an argument to another function, then mutate_ffi_string
is not called during the runtime of that functionIf these are the only rules, the example is fine. The thing is, we don't know yet if that will be all the rules. There might be more. (Though to be honest, that seems unlikely.)
That said, the example does not use ltptr
, so it is hard to tell what these rules mean for ltptr
, or whether other uses of ltptr
would be a problem even with these rules (that I am fairly sure we will need).
we ask you to rephrase these:
Your example seems to rely on not using the
CStr
any more before the lifetime ends, and exploiting that for other code.here you are reborrowing
bogus
, so the lifetime passed toread
will actually be a short one.Sorry, but I have no idea what you mean by this comment.
The example you gave here is something we hope to allow.
to use "it" instead of "you", like so:
Its example seems to rely on not using the
CStr
any more before the lifetime ends, and exploiting that for other code.here it is reborrowing
bogus
, so the lifetime passed toread
will actually be a short one.Sorry, but I have no idea what it means by this comment.
The example it gave here is something we hope to allow.
The same example, using ltptr
, looks like so:
unsafe fn foo<'a>(ptr: ConstLtPtr<'a, c_char>) {
let first = unsafe { CStr::from_lt_ptr(ptr) }.to_str().map(|x| x.to_owned());
unsafe { mutate_ffi_string(); }
let second = unsafe { CStr::from_lt_ptr(ptr) }.to_str().map(|x| x.to_owned());
}
Where from_lt_ptr
is defined here: https://docs.rs/ltptr/0.1.4/ltptr/trait.FromLtPtr.html#impl-FromLtPtr%3Ci8%3E-for-CStr
And we believe this should be okay, even tho the lifetime is clearly 'a
. (Tho, it is covariant we guess.)
(Note also that ltptr
is still a work in progress, so it's missing things like casts and const<->mut conversions. We also can't have explicit unsafe deref like pointers can, sadly. Ideally it'd just be part of the language.)
Location
CStr::from_ptr
Summary
95948 introduced incompatible changes to the
CStr
docs. Namely, the docs now require immutability for the whole lifetime of'a
, instead of just while it's borrowed. See also this discussion on urlo. This is relevant to our crate,ltptr
, which provides bound lifetimes for raw pointers and uses those in conversions: https://docs.rs/ltptr/0.1.4/ltptr/trait.FromLtPtr.html