rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.82k stars 12.66k forks source link

Implicit calls to `deref` and `deref_mut` are a footgun with raw pointers #131847

Open AngelicosPhosphoros opened 2 days ago

AngelicosPhosphoros commented 2 days ago

I tried this code (which models library code with invariants for all valid instances of A but without all the details):

use std::ops::Deref;
use std::mem::MaybeUninit;

struct A{
    b: MaybeUninit<B>,
}

struct B{
    field: String,
}

impl Deref for A {
    type Target = B;
    fn deref(&self)->&B{
        println!("Called deref!");
        // SAFETY
        // This deref is expected to be called only outside of the current module and field b is private
        // so as long as all instances of A have it initialized before being accessible outside,
        // this code is sound.
        unsafe {
            self.b.assume_init_ref()
        }
    }
}

// Note: while this is `main`, it is intended to show code that used for
// initialization of new instance of A. I omitted wrapping of A in a module
// and using separate factory method for brewity.
fn main() {
    let mut a: MaybeUninit<A> = MaybeUninit::uninit();
    unsafe {
        let p = a.as_ptr();
        // The only way to get pointer to field of pointee.
        let b_ptr = &raw const (*p).b;
        // However, it may cause unintended calls to a deref if we are not vigilant
        // and mistakenly type name of a field of a type our pointee dereferences to.
        let field_ptr = &raw const (*p).field;
    }
}

I expected to see this happen: this code should be rejected because it causes implicit calls to Deref::deref which assumes *p to be initialized.

Instead, this happened: code compiles and prints Called deref! when executed.

Meta

rustc --version --verbose:

rustc 1.84.0-nightly (9322d183f 2024-10-14)
binary: rustc
commit-hash: 9322d183f45e0fd5a509820874cc5ff27744a479
commit-date: 2024-10-14
host: x86_64-unknown-linux-gnu
release: 1.84.0-nightly
LLVM version: 19.1.1
AngelicosPhosphoros commented 2 days ago

I think, it should call derefs only this way:

(**p) <-- Explicitly dereference object again.
(*).deref() <-- Explicitly call deref method.
carbotaniuman commented 2 days ago

While an unfortunate footgun, this is not a soundness issue. Removing the unsafe shows that only the first line is (correctly) exempt from needing an unsafe block, while the second line does require one.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=7545201409c69d43e6b2d9700b04aef3

VorpalBlade commented 2 days ago

It would make sense to have a lint for this still (clippy or rustc).

fmease commented 2 days ago

Might get caught by https://github.com/rust-lang/rust/pull/123239, haven't checked.

compiler-errors commented 2 days ago

This is not unsound, so I'm gonna untag it as such.

traviscross commented 2 days ago

@rustbot labels -C-bug +C-discussion

RalfJung commented 1 day ago

I would argue you asked for trouble by writing that Deref instance. That instance is unsound, it can be used to cause UB in safe code:

fn main() {
    let a = A { b: MaybeUninit::uninit() };
    let val = &a.field;
    println!("{val}");
}

So, the fix is to not write such Deref instances. I am not convinced a lint is justified. (Well, ideally we'd have a lint against unsound code, but alas, that's not possible. ;)

RalfJung commented 1 day ago

because it causes implicit calls to Deref::deref which assumes *p to be initialized.

deref is not allowed to make such assumptions. It is a safe function, and as such must not make any assumptions beyond being given a well-typed argument.

Cc @rust-lang/opsem

ChayimFriedman2 commented 1 day ago

@RalfJung The problem (AFAIK) is not that Rust is unsound. Rust is behaving correctly. The problem is that it's too easy to make mistakes.

Making a safe Deref is fine if the type has an invariant that it's always initialized, but private code may break this invariant temporarily, and it is too easy to invoke Deref by mistake then. Ultimately, we should be careful when writing unsafe code, and the language could assist with that.

AngelicosPhosphoros commented 1 day ago

@RalfJung

I would argue you asked for trouble by writing that Deref instance. That instance is unsound, it can be used to cause UB in safe code:

Your example is possible only inside module that declares A because field b is private. If all ways to acquire instance of A outside of module guarantee that b is initialized, the code is sound.

By your logic, Vec::deref (which calls Vec::as_slice) is unsound because I can write code like this inside alloc::vec module:

let my_vec: Vec<i32> = Vec { len: 200, buf: RawVec::new() };
let slice: &[i32] = &*my_vec;

However, we actually know that deref implementation of Vec is sound because we know that fields len and buf cannot be modified outside of alloc::vec module and that we write only valid values inside the module.

Similarly, A::deref in my example expects to be called only with valid instances. I didn't include all the code for modules and making everything private in the example because we use minimal examples for bug reports.

@ChayimFriedman2 Argues same thing as me, but more eloquently.

RalfJung commented 1 day ago

Yes we use minimal examples, but this was too minimal. ;) The safety invariant and abstraction barrier are a key part of the argument here. Given the example has neither and there were no comments to indicate anything like that, it is reasonable to assume there is no such invariant. (Well, there is a SAFETY comment, but it didn't explain why this invariant is supposed to hold, given that the code clearly breaks it. Vec is inside a carefully designed mod with carefully chosen pub functions. You reduced all that away which then makes it not the same situation as Vec at all.)

Thanks for clarifying!

Would be good to add a comment in main explaining that this is meant to model code inside the library and hence is allowed to temporarily break the invariant.

AngelicosPhosphoros commented 1 day ago

I added comments.