rust-lang / nomicon

The Dark Arts of Advanced and Unsafe Rust Programming
https://doc.rust-lang.org/nomicon/
Apache License 2.0

confusion about PhantomData & T: 'a #348

Open leddoo opened 2 years ago

leddoo commented 2 years ago

My current understanding of PhantomData<T> is that you need it when your struct logically contains or owns a T but has no field that reflects this.

But neither the Nomicon nor the Rust documentation for PhantomData seems to confirm this. If PhantomData truly indicated containment to the compiler, why does the Drain struct for the Nomicon's example Vec declare a lifetime bound on T?

pub struct Drain<'a, T: 'a> {
    vec: PhantomData<&'a mut Vec<T>>, // this should imply `T: 'a`
    iter: RawValIter<T>,
}

In the past, you had to write T: 'a explicitly, but that requirement has since been removed by RFC 2093. Instead, the presence of a &'a T field implies T: 'a. So if PhantomData behaved like a field of the same type, the lifetime bound should not be necessary, because it would be inferred transitively, all the way from RawVec.
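A minimal sketch of that inference, assuming a compiler recent enough to include RFC 2093. Neither struct spells out `T: 'a`, yet both are accepted, and the impl can hand out `&'a T` because the bound is implied by the type itself. `Ref`, `from_slice`, and `first` are hypothetical names introduced only for illustration; they do not come from the Nomicon or the std docs.

```rust
use std::marker::PhantomData;

// No `T: 'a` bound is written: the `&'a T` field lets the compiler infer it.
pub struct Ref<'a, T> {
    pub value: &'a T,
}

// Same shape as the `Slice` from the PhantomData docs, minus the explicit bound.
// The `&'a T` inside the PhantomData field is enough for `T: 'a` to be inferred.
pub struct Slice<'a, T> {
    start: *const T,
    end: *const T,
    phantom: PhantomData<&'a T>,
}

impl<'a, T> Slice<'a, T> {
    // Hypothetical constructor: borrows the slice for `'a` and keeps raw pointers.
    pub fn from_slice(s: &'a [T]) -> Self {
        let range = s.as_ptr_range();
        Slice { start: range.start, end: range.end, phantom: PhantomData }
    }

    // Returning `&'a T` needs `T: 'a`; it is implied by `Slice<'a, T>` being well-formed.
    // Note: for zero-sized `T`, `start == end` always holds, so this sketch
    // reports `None` even for a non-empty slice.
    pub fn first(&self) -> Option<&'a T> {
        if self.start == self.end {
            None
        } else {
            // SAFETY: `start` points at the first element of a non-empty slice
            // that is borrowed for `'a`.
            Some(unsafe { &*self.start })
        }
    }
}
```

If the bound were not inferred, both struct definitions would be rejected for using `&'a T` without knowing that `T` outlives `'a`.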

The PhantomData docs explicitly state that T: 'a is required:


struct Slice<'a, T: 'a> {
    start: *const T,
    end: *const T,
    phantom: PhantomData<&'a T>,
}

"This also in turn requires the annotation T: 'a, indicating that any references in T are valid over the lifetime 'a."

What makes it even more confusing is that there is a pretty consistent pattern: the Nomicon, the Rust docs, and the Rust std source code use T: 'a when implementing reference-like types, but not for owning types like Vec.

Is T: 'a ever required if PhantomData is used? I think it would be great if the Nomicon could provide a definitive answer to this question. And if PhantomData makes T: 'a obsolete, it would be very useful to include a note saying that the explicit bound is a thing of the past.

sgasse commented 1 year ago

I got curious about this and tried out a few examples in the playground. It seems to me that you are right, @leddoo: T: 'a no longer needs to be spelled out. But I ran into a curious example about PhantomData, references, and ownership that I wanted to share:

use std::marker::PhantomData;

#[derive(Debug)]
struct Slice<'a, T> {
    ptr: *const T,
    phantom: PhantomData<&'a T>,
}

fn main() {
    // This works because T: 'a is implied, thus 'scoped: 'scoped, which is true
    {
        let my_string = String::from("Meow");
        let borrow = &my_string;                            // borrow: &'scoped String
        let s = Slice{ptr: &borrow, phantom: PhantomData};  // ptr: &'scoped &'scoped String
        dbg!(s.ptr);
    }

    // As expected, the following does not compile:
    // T: 'a is implied -> 'scoped: 'outer, which is not true
    // let s = {
    //     let my_string = String::from("Meow");
    //     let borrow = &my_string;                   // borrow: &'scoped String
    //     Slice{ptr: &borrow, phantom: PhantomData}  // ptr: &'outer &'scoped String
    // };
    // dbg!(s.ptr);

    // The following example is a little surprising. It compiles fine.
    let s = {
        let my_string = String::from("Meow");
        Slice{ptr: &my_string, phantom: PhantomData}  // I guess
                                                      // ptr: &'outer String
                                                      // but who owns `my_string`?
    };
    dbg!(s);
    // Debug info:
    // s = Slice {
    //     ptr: 0x00007ffe557fad20,
    //     phantom: PhantomData<&alloc::string::String>,
    // }
}


Does PhantomData virtually own my_string in this case despite just holding a reference to it in the field phantom?
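One way to read the last example, offered as a sketch rather than an authoritative answer: `&my_string` is coerced to `*const String` as soon as it is stored in `ptr`, and a raw pointer carries no lifetime. With `T = String` there is no lifetime inside `T` either, so nothing ties `'a` to the borrow, the code compiles, and `s.ptr` is left dangling once the block ends. PhantomData neither owns `my_string` nor keeps it alive. The `new` and `get` methods below are hypothetical helpers, not part of the original example; they show how taking `&'a T` in a constructor connects `'a` to a real borrow so the borrow checker can reject the escaping case.

```rust
use std::marker::PhantomData;

pub struct Slice<'a, T> {
    ptr: *const T,
    phantom: PhantomData<&'a T>,
}

impl<'a, T> Slice<'a, T> {
    // Hypothetical constructor: because the argument is `&'a T`, the `'a` in
    // `Slice<'a, T>` is tied to an actual borrow instead of being unconstrained.
    pub fn new(r: &'a T) -> Self {
        Slice { ptr: r, phantom: PhantomData }
    }

    // Hypothetical accessor that turns the raw pointer back into a reference.
    pub fn get(&self) -> &'a T {
        // SAFETY: `ptr` was created from a `&'a T` in `new`, so the pointee
        // is valid for `'a`.
        unsafe { &*self.ptr }
    }
}

pub fn demo() -> usize {
    let my_string = String::from("Meow");
    let s = Slice::new(&my_string);

    // Going through `new`, the escaping variant is rejected by the borrow
    // checker because `my_string` does not live long enough:
    // let s = {
    //     let my_string = String::from("Meow");
    //     Slice::new(&my_string)
    // };

    let len = s.get().len();
    len
}
```

Building the struct literal directly, as in the example above, bypasses this check, which is why the usual advice is to keep such fields private and construct the type only through functions whose signatures carry the lifetime.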