implied bounds for trait objects

scalexm commented 5 years ago

We need to design rules for the well-formedness of trait objects that address the soundness issue https://github.com/rust-lang/rust/issues/44454 and that are future-proof under implied bounds.

In what follows, we'll write dyn Trait to actually denote dyn Trait + 'a for a placeholder lifetime 'a, the rules will work regardless of 'a.

Trait objects in rustc

Say we have the following trait:

trait Foo<T1...Tk> where WC1, ..., WCn { }

We will assume that this trait has no methods -- methods only change a bit the definition of "object-safety". Then we say that the trait is object-safe if:

Self does not appear as a non-self type parameter in the where clauses WC1, ..., WCn or in the where clauses of any super-trait of Foo<T1...Tk>
Sized is not a transitive super-trait of Foo<T1...Tk>

Then dyn Foo<...> is considered well-formed if Foo is object-safe. Example of object-safe traits:

trait Bar { } // Object-safe
trait Baz { } // Object-safe
trait Foo: Bar + Baz { } // Object-safe

Example of non-object safe traits:

trait Bar: Sized { } // Not object-safe
trait Baz: Bar { } // Not object-safe: `Sized` is an (indirect) super-trait of `Baz`

// Not object-safe: `Self` appears as a non-self type parameter in the where clauses
trait Foo where Self: PartialEq<Self> { }

// Not object-safe: `Self` appears as a non-self type parameter in the where clauses
// of a super-trait of `FooBar`.
trait FooBar: Foo { }

Then, whenever a trait Foo<T1...Tk> is object-safe, for each super-trait SuperTrait<...> of Foo<T1...Tk> (including Foo<T1...Tk> itself), the compiler unconditionally generates the following impl:

impl<T1...Tk> SuperTrait<...> for dyn Foo<T1...Tk> { }

Why the previous approach is unsound

The unsoundness comes from the generated impls above. For example, dyn Foo<&'a i32> in this example is a well-formed trait object:

trait Foo<X>: 'static { } // Object-safe

// The compiler generates this impl:
// impl<X> Foo<X> for dyn Foo<X> { }

fn bar<'a>() {
    let x: Box<dyn Foo<&'a i32>>;
}

and we have that dyn Foo<&'a i32>: Foo<&'a i32> even if dyn Foo<&'a i32> does not outlive 'static.

Another example which may seem weird:

trait Bar<X> where X: Copy { }

fn main() {
    // Ok even if `String` does not implement `Copy`.
    let x: Box<dyn Bar<String>>;
}

Note that with this X: Copy example, you cannot do anything useful because Rust does not support fully general implied bounds for now, however we must keep this example in our head if we want do design something which works with implieds bounds.

However, because Rust does support implied bounds for outlive requirements, here is what you can do with the first example:

trait Foo<X>: 'static { }

trait Projector {
    type Item;
}

impl<X> Projector for dyn Foo<X> {
    type Item = X;
}

fn only_static<T: 'static>() {
    // Ultimately, we managed to call this function with `T == &'a i32` which does
    // not outlive `'static`.
}

fn only_foo<X, T: Foo<X> + Projector + ?Sized>() {
    only_static::<<T as Projector>::Item>()
}

fn bar<'a>() {
    only_foo::<&'a i32, dyn Foo<&'a i32>>()
}

(playground)

This works because in only_foo, we bound over T: Foo<X>, so we get an implied bound T: 'static as per the definition of trait Foo<X>. And because outlive requirements are syntactical, if T: 'static, then rustc deduces that <T as Projector>::Item must be 'static.

Starting from there, we may confuse the compiler enough to create memory unsoundness (see some examples in https://github.com/rust-lang/rust/issues/44454).

Proposed rules for a prototype in chalk

The problem with the previous approach comes from the fact that the generated impls are unconditional, and that we can name types that should obviously not be well-formed. Basically, a dyn Foo<X> is just an existential type ?Self such that ?Self: Foo<X> holds. So we may legitimately assume that all things talking directly about ?Self are true (e.g. that ?Self implements all super-traits of trait Foo<X>). However, because we have lost the information of where exactly the type parameter X appears in ?Self (note that it may not appear at all), we cannot assume anything neither about X nor about any outlive requirements on ?Self.

Here are a set of rules which I think solve the problem. First, we keep the definition of object-safety, and we'll say that the special domain goal ObjectSafe(Trait) holds if trait Trait<T1...Tk> is object-safe. We may either encode the rules for ObjectSafe(Trait) in the logic or pre-compute it for each trait, I believe either method works.

Now, suppose we have the following trait:

trait Foo<T1...Tk> where WC1, ..., WCn { }

and let SuperTrait<...> be any transitive super-trait of Foo<T1...Tk>. If SuperTrait<...> comes from the following definition:

trait SuperTrait<X1...Xm> where SuperWC1, ..., SuperWCr { }

then we generate the following impl:

impl<T1...Tk> SuperTrait<...> for dyn Foo<T1...Tk> where SuperWC1[...], ..., SuperWCr[...]
{
}

where SuperWC1[...] just means that we have substituted SuperWC1 from the trait definition with the right type parameters. Note again that in the set of the super-traits of Foo<T1...Tk>, we have Foo<T1...Tk> itself.

Finally, we have the following rule for the well-formedness of dyn Foo<X1...Xk>:

WellFormed(dyn Foo<X1...Xk>) :- 
    ObjectSafe(Foo),
    WellFormed(dyn Foo<X1...Xk>: Foo<X1...Xk>).

I believe that this solution can be implemented in rustc today.

Note that, in the rules described earlier, we added where clauses from the super traits on every generated impl only so that the impl is well-formed even without implied bounds, I am not sure they are strictly needed.

Examples

In the above rules, the fact that we generate an impl of SuperTrait<...> for dyn Foo<T1...Tk> encodes the previous claim:

we may legitimately assume that all things talking directly about ?Self are true

while the fact that we are retaining where clauses from SuperTrait<...> on the generated impl encodes the other claim:

we cannot assume anything neither about type parameters nor about any outlive requirements on ?Self

Let's walk through some examples to make that clearer.

trait Bar { } // Object-safe
// Generated impl is:
// impl Bar for dyn Bar { }
//
// WF rule is:
// `WellFormed(dyn Bar) :- ObjectSafe(Bar), WellFormed(dyn Bar: Bar)`
// (always holds)

trait Foo: Bar { } // Object-safe
// Generated impls are:
// impl Bar for dyn Foo { }
// impl Foo for dyn Foo where dyn Foo: Bar { }
//
// WF rule is:
// `WellFormed(dyn Foo) :- ObjectSafe(Foo), WellFormed(dyn Foo: Foo)`
// (always holds)

trait Foo<X>: 'static { } // Object-safe
// Generated impl is:
// impl<X> Foo<X> for dyn Foo<X> where dyn Foo<X>: 'static { }
//
// WF rule is:
// `WellFormed(dyn Foo<X>) :- ObjectSafe(Foo), WellFormed(dyn Foo<X>: Foo<X>)`

In the above, the well-formedness of dyn Foo<X> only holds if X: 'static or more generally if you can prove that dyn Foo<X>: 'static.

trait Foo<X> where X: Copy { } // Object-safe
// Generated impl is:
// impl<X> Foo<X> for dyn Foo<X> where X: Copy { }
//
// WF rule is:
// `WellFormed(dyn Foo<X>) :- ObjectSafe(Foo), WellFormed(dyn Foo<X>: Foo<X>)`

In the above, dyn Foo<i32> is WF but not dyn Foo<String>.

Implied bounds

Because of the well-formedness rule, we naturally have the following implied bound:

FromEnv(dyn Foo<T1...Tk>: Foo<T1...Tk>) :- FromEnv(dyn Foo<T1...Tk>)

This permits writing things like:

trait Foo<X> where X: Hash { }

// We don't have to repeat `X: Hash` here.
fn foo<X>(x: &dyn Foo<X>) {
}

With this reverse rule and implied bounds, the where clauses from super-traits on the generated impls are definitely not needed.

A curious example

Today in Rust, you can have the following trait:

trait Bar { }

trait Foo where Box<Self>: Bar { }

The trait Foo is object-safe and dyn Foo is well-formed. However, the following code is not accepted:

// Note that the `where Box<T>: Bar` where clause would not be needed with implied bounds.
fn foo<T: Foo + ?Sized>() where Box<T>: Bar {
}

fn main() {
    foo::<dyn Foo>()
//  ^^^^^^^^^^^^^^ the trait `Bar` is not implemented for `Box<dyn Foo>`
}

However, we may add the following (manually written) impl:

impl Bar for Box<dyn Foo> { }

In which case we can now call foo::<dyn Foo>().

With my proposal, what would happen is that in the absence of impl Bar for Box<dyn Foo>, then dyn Foo would not be considered well-formed, because Box<Self>: Bar is not really a super-trait. With impl Bar for Box<dyn Foo>, then everything works again.

Question is: should we try to extend the generated impls to all bounds, so that we don't need to write the impl Bar for Box<dyn Foo> impl? Note that generating the impls for all the transitive super-traits must be done outside of the logic, but is easy. However, generating impls for all bounds would require some kind of fixed-point algorithm. Moreover, this may be even harder for traits with bounds on associated types, like in:

trait Bar {
    type Item;
}

trait Foo: Bar where <Self as Bar>::Item: Copy { }

fn main() {
    // Note that this compiles today, but wouldn't under my proposal.
    let x: &dyn Foo<Item = String>;
}

Multi-trait objects

Note that this proposal naturally extends to multi-trait objects. For example, you'd have:

WellFormed(dyn Foo + Bar) :- 
    ObjectSafe(Foo),
    ObjectSafe(Bar),
    WellFormed(dyn Foo + Bar: Foo),
    WellFormed(dyn Foo + Bar: Bar).

while the generated impls would come from super-traits of both Foo and Bar.

nikomatsakis commented 5 years ago

While reading this, I found myself wondering about RFC 2027, which aims to tweak the rules around dyn Trait being "well-formed" a bit. Might be relevant. Have to go revisit the RFC first to decide =)

scalexm commented 5 years ago

@nikomatsakis I guess for that RFC, we could just move the ObjectSafe condition from the WF rules onto the generated impls (in the where clauses of the generated impls, presumably)

Edit: wouldn't work because we require WellFormed(dyn Foo) :- WellFormed(dyn Foo: Foo) and WellFormed(dyn Foo: Foo) would not hold if the trait is not object safe

nikomatsakis commented 5 years ago

@scalexm and I discussed RFC 2027 on Zulip here. We came to the conclusion (roughly around here) that it should be compatible, but we have to be a bit careful with the setup.

rust-lang / chalk