JRRudy1 / transient

Rust crate providing a reimplementation of the `std::any::Any` trait that supports types with non-`'static` lifetimes.
Other
33 stars 1 forks source link

Update derive macro to allow declaring `T: Transient` instead of `T: 'static` on type parameters #5

Open JRRudy1 opened 5 days ago

JRRudy1 commented 5 days ago

Background

In order to obtain a TypeId for a non-'static type, this crate needs to be able to name a "static version" of the type which is provided by the Transient::Static associated type. For types with only lifetime/const parameters, it is trivial to unambiguously choose this type, both manually and in the derive macro - a type such as S<'a, 'b> would declare its static type as S<'static, 'static>.

For types with generic type parameters, however, this is not so clear-cut. For example, consider this type:

struct S<'a, T>(&'a [T]);

We could try declaring its Transient::Static type as S<&'static, T>, but even though the compiler recognizes &'static [T] as implying that T: 'static in many contexts, it does not make this assumption where we need it: std::any::TypeId::of::<S<'static, T>() is not allowed unless there is an explicit T: 'static bound (nor would it be allowed for a naked &'static [T]). So this is what the derive macro currently does; it adds a T: 'static bound to the Transient impl for all generic type parameters like so:

unsafe impl<'a, T> Transient for S<'a, T>
where
    T: 'static
{
    type Static = S<&'static, T>;
    type Transience = Co<'a>;
}

This works, but what if we wanted our implementation to support types such as S<'a, &'b i32>? This would not be possible using the impl proposed above since the T: 'static bound applies to entire implementation. However, there is another option; instead of using the bound T: 'static in our impl, we could have used T: Transient and then used its associated types in the impl:

unsafe impl<'a, T> Transient for S<'a, T>
where
    T: Transient
{
    type Static = S<&'static, T::Static>;
    type Transience = (Co<'a>, T::Transience);
}

Since &'b i32 implements the Transient trait this new impl now supports it, and would translate to the concrete impl below:

unsafe impl<'a, 'b> Transient for S<'a, &'b i32> {
    type Static = S<'static, &'static i32>;
    type Transience = (Co<'a>, Co<'b>);
}

Great, this works! Of course, the impl is now limited to types that implement the Transient, which would be problematic if you want to support foreign types that don't implement Transient, but at least this gives you another option and you can weigh the trade-offs for you own use case.

Derive Macro Limitation

This brings us back to the point of this Issue: the derive macro currently does not support this pattern and always uses the first approach of adding the T: 'static bound, so types wanting to use the T: Transient approach must implement the trait manually. To address this limitation, the macro should be updated to provide a mechanism for choosing the second approach.

Proposed Design

My current idea for the design of this feature is to mirror the syntax for declaring lifetime variances implemented in PR #4, by introducing a transient helper attribute that lists the type parameters that should follow the T: Transient approach discussed above. This would be used like so:

#[derive(Transient)]
#[covariant(a)]
#[transient(T2)]
struct S<'a, T1, T2> {
    static_bounded: &'a [T1],
    transient_bounded: Vec<T2>
}
// generated impl:
impl<'a, T1, T2> Transient for S<'a, T1, T2> 
where
    T1: 'static,
    T2: Transient,
{
    type Static = S<'static, T1, T2::Static>;
    type Transience = (Co<'a>, T2::Transience);
}
mod __validate_S {
  /* ... */
}
the10thWiz commented 3 days ago

This sounds like a great idea! I do foresee one issue - The transience of S in your example could wind up being (Co<'a>, (Co<'b>, Co<'c>)). I don't believe the simplification rules for CanTransendTo are able to handle nested tuples. To this end, we would need to either add support for nested tuples (which could easily balloon the number of impls), or we need to come up with a way to convert nested tuples to flat tuples during generation. I think adding support for nested tuples is the way to go, since it makes it much easier when Transient needs to be implemented manually.

JRRudy1 commented 3 days ago

Honestly I thought the same thing until I played around with it a bit and realized that the existing impls are more flexible than I thought. Arbitrarily nested tuples (up to length-4) all fall under the macro_rules impls, so (Co<'a>, (Co<'b>, Co<'c>)) is a valid Transience and can transcend pretty flexibly. For example:

struct S<'a, 'b, 'c>(&'a i32, &'b i32, &'c i32);

impl<'a, 'b, 'c> Transient for S<'a, 'b, 'c> {
    type Static = S<'a, 'b, 'c>;
    type Transience = (Co<'a>, (Co<'b>, Co<'c>));
}

fn test<'short, 'medium: 'short, 'long: 'medium>(val: S<'short, 'medium, 'long>) {

    // trivial case, erasing to `(Co<'short>, (Co<'medium>, Co<'long>))`
    let _: &dyn Any<(Co, (Co, Co))> = &val;  // this could be downcast to the original

    // the inner tuple could instead merge to the shorter lifetime giving `(Co<'short>, Co<'medium>)`
    let _: &dyn Any<(Co, Co)> = &val;  // we could only recover `S<'short, 'medium, 'medium>` from this

    // they can even merge to a single `Co<'short>`
    let _: &dyn Any<Co> = &val;  // but not you can only downcast to `S<'short, 'short, 'short>`

   // you could even expand to deeper nested tuples if you had some reason to
  let _: &dyn Any<(((Co, Co), Co), (Co, Co)))`> = &val;  // this could be downcast to the original
}

So it is surprisingly flexible and you can collapse/expand tuples at will, the catch is just that you can only downcast to the least-common-denominator of lifetimes you passed through. So if you have (Co<'short>, (Co<'medium>, Co<'long>)), you can flatten to (Co<'short>, Co<'medium>, Co<'medium>), which is probably fine in most cases, but unfortunately you could never get the (Co<'short>, Co<'medium>, Co<'long>) that you would ideally want.

A while back I messed around with implementing something like that using associated types kinda like the ndarray library does with dimensions, but ran into a minefield of "duplicate impls" and ended up deciding to keep it simple for the time being. But I'm sure there is a way...

The bigger problem with this proposal though is that we have to account for the variance of the type parameters as well, and the (Co<'a>, T2::Transience) I used in that example is only sound because S is covariant w.r.t. to T. This example would lead to UB:

struct S<T>(fn(T));

unsafe impl<T: Transient> Transient for S<T> {
    type Static = S<T::Static>;
    type Transience = T::Transience;
}

Since for a concrete T of &'a str that would expand to this impl:

unsafe impl<'a> Transient for S<&'a str> {
    type Static = S<&'static str>;
    type Transience = Co<'a>;
}

which is definitely wrong since <S<&'a str>>::Transience should be Contra<'a>.

I am working on a mechanism for this, but the main takeaway is that type parameters will need to have their variances declared just like lifetimes, something like:

#[derive(Transient)]
#[covariant(a, T2)]
struct S<'a, T1, T2> {
    static_bounded: &'a [T1],
    transient_bounded: Vec<T2>
}
the10thWiz commented 3 days ago

You're absolutely correct. I was playing around with some expanded examples, and I think I came up with a way to validate that the type meets the specified variance wrt the struct. Basically, we need to create a covariant struct, and use it as the type parameter - trivially struct T<'a>(&'a ());. Using this, with pretty much the same logic as lifetimes, validation works as expected.

From a technical perspective for a type parameter T:

To implement this, I think we would want to add two associated types to Transience - Invariant and Inverse. We should also consider sealing Transience, since other crates likely shouldn't be able to define new transience types. I've created a PR with these changes (but no changes to derive).

JRRudy1 commented 3 days ago

Yes I was thinking along the same lines. The only problem I can see is that somebody might want to implement Transient on a struct with a type parameter than has some additional trait bound that our "test type" wouldn't implement; for example:

#[derive(Transient)]
#[covariant(T)]
struct S<T: MyTrait>(value: T);

Then the generated test function would fail to compile since our test type doesn't implement MyTrait. And I'm afraid the error message would be extremely confusing, even more so than for the variance mismatch.

To implement this, I think we would want to add two associated types to Transience - Invariant and Inverse

I agree, I added the same, as well as a couple type aliases that make them easier to use:

pub type Covariant<T> = <T as Transient>::Transience;
pub type Contravariant<T> = <<T as Transient>::Transience as Transience>::Inverse;
pub type Invariant<T> = <<T as Transient>::Transience as Transience>::Invariant;

You can then use them in the (derived) Transient impl like so:

struct S<'a, T>(fn(T) -> &'a str);

unsafe impl<'a, T: Transient> Transient for S<'a, T> {
    type Static = S<'static, T::Static>;
    type Transience = (Co<'a>, Contravariant<T>);
}

The clash in names is less than ideal, but I don't think its too bad.

JRRudy1 commented 3 days ago

By the way, as justification for the names, you could equivalently write the above transience as (Covariant<&'a ()>, Contravariant<T>), since <&'a () as Transient>::Transience is Co<'a>. So in a way you could think of Co<'a> as a short-hand for Covariant<&'a ()> (although Co<'a> is the fundamental type here). Similarly, Contra<'a> could be interpreted as a short-hand for Contravariant<&'a ()>.

the10thWiz commented 2 days ago

Then the generated test function would fail to compile since our test type doesn't implement MyTrait. And I'm afraid the error message would be extremely confusing, even more so than for the variance mismatch.

A reasonable interim solution might be to simply refuse if the generic type has any trait bounds.

However, I've put together a potential solution, based on using a generic type on the validation function, so we don't need to provide a concrete type. To make it covariant wrt a lifetime we can control, I've used &T as the actual type parameter to the function.

trait MyTrait {}
#[derive(Transient)]
#[covariant(a)]
struct S<T: MyTrait> {
    // contra_val: fn(T), // Fails to compile with this line
    co_val: T,
}
// --- Generated ---
unsafe impl<T: Transient + MyTrait> Transient for S<T>
    where T::Static: MyTrait // Needed to allow the Static decl
{
    type Static = S<T::Static>;
    type Transience = T::Transience;
}
mod __validate_SingleLife {
    use super::*;
    // Type parameter it `&'life T`, so it's covariant wrt `'life`
    fn covariant_wrt_a<'__long, 'a, T>(v: S<&'__long T>) -> S<&'a T>
    where
        '__long: 'a,
        for<'x> &'x T: MyTrait, // Ensures `&T` implements MyTrait for any lifetime
    {
        v
    }
}

I also did a quick sanity check, and was slightly surprised that use super::*; also inherits use statements from the outer scope.

JRRudy1 commented 1 day ago

Yeah that should work, though I'm curious what your motivation is for using &'a T as the test type instead of something like &'a (). Is it just so there's a higher likelihood of the former implementing MyTrait? If we decide to reject any trait bounds then it wouldn't really matter what we use as long as it implements Transient

the10thWiz commented 23 hours ago

Edit: if we use &T, we can allow arbitrary trait bounds.

It's not that there's a higher likelihood, it's that we can add for<'x> &'x T: MyTrait to the where clause, so the compiler will assume it does, at least for the purposes of type checking the function.

We can add a similar bound for &(), but the code will fail to compile if &() doesn't implement the trait.