rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.63k stars 12.49k forks source link

GATs: Decide whether to have defaults for `where Self: 'a` #87479

Open nikomatsakis opened 3 years ago

nikomatsakis commented 3 years ago

Coming from the "missing required bounds" error?

The "missing required bounds" error that was emitted is intended to ensure that current code is forwards-compatible with potential future changes. We'll provide a brief summary of the why the bounds are required, the potential future changes, and a workaround if the required bounds are too restrictive. We would appreciate any feedback for this issue.

Why are these bounds required

Let's start by imagining a trait:

trait Iterable {
    type Item<'x>;
    fn iter<'a>(&'a self) -> Self::Item<'a>;
}

As-is, nothing looks wrong about this; without the "missing required bounds" error, this would compile. However, let's try to add an impl:

impl<T> Iterable for T {
    type Item<'a> = &'a T;
    fn iter<'a>(&'a self) -> Self::Item<'a> { self }
}

This definition of Item<'a> = &'a T is invalid, since we don't know that T outlives 'a. So, our first thought might be to modify the definition to add a where Self: 'a clause. This is what we want. However, impls are not allowed to have more where clauses than the trait.

Luckily, we can detect in a trait when this type of problem might occur. To do this, we look at the trait methods that construct the GAT. There, we find the bounds that we know and require that those bounds be written on the GAT itself.

Here, on the Iterable trait, we construct the GAT Self::Item<'a> in the iter method. There, we have an argument &'a self, which allows us to know Self: 'a. Therefore, we require the where Self: 'a bound on the GAT.

Potential future changes

Following the above logic, if we know all the required bounds when the trait is written, why require them at all? In fact, allowed these bounds to be implied is a potential future change. However, to make this change in a backwards-compatible manner, we must require the bounds now that we want to eventually imply. At that time, the written bounds will be redundant and can be removed.

This breaks my code. Workaround?

First, if any code breaks from adding the required bounds, we really want feedback. Second, the workaround is to move the GAT into a super trait. Using the example above, our new code would look like:

trait IterableSuper {
    type Item<'x>;
}
trait Iterable: IterableSuper {
    fn iter<'a>(&'a self) -> Self::Item<'a>;
}

Previous discussion

What is this bug?

We are moving towards stabilizing GATs (tracking issue: https://github.com/rust-lang/rust/issues/44265) but there is one major ergonomic hurdle that we should decide how to manage before we go forward. In particular, a great many GAT use cases require a surprising where clause to be well-typed; this typically has the form where Self: 'a. It might be useful if we were to create some rules to add this rule by default. Once we stabilize, changing defaults will be more difficult, and could require an edition, therefore it's better to evaluate the rules now.

I have an opinion! What should I do?

To make this decision in an informed way, what we need most are real-world examples and experience reports. If you are experimenting with GATs, for example, how often do you use where Self: 'a and how did you find out that it is necessary? Would the default proposals described below work for you? If not, can you describe the trait so we can understand why they would not work?

Of great use would be example usages that do NOT require where Self: 'a. It'd be good to be able to evaluate the various defaulting schemes and see whether they would interfere with the trait. Knowing the trait and a rough sketch of the impls would be helpful.

Background: what where clause now?

Consider the typical "lending iterator" example. The idea here is to have an iterator that produces values that may have references into the iterator itself (as opposed to references into the collection being iterated over). In other words, given a next method like fn next<'a>(&'a mut self), the returned items have to be able to reference 'a. The typical Iterator trait cannot express that, but GATs can:

trait LendingIterator {
    type Item<'a>;

    fn next<'b>(&'b mut self) -> Self::Item<'b>;
}

Unfortunately, this trait definition turns out to be not quite right in practice. Consider an example like this, an iterator that yields a reference to the same item over and over again (note that it owns the item it is referencing):

struct RefOnce<T> {
    my_data: T    
}

impl<T> LendingIterator for RefOnce<T> {
    type Item<'a> where Self: 'a = &'a T;

    fn next<'b>(&'b mut self) -> Self::Item<'b> {
        &self.my_data
    }
}

Here, the type type Item<'a> = &'a T declaration is actually illegal. Why is that? The assumption when authoring the trait was that 'a would always be the lifetime of the self reference in the next function, of course, but that is not in fact required. People can reference Item with any lifetime they want. For example, what if somebody wrote the type <SomeType<T> as LendingIterator>::Item<'static>? In this case, T: 'static would have to be true, but T may in fact contain borrowed references. This is why the compiler gives you a "T may not outlive 'a" error (playground).

We can encode the constraint that "'a is meant to be the lifetime of the self reference" by adding a where Self: 'a clause to the type Item declaration. This is saying "you can only use a 'a that could be a reference to Self". If you make this change, you'll find that the code compiles (playground):

trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next<'b>(&'b mut self) -> Self::Item<'b>;
}

When would you NOT want the where clause Self: 'a?

If the associated type cannot refer to data that comes from the Self type, then the where Self: 'a is unnecessary, and is in fact somewhat constraining. As an example, consider:

XXX finish this

What could we do about it?

There are a few options. Here is the list of ideas we've had so far.

  1. Status quo: require that people add the where Self: 'a bounds, and try to do better with diagnostics.
  2. Simple, limited default: If a GAT has exactly one lifetime parameter 'a, add where Self: 'a to both traits and impls. Need some way to opt out.
  3. More extensive defaults: e.g., for every lifetime parameter 'a to a GAT, add where Self: 'a, and maybe where T: 'a for type parameters too. Need some way to opt out.
  4. Add a syntactic sugar for this common case, e.g. type Foo<'self>. This could be added later.
  5. self-oriented defaults: Given some GAT Foo<'a>, if each use Self::Foo<'b> within the trait methods references a 'b that is the lifetime parameter for self, then add where Self: 'b. While kind of complex to explain, this captures the intuition that 'a is meant to be the "lifetime of the self reference" in practice. We probably still want a way to opt out (though maybe not; maybe that way is "don't use &'a self notation").
  6. Even smarter defaults A: Look at the method signatures in the trait. If we find that each use of Self::Item<'b> is associated with a lifetime 'b where Self: 'b is implied by the method arguments, then infer that where Self: 'b. This is a more extensive, general version of self-oriented defaults. We probably still want a way to opt out (though maybe not).

In general, once we stabilize GATs, we likely cannot add defaults, except via an edition -- although we may be able to get away with it in this instance if the defaults are smart enough.

nikomatsakis commented 3 years ago

cc @rust-lang/lang

cramertj commented 3 years ago

Of great use would be example usages that do NOT require where Self: 'a. It'd be good to be able to evaluate the various defaulting schemes and see whether they would interfere with the trait. Knowing the trait and a rough sketch of the impls would be helpful.

One of the ones that seems most obvious to me is when the output borrows not from self, but from a parameter to the method, such as in the case of a parser:

trait Parser {
    type Output<'a>;
    fn parse<'a>(&mut self, data: &'a [u8]) -> Self::Output<'a>;
}

Parser output wouldn't usually reference state internal to the parser, but it very well may reference the input data. In practice, you can express this instead with for<'a> Parser<'a>, but it's often the case that lifetime-only GATs can be replaced with HRTB-based solutions. I think it might be a goal of ours to allow people to avoid HRTBs for these types of things-- if it is, then this use-case is worth considering.

jplatte commented 3 years ago

For another example where Self: 'a would be wrong, in Ruma we have a trait IncomingRequest that I've long wanted to use GATs for, which would be another example of this. Simplified:

// Every endpoint in the Matrix REST API has two request and response types in Ruma, one Incoming
// (used for deserialization) and out Outgoing (used for serialization). To avoid annoying clones when
// sending a request, most non-copy fields in the outgoing structs are references.
//
// The request traits have an associated type for the corresponding response type so things can be
// matched up properly.
pub trait IncomingRequest: Sized {
    // This is the current definition of the associated type I'd like to make generic.
    type OutgoingResponse: OutgoingResponse;
    // AFAICT adding a lifetime parameter is all I need.
    type OutgoingResponse<'a>: OutgoingResponse;

    // Other trait members... (not using Self::OutgoingResponse)
}

(full definition here if sbd. is curious)

My 2c regardingthe potential solutions: type Foo<'self> looks really nice.

ShadowJonathan commented 3 years ago

I'm also in favour of keeping status quo and adding the implicit 'self lifetime, it balances the need for being explicit with DRY, is easy to hint at by the compiler (for one scenario or another), and easy to reference by the book & other resources.

In general I think the compiler shouldn't try to be "smart" about this, I'm still a learning rustacean, but having the compiler try to infer multiple lifetimes by default, and implicitly, seems confusing and footgun-y to me.

ammkrn commented 3 years ago

I have a use case of the kind @cramertj referenced that I ended up working around with macros, but I would like to try and remove them when GATs are available. Also for cramertj, I couldn't figure out how to implement your example with HRTBs, so if you can actually do that I'd be interested to learn how it's done.

My specific use case is a little more complex, but I think the idea is relevant to many applications of indirect data structures. I have tree structures that are managed with type/memory safe integer pointers:

struct Ptr<'a, A> {
  id: usize,
  marker: PhantomData<&'a A>

The backing storage struct looks like:

struct Store<'a> {
  a_set: IndexSet<ThingA<'a>>,
  b_set: IndexSet<ThingB<'a>>,
  ..
  n_set: IndexSet<ThingN<'a>>,
}

The program is a type checker, so there may be multiple stores alive with lifetimes 'a > 'b > .. 'n, but they nest, so we're always dealing with the "outermost" one (with the shortest lifetime). The guarantees I'm currently able to enforce within Rust's type system are that a Ptr<'a, A> can't outlive its backing Store<'a>, and that you can only create a Ptr<'a, A> if you actually have a Store<A>. The macro-based solution runs really well, but GATs would be much more satisfying. The GAT solution I originally wanted was to have a trait for the stored types so that the output's lifetime was dependent on the lifetime of the backing store that was passed as an argument. As long as the underlying data is alive at least as long as the backing store I've been passed, I can both read a Ptr<'a, A> -> A and allocate an A -> Ptr<'a, A>.

trait IsStored<'x>: Sized {
    type Carrier<'a>;
    fn read<'a, S: Store<'a>>(self, storage: &mut S) -> Self::Carrier<'a>
    where 'x: 'a ;
    fn alloc<'a, S: Store<'a>>(self, storage: &mut S) -> Ptr<'a, Self::Carrier<'a>>
    where 'x: 'a;
}

In this case, it seems like type Carrier<'a> where Self: 'a would be incorrect, but I wrote up a basic implementation of it with the syntax proposed here and it seems to compile on nightly, I guess because the actual self lifetime is longer than the 'a bound, though I'm sure there are programs that would benefit from similar indirect structures in which that wouldn't always be the case.

Also I have no idea if this is possible, but in playing around with this feature with the thought of making stuff generic over lifetimes, it would be cool if there was a way to add sugar or something to enable this:

trait Thing {
    fn f<'a>(self, ctx: &'a Ctx) -> Self<'a>;
}

// Desugars to something like:
impl<'x> Thing for A<'x> {
  type Assoc<'a> = A<'a>;
  fn f<'a>(self, ctx: &'a Ctx) -> Self::Assoc<'a> { .. }
}
P-E-Meunier commented 3 years ago

GATs are probably the only feature I've been missing in Rust since 1.0, I'm delighted to see progress on that.

@nikomatsakis, I do have an example where Self: 'a isn't always the case: in Sanakirja, an on-disk key-value store I wrote, memory management is a bit messy. I'd also like to extend it by implementing other transactional datastructures in the future.

Since I don't know the kind of details yous need, here is a quick summary of how Sanakirja works:

Now, there's a choice about the lifetime of values retrieved from the database: we may choose to have them borrow the transactions, or to have them borrow individual tables. In the former case, a GAT may look like:

trait TableIterator {
  type Iter<'txn, K, V>: Iterator<(&'txn K, &'txn V)>;
  fn iter(&'txn self, db: &'txn Db<K, V>) -> Self::Iter<'txn, K, V>;
}

In the latter case, we'll have the following instead:

trait TableIterator {
  type Iter<'db, K, V>: Iterator<(&'db K, &'db V)>;
  fn iter(&'txn self, db: &'db Db<K, V>) -> Self::Iter<'db, K, V>;
}

This could work with an extra mechanism to make sure Db isn't alive after the end of a transaction, for example.

As a conclusion, I don't have a strong opinion about what the default should be, since I understand this is a rather niche use case. I haven't actually tried to implement these using nightly + GAT, but I could try if yous need more feedback.

eira-fransham commented 3 years ago

I feel like you're always going to want to allow where Self: 'a even if you don't always require it, so why not stabilise it with the requirement that you must always annotate and add elision later as a separate RFC? That's what we did for lifetime elision in impl specs (e.g. impl Trait for &'_ Foo). This is now, as far as I know, the only remaining blocker for GATs being stabilised so it seems silly to hold that up while we bikeshed this considering that there exists a MVP solution that would be trivial to improve on later in a backwards-compatible way.

nikomatsakis commented 3 years ago

Hi all! I'm reading these responses and trying to summarize.

@P-E-Meunier what would be useful is to see not only the trait definition, but prospective impl definitions

@ammkrn the same! What does an impl for the trait IsStored look like? You mentioned that you prototyped your setup on nightly, can you link to that on the Rust playground perhaps?

ammkrn commented 3 years ago

@nikomatsakis

I put it in a gat_nightly branch, here are the impls for one base type and its pointer. https://github.com/ammkrn/nanoda_lib/blob/4c46ec08cb80e12de1cf5a312f17c3fd67feb228/src/utils.rs#L370 https://github.com/ammkrn/nanoda_lib/blob/4c46ec08cb80e12de1cf5a312f17c3fd67feb228/src/utils.rs#L409

This application's state management is somewhat involved for performance reasons, but I think this issue boils down to what I believe is a GAT use case that's been discussed quite a bit, which is generalizing over the lifetime of a container.

ctx is the telescope of stores,'a is the lifetime of the outermost/shortest-lived backing store, and the one we have mutable access to. 'x is any lifetime equal to or longer than 'a. The two ideas are: "given a pointer to an object known to live at least as long as the shortest-lived store, I can return the pointed-to data, though I can only guarantee it lives as long as the shortest-lived store", and "given some data, I can allocate it and return a pointer, but I can only promise the pointer is good for the lifetime of the shortest-lived store".

or: read: ∀ 'x >= 'a, (Ptr<'x>, Store<'a>) -> Data<'a> alloc: ∀ 'x >= 'a, (Data<'x>, Store<'a>) -> Ptr<'a>

jgarvin commented 3 years ago

I ran into a simple case using GATs where I didn't need the clause. I have a trait representing mutable pointers so that I can write generic functions that work both with regular pointers and compressed pointers. Since it's for pointers, there are no lifetimes.

pub trait MutPointer<T>:
{
    type Cast<U>: MutPointer<U>;

    fn cast<U>(self) -> Self::Cast<U>;

    // ... other less relevant methods
}

This is analogous in C++ to std::static_pointer_cast and related functions which work both on regular pointers and smart pointers like shared_ptr and unique_ptr. One problem that still exists with this is that MutPointer<U> could still be some other smart pointer type when it should really be the same generic with a different type plugged in (in other words you shouldn't be able to implement the trait such that shared_ptr<T> is castable to unique_ptr<U>, only shared_ptr<U>). I think doing it right might require HKTs. Something like:

pub trait MutPointer<P<T>>:
{
    fn cast<U>(self) -> P<U>;
}
dbdeviant commented 2 years ago

I recently came across a scenario where I didn't need the clause. The library performs async serialization for any given user provided format (basically async serde). In my case, only the serializer trait needed the Self lifetime annotation, as the returned future from the method holds on to a reference to the data until complete. The deserializer owns its own internal buffers, and as such just defines an associated function for which no Self reference is needed.

In this case, it would seem that option 6 would be necessary to properly infer the type. (Full disclosure: new rustacean here, so there might be a better way to do this.)

My type definitions:

/// Serialize a data structure asynchronously.
pub trait AsyncSerialize: Encodable
{
    /// The concrete [future](Future) returned by the `serialize` method.
    type Future<'w, F, W>: Future<Output=Result<(), F::Error>> + Unpin
    where
        Self: 'w,
        F: 'w + FormatSerialize,
        W: 'w + AsyncWrite + Unpin,
    ;

    /// Attempt to serialize the type asynchronously for the indicated [format](Format)
    /// via the provided [asynchronous writer](AsyncWrite).
    fn serialize<'w, F, W>(&'w self, format: &'w F, writer: &'w mut W) -> Self::Future<'w, F, W>
    where
        F: FormatSerialize,
        W: AsyncWrite + Unpin,
    ;
}

/// Deserialize a data structure asynchronously.
pub trait AsyncDeserialize: Decodable
{
    /// The concrete [future](Future) returned by the `deserialize` method.
    type Future<'r, F, R>: Future<Output=Result<Self, F::Error>> + Unpin
    where
        F: 'r + FormatDeserialize,
        R: 'r + AsyncRead + AsyncBufRead + Unpin,
    ;

    /// Attempt to deserialize the type asynchronously for the indicated [format](Format)
    /// via the provided [asynchronous reader](AsyncRead).
    fn deserialize<'r, F, R>(format: &'r F, reader: &'r mut R) -> Self::Future<'r, F, R>
    where
        F: FormatDeserialize,
        R: AsyncRead + AsyncBufRead + Unpin,
    ;
}
madsmtm commented 2 years ago

I have a use case (that may be similar to the others people have noted, if so, apologies).

A sealed marker trait Ownership with two implementors Owned and Shared is used in objc_id inside a reference-counted type Id<T, O: Ownership>, to specify whether you have mutable or immutable access to the data.

In some cases I would like to consume the Id and return the appropriate reference (bound to some other type, in this case AutoreleasePool); either an &mut T for Owned data or &T for Shared data.

This doesn't require Self: 'a.

trait Ownership {
    type Reference<'a, T: 'a>;

    unsafe fn as_ref_pool<'p, T: 'p>(
        pool: &'p AutoreleasePool,
        ptr: *mut T,
    ) -> Self::Reference<'p, T>;
}

enum Owned {}

impl Ownership for Owned {
    type Reference<'a, T: 'a> = &'a mut T;

    unsafe fn as_ref_pool<'p, T: 'p>(
        pool: &'p AutoreleasePool,
        ptr: *mut T,
    ) -> Self::Reference<'p, T> {
        // SAFETY: Bound by function signature (bound to pool)
        // Pointer validity is up to caller
        &mut *ptr
    }
}

enum Shared {}

impl Ownership for Shared {
    type Reference<'a, T: 'a> = &'a T;

    unsafe fn as_ref_pool<'p, T: 'p>(
        pool: &'p AutoreleasePool,
        ptr: *mut T,
    ) -> Self::Reference<'p, T> {
        // SAFETY: Bound by function signature (bound to pool)
        // Pointer validity is up to caller
        &*ptr
    }
}
nikomatsakis commented 2 years ago

Thanks @madsmtm! I believe your use case will work fine with the current plan.

nikomatsakis commented 2 years ago

Brief update:

The current plan is to adopt a conservative stance. We'll apply an analyses to determine when we believe that a default ought to be applied. If we find that a default would be appropriate, we will check if that where clause is written explicitly. If not, we will error.

In other words, we are not doing defaults, but we are forcing users to write the where clauses that we would have done by default.

At the same time, we'll include an informational note asking people to give feedback on whether these where clauses caused them problems in an issue. We'll also explain the workarounds to disable the default if it is a problem (so folks are not blocked in the interim).

After some time, we can opt to either make the defaults truly defaulted or take some other action.

My expectation is that there will be no problems with the defaults: they are quite narrowly targeted and the chance of false positive seems to me to be remote. But better safe than sorry!

cuviper commented 2 years ago

I was just experimenting with GAT for rayon's with_producer, but it needs an HRTB callback, and I can't see how to constrain that. Here's an artificial self-contained example: playground

#![feature(generic_associated_types)]

use std::iter::Map;
use std::slice::Iter;

pub trait WithStrings {
    type Strings<'a>: Iterator<Item = String>
    where
        Self: 'a;

    fn with_strings<R>(self, f: impl FnOnce(Self::Strings<'_>) -> R) -> R;
}

impl<T: ToString, const N: usize> WithStrings for [T; N] {
    type Strings<'a>
    where
        Self: 'a,
    = Map<Iter<'a, T>, fn(&T) -> String>;

    fn with_strings<R>(self, f: impl FnOnce(Self::Strings<'_>) -> R) -> R {
        f(self.iter().map(T::to_string))
    }
}

(I'm aware this could use array into_iter(), but that's not the point.)

error[E0311]: the parameter type `T` may not live long enough
  --> src/lib.rs:20:8
   |
14 | impl<T: ToString, const N: usize> WithStrings for [T; N] {
   |      -- help: consider adding an explicit lifetime bound...: `T: 'a +`
...
20 |     fn with_strings<R>(self, f: impl FnOnce(Self::Strings<'_>) -> R) -> R {
   |        ^^^^^^^^^^^^ ...so that the type `[T; N]` will meet its required lifetime bounds

I can't put Self: 'a or T: 'a on that AFAIK, or is there a way with explicit for<'a>?

But I kind of imagine a "default" Self: 'a could insert itself in that implicit HRTB somehow.

eira-fransham commented 2 years ago

@cuviper I believe that T: 'static is the only correct bound, since the function is for<'a> and so it’s only valid if &'a T is valid for any 'a, including 'static. You could add explicit lifetime bounds on the method in the trait definition to relax that bound though.

cuviper commented 2 years ago

@Vurich I'm not sure about GATs, but I don't think that's true of for<'a> in general. Iterator::filter has a similar case with its P: FnMut(&Self::Item) -> bool, and this does not require Self::Item: 'static.

I can't use a lifetime parameter on the method though, because it needs a local borrow -- like the self.iter() in my example.

jackh726 commented 2 years ago

GATs issue triage: blocking. I'm currently working on the lint. At minimum, we want to land it before stabilization. But we should consider if we want to do a crater run (not sure how helpful it will be), we should double check against the examples posted here, and we should consider how long we feel like we have to let it sit before stabilizing GATs. But otherwise, hopefully lint will be ready soonish™.

nikomatsakis commented 2 years ago

@rfcbot fcp merge

The currently plan (implemented in #89970, modulo some ongoing discussion):

I'm therefore moving to FCP that decision for the lang team.

rfcbot commented 2 years ago

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot commented 2 years ago

:bell: This is now entering its final comment period, as per the review above. :bell:

cramertj commented 2 years ago

I have another example that doesn't want this default, though it's really just another case of the kind I described before where the Self type is some kind of function mapping inputs to outputs that may reference the lifetime-generic input. This one is something reminiscent of async closures:

pub trait FnOnceWithPinnedArg<Arg> {
    type Output;
    type Fut<'a>: Future<Output = Self::Output>
    where
        Arg: 'a;
    fn call<'a>(self, arg: Pin<&'a mut Arg>) -> Self::Fut<'a>;
}

impl<Arg, F, Output> FnOnceWithPinnedArg<Arg> for F
where
    F: for<'a> FnOnce<(Pin<&'a mut Arg>,), Output: Future<Output = Output>>,
{
    type Output = Output;
    type Fut<'a>
    where
        Arg: 'a,
    = <F as FnOnce<(Pin<&'a mut Arg>,)>>::Output;
    fn call<'a>(self, arg: Pin<&'a mut Arg>) -> Self::Fut<'a> {
        (self)(arg)
    }
}

The full example doesn't work correctly because you can't explicitly write out the lifetimes of closures, but here's a playground showing most of it: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=2cc46a762c3c9064cd7ad7bdd1dc4ba5

jackh726 commented 2 years ago

I have another example that doesn't want this default, though it's really just another case of the kind I described before where the Self type is some kind of function mapping inputs to outputs that may reference the lifetime-generic input. This one is something reminiscent of async closures:

The current algorithm should handle this correctly!

rfcbot commented 2 years ago

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

dbdeviant commented 2 years ago

This ended up breaking the code that I mentioned above, and I don't (readily) see any way to disable the error. Is there a way to get this to compile again?

For instance:

error: Missing required bounds on SerializeUnit
  --> diny_core/src/backend/format.rs:54:5
   |
54 |     type SerializeUnit<'w, W>: Future<Output=Result<(), Self::Error>> + Unpin where W: 'w + io::AsyncWrite + Unpin;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
   |                                                                                                                   |
   |                                                                                                                   help: add the required where clauses: `, Self: 'w`
jackh726 commented 2 years ago

If the required bounds actually break code (in terms of usage, not just in terms of requiring new bounds), then please file an issue since that's a case we should track.

The "workaround" here is to move the associated type into a super trait without any functions.

JanBeh commented 2 years ago

I'm getting an (unexpected) compiler error which refers to this issue:

#![feature(generic_associated_types)]

trait X<'a> {
    type Y<'b>;
    // when I add the following line,
    // it is required that 'b outlives 'a
    fn foo(&self) -> Self::Y<'static>;
    // why?
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error: missing required bound on `Y`
 --> src/lib.rs:4:5
  |
4 |     type Y<'b>;
  |     ^^^^^^^^^^-
  |               |
  |               help: add the required where clause: `where 'b: 'a`
  |
  = note: this bound is currently required to ensure that impls have maximum flexibility
  = note: we are soliciting feedback, see issue #87479 <https://github.com/rust-lang/rust/issues/87479> for more information

error: could not compile `playground` due to previous error

Is this compiler error related to the this issue? What I actually want to do is the following (workaround provided as comments), which also does not compile:

#![feature(generic_associated_types)]

trait X<'a> {
    type Y<'b>;
    //type StaticY; // = Y<'static>
    //fn foo(&self) -> Self::StaticY;
    fn foo(&self) -> Self::Y<'static>;
    fn bar<'b>(&self, arg: Self::Y<'b>);
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error: missing required bound on `Y`
 --> src/lib.rs:4:5
  |
4 |     type Y<'b>;
  |     ^^^^^^^^^^-
  |               |
  |               help: add the required where clause: `where 'b: 'a`
  |
  = note: this bound is currently required to ensure that impls have maximum flexibility
  = note: we are soliciting feedback, see issue #87479 <https://github.com/rust-lang/rust/issues/87479> for more information

error: could not compile `playground` due to previous error

I'm not sure if this is a compiler bug. See also discussion "Weird bounds demanded for GAT" on URLO, particularly @QuineDot's comment:

[…], why have a 'static-only-using GAT instead of a non-G associated type, unless you're planning to expand in the future and want to do so backwards-compatibly?

The actual use case (with current workaround) is this:

pub trait Machine<'a> {
    type Datum<'b, 'c>
    where
        Self: 'b;
    type Function<'b>: for<'c> Function<Datum<'c> = Self::Datum<'b, 'c>>
    where
        Self: 'b;
}

pub trait Function {
    type Datum<'c>;
    /* … */
}

pub trait Module<'a, 'b>: Sized {
    type Machine: Machine<'a>; // TODO: replace with `type Datum<'c>;`, when compiler is fixed
    fn open(&self, name: &str) -> Result<Self, MachineError>;
    fn get(
        &self,
        name: &str,
    ) -> Result<<Self::Machine as Machine<'a>>::Datum<'b, 'static>, MachineError>;
    fn set<'c>(
        &self,
        name: &str,
        datum: <Self::Machine as Machine<'a>>::Datum<'b, 'c>,
    ) -> Result<(), MachineError>;
}

This works, but when I use type Datum<'c>; as GAT instead, I'll run into the described problem.

nikomatsakis commented 2 years ago

@JanBeh I'm a bit surprised by that message; @jackh726 may be able to explain it.

jackh726 commented 2 years ago

My guess is we see <Self as X<'a>::Y<'static>, so we ask "do we know that 'static: 'a?", to which the answer is "yes", so we then require that 'b: 'a.

Probably the right thing to do is to just not have any required bounds for when 'static is used.

nikomatsakis commented 2 years ago

Ah, interesting.

QuineDot commented 2 years ago

I'd also note that the presence of

    fn bar<'b>(&self, arg: Self::Y<'b>);

does not "cancel" the lint/implied bound, whereas the presence of

    fn quz(&self) -> Self::Y<'a>; // 'a from trait parameter

which if I understand correctly, only implies 'b: 'b, does cancel the lint/implication.

Should bar also make the implied bounds indeterminate, since it involves the GAT but doesn't also imply 'b: 'a (or 'static: 'a)?

jackh726 commented 2 years ago

Only GATs returned by trait methods participate in the lint. The rationale is that types returned by a method are actually constrained to be "derived" from the input types, whereas that argument doesn't apply to the input args. (Niko and I did talk about this and I don't think it was necessarily an open and shut case - open to discussion here)

jackh726 commented 2 years ago

@JanBeh I have a fix for your issue in the PR above.

JanBeh commented 2 years ago

@JanBeh I have a fix for your issue in the PR above.

Many thanks for looking into it. My code compiles now.

However, due to to other constraints I'll keep using type Machine: Machine<'a>; (with Self::Machine as Machine<'a>>::Datum<'b, 'c>) instead of type Datum<'c>; as GAT, because I cannot do:

pub trait HasModules<'a>: Machine<'a> {
    // 'c will be undefined and not sure how to use a universal quantifier here:
    type Module<'b>: Module<'a, 'b, Datum<'c> = Self::Datum<'b, 'c>>
    where
        Self: 'b;
    fn module<'b>(&'b self, name: &str) -> Result<Self::Module<'b>, MachineError>;
}

Anyway, that's a different problem, and I don't think I need it solved because my workaround above isn't that bad (and makes some other things easier, so maybe it isn't even a workaround but the canonical thing to do).

101arrowz commented 2 years ago

I think this change broke some lifetime checks, as seen in #97515.

QuineDot commented 1 year ago

The compiler points to this issue soliciting feedback. The issue should be re-opened, or the error should point to some other issue where feedback is welcome (or the error message should otherwise be changed to indicate feedback is no longer welcome).

2 |     type Bar<'a>;
  |     ^^^^^^^^^^^^-
  |                 |
  |                 help: add the required where clause: `where Self: 'a`
  |
  = note: this bound is currently required to ensure that impls have maximum flexibility
  = note: we are soliciting feedback, [see issue #87479 <https://github.com/rust-lang/rust/issues/87479>](https://github.com/rust-lang/rust/issues/87479) for more information

Only GATs returned by trait methods participate in the lint. The rationale is that types returned by a method are actually constrained to be "derived" from the input types, whereas that argument doesn't apply to the input args. (Niko and I did talk about this and I don't think it was necessarily an open and shut case - open to discussion here)

This no longer seems to be the case.

nikomatsakis commented 1 year ago

Agreed! We are still soliciting data.

dhedey commented 1 year ago

I've just been pointed to this issue, where the suggested where Self: 'a bound was incorrect - so I thought it worth noting it here as an additional example.

Roughly speaking, there was a template object, and there was a method to create an instance from the template and install a runtime reference into it.

Adding a bound that the Template outlives the Runtime was too restrictive and caused issues elsewhere.

pub trait Template {
    type Instance<'r>;

    fn create_with_runtime<'r>(
        &self,
        runtime: &'r Runtime
    ) -> Self::Instance<'r>;
}

Personally, I prefer explicit code, so I'd rather require a where Self: 'a bound is explicitly specified where necessary. I think the bound is rather natural to need to add, and the syntax is actually quite nice.

If the bound was added automatically, there wouldn't even be an obvious link to this issue with the workaround, in the case this bound were not required.

yshui commented 1 year ago

I have another case where this bound is undesirable. Although this might be somewhat similar to https://github.com/rust-lang/rust/issues/87479#issuecomment-952205597

Imagine we have an AsIterator trait for things whose references can be converted to an iterator:

pub trait AsIterator {
    type Item<'a> where Self: 'a;
    type Iter<'a>: Iterator<Item=Self::Item<'a>> where Self: 'a;
    fn as_iter(&self) -> Self::Iter<'_>;
}

In a function that uses this trait, we might want to be able to say something about the items returned:

fn bound<T>() where T: for<'a> AsIterator<Item<'a> = &'a u32> {

}

However, this essentially puts a T: 'static requirement because of this Self: 'a bound. This is also somewhat related to: https://sabrinajewson.org/blog/the-better-alternative-to-lifetime-gats

A workaround suggested in the article is:

pub trait IteratorItem<'a> {
    type Item;
}
pub trait AsIterator {
    type Item: ?Sized + for<'a> IteratorItem<'a>;
    type Iter<'a>: Iterator<Item=<Self::Item as IteratorItem<'a>>::Item> where Self: 'a;
    fn as_iter(&self) -> Self::Iter<'_>;
}

And use type Item = dyn for<'a> IteratorItem<'a, Item = &'a u32>; in impl, basically emulating lifetime GAT with HRTB trait objects, to workaround the required where Self: 'a bound.

QuineDot commented 1 year ago

Ran into this again today in the context of abstracting over owning or borrowing a known-'static type.

// Does not compile due to this lint
trait Get {
    type Ref<'a>;
    fn get(&self) -> Self::Ref<'_>;
}

impl Get for Owning {
    type Ref<'a> = Ret<'a>;
    fn get(&self) -> Self::Ref<'_> {
        Ret(&self.value)
    }
}

impl<'a> Get for Borrowing<'a> {
    type Ref<'any> = Ret<'a>;
    fn get(&self) -> Self::Ref<'a> {
        self.owned.get()
    }
}

And I would like to stress again that the workaround which maintains GAT ergonomics is unintuitive and pollutes the trait, and yet is still preferable to the "official" workaround of a lifetime carrying trait in my estimation.

 trait Get {
     type Ref<'a>;
     fn get(&self) -> Self::Ref<'_>;
+    fn _silence_incorrect_lint(_: &Self::Ref<'_>) {}
 }

Therefore, a better way to disable the lint / inference should be designed and implemented before the inference is finalized, assuming the inference isn't dropped.

jackh726 commented 1 year ago

@dhedey

Can you please post more information about the issues you ran into? A small snippet of code is not very helpful to understand the full problem.

@yshui

The problem you ran is because of a bug. Once fixed, the where Self: 'a bound should be fine.

@QuineDot

Having trouble figuring out the problem here. Yes, the lint fires and might seem at first glance wrong, but following the compiler's suggestions leads this code to compile fine. Is there a larger piece of code that demonstrates a problem?

dhedey commented 1 year ago

@jackh726 - Sure. I'll try to describe the setup, in case some additional context helps.

We're running single-threaded. We had a setup where we were creating WASM instances. Many of these instances were going to be the same code, but needed to be created with clean state. These instances were therefore created from a template. Upon instantiation, these templates captured a reference to a runtime, which they could then use for sys-calls. We were trialling a few different WASM runners, so were abstracting with traits over their common interfaces.

Each WASM instance we create starts off with some code, with the process being: take the WASM code, instrument it and prepare it to get a Template. Then, when you want to run an instance, you call a method on the template to instantiate the template, passing a reference to the runtime.

This returns an instance which captures the reference, and thus has a lifetime which is shorter than the runtime - and, importantly, independent of the Template. (The template's lifetime is irrelevant after each instantiation, although typically, is actually longer than the runtime, but could potentially be shorter).

We therefore produce this trait - and importantly, Instance<'r> should not have a where Self: 'r bound - because, as mentioned, the lifetime of the template is not related to the lifetime of the instance.

pub trait Template {
    type Instance<'r>: Instance;

    fn create_with_runtime<'r>(
        &self,
        runtime: &'r Runtime
    ) -> Self::Instance<'r>;
}

It could be that I'm misunderstanding some corner case, or using GATs wrong, if so, happy to be corrected - although I imagine other people may want to do something similar in future.

nikomatsakis commented 1 year ago

@dhedey I made a standalone version of your code...

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c45d7fe1d38cf8b8c69bafeec266e7c9

...just to be clear, though, this pattern compiles today (and hence would not be affected by any default that matched the required bounds), right?

jackh726 commented 1 year ago

Right @dhedey, I think this is fine. The title of this issue is not 100% accurate, because it's exactly requiring where Self: 'a, but rather more "intuitively" requiring what can already be proven when using the GAT (and gets used in the GAT projection).

For your example, the anonymous lifetime in &self doesn't get passed to Self::Instance, so we don't require the Self: 'r bound to be written.

dhedey commented 1 year ago

@nikomatsakis , @jackh726 - ahh sorry, I didn't think to use the playground. I think in the interests of simple exposition I overly simplified the previous code - there was another lifetime on eg MyTemplate<'a> which tripped things up.

So I've taken your playground link and updated it to something which only compiles if where Self: 'r is not included: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=30ef9ff7bb5d02862fb7ccc488fcb7de

But - as you rightfully point out, it compiles fine without the where Self: 'r bound, and there doesn't appear to be a lint telling me to add it in the playground / on latest rust version, so it's working splendidly now.

At the time, I definitely somehow got a lint which told me I had to add where Self: 'r / directed me to this page - and I believe that adding this bound then broke my code - but it could be that (A) I'm mis-remembering something and my code did reference the lifetime of &self somehow in one iteration which led me here, and then I got confused by the initial comment in the PR, or (B) this behaviour has got more clever in more recent versions of rust.

Either way it's working splendidly now, so apologies for wasting anyones' time - and thanks for all the great work everyone's doing on Rust :)

yshui commented 1 year ago

@jackh726

The problem you ran is because of a bug. Once fixed, the where Self: 'a bound should be fine.

Is there a related issue for this, if this is a bug?

QuineDot commented 1 year ago

90696 and #111852 have programs where eliding the Self: 'a GAT bound works around some bugs.

@yshui these may be the related issue(s) you asked about?

spikespaz commented 10 months ago

I have a problem when I want to use a type with a trait using where Self: 'a in a loop, borrowing mutably.

I have defined the trait as follows:

pub trait Buffered
where
    Self: Iterator,
{
    type ItemSlice<'a>
    where
        Self: 'a;

    /// Consume up to `count` items from the internal iterator, moving them into
    /// the buffer. Return an optional reference to the buffer's items.
    ///
    /// If the iterator did not contain enough items to satisfy `count`, `None`
    /// will be returned. In this case, the only way to get the remaining items
    /// out is by consuming the iterator normally.
    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>>;
}

Now I use it in two places:

impl<S> Buffered for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type ItemSlice<'a> = &'a [S::Item] where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<&[S::Item]> {
        if self.buffer.len() < count {
            self.buffer
                .extend(self.iter.by_ref().take(count - self.buffer.len()));
        }
        self.buffer.get(0..count)
    }
}

That one was easy. But now, if I want to have SourceChars (a wrapper on any Iterator<Item = u8> that iterates characters) , I can't use self.0.buffer in a loop because it has already been borrowed in a previous iteration of the loop.

impl<S> Buffered for SourceChars<S>
where
    for<'a> S: Iterator<Item = u8> + Buffered<ItemSlice<'a> = &'a [u8]> + 'a,
{
    type ItemSlice<'a> = &'a str where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        for byte in 0.. {
            let buf = self.0.buffer(byte)?;
            if let Ok(slice) = std::str::from_utf8(&buf) {
                if slice.chars().count() >= count {
                    return Some(slice);
                }
            }
        }
        None
    }
}
error[E0499]: cannot borrow `self.0` as mutable more than once at a time
   --> src/parser/iter.rs:111:23
    |
109 |     fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
    |               - let's call the lifetime of this reference `'1`
110 |         for byte in 0.. {
111 |             let buf = self.0.buffer(byte)?;
    |                       ^^^^^^ `self.0` was mutably borrowed here in the previous iteration of the loop
...
114 |                     return Some(slice);
    |                            ----------- returning this value requires that `self.0` is borrowed for `'1`

I'm being told that the real representation I want is impossible with the current syntax for GATs.

Kixunil commented 7 months ago

I struggle to understand why not having the where bound is an issue with GAT but not with this trick I use to simulate GAT in older MSRVs:

trait EncodeTc<'a> {
    type Encoder: Encoder;
}

trait Encode: for<'a> Self: EncodeTc<'a> {
    fn encoder(&self) -> <Self as EncodeTc<'_>>::Encoder;
}

I don't see anything telling the compiler that Self: 'a yet AFAICT this works perfectly fine. I've even implemented a macro that can use GAT syntax to impl the trait (so that once MSRV bumps it'll be a simple change). Is there a problem lurking there that I will hit eventually? Also why did the GAT work take so much time if it looks like it could be just a syntax sugar for what I wrote? I must be missing something.

QuineDot commented 5 months ago

@Kixunil There are situations where your pattern fails because the implementation of EncodeTc<'a> requires Self: 'a.

For example:

impl Encoder for &&str {}
impl<'a, 'b> EncodeTc<'a> for &'b str {
    type Encoder = &'a &'b str;
}

Can't work without the bound:

error[E0491]: in type `&'a &'b str`, reference has a longer lifetime than the data it references
  --> src/lib.rs:13:20
   |
13 |     type Encoder = &'a &'b str;
   |                    ^^^^^^^^^^^

So you add the bound...

// This part now compiles due to `'b: 'a`
impl<'a, 'b: 'a> EncodeTc<'a> for &'b str {
    type Encoder = &'a &'b str;
}

...but now the HRTB can't hold due to the extra constraint (which is not present in the HRTB).

impl<'b> Encode for &'b str {
    fn encoder(&self) -> <Self as EncodeTc<'_>>::Encoder {
        self
    }
}

(intermediate playground)

error: implementation of `EncodeTc` is not general enough
  --> src/lib.rs:16:21
   |
16 | impl<'b> Encode for &'b str {
   |                     ^^^^^^^ implementation of `EncodeTc` is not general enough
   |
   = note: `EncodeTc<'0>` would have to be implemented for the type `&'b str`, for any lifetime `'0`...
   = note: ...but `EncodeTc<'_>` is actually implemented for the type `&'1 str`, for some specific lifetime `'1`

You can work around this like so:

// Default type parameter with an implied `Self: 'a` bound
//               vvvvvvvvvvvvvvvvvvvvvvvvvvv
trait EncodeTc<'a, _LifetimeBound = &'a Self> {
    type Encoder: Encoder;
}

//   vvvvvv  Explicit `'b: 'a` bound no longer required
impl<'a, 'b> EncodeTc<'a> for &'b str {
    type Encoder = &'a &'b str;
}

And now the HRTB can be met, as the HRTB honors the implied bound.


The implied bound in the default type parameter is telling the compiler that Self: 'a. With GATs, you use a where clause on the GAT instead.