rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0
5.97k stars 1.57k forks source link

`#[derive(Default)]` on enum variants with fields #3683

Open estebank opened 3 months ago

estebank commented 3 months ago

Support the following:

#[derive(Default)]
enum Foo {
    #[default]
    Bar {
        x: Option<i32>,
        y: Option<i32>,
    },
    Baz,
}

Rendered

estebank commented 3 months ago

Follow up to https://github.com/rust-lang/rfcs/pull/3107. Noticed as an easy gap to fill during implementation of https://github.com/rust-lang/rfcs/pull/3681.

Lokathor commented 3 months ago

Under drawbacks, I'd argue that this actually reduces complexity of the standard library.

People know about derive(Default) giving some sort of basic Default impl. When there's less special cases when we have to say "except it won't work here, here, and here", that means there's less complexity in the standard library.

estebank commented 3 months ago

@Lokathor I'm unsure if I'm reading you right, but that sounds like a positive to me 😄

Lokathor commented 3 months ago

Yes, exactly my point. The RFC text says this is a drawback, to have increased complexity. That might be true from a compiler internals perspective, or whatever. However, from an end user perspective, things that "just work" without needing exceptions listed for when they won't work in rare cases, those things are less complex.

So, in other words, I'd delete that part of the Drawbacks section. This doesn't increase complexity from a user's perspective.

Lokathor commented 3 months ago

@jplatte that's listed in the Drawbacks section i think, could you be more specific about what you expect the RFC to say about generics?

jplatte commented 3 months ago

I was wondering what the generated bounds would be on a generic enum that uses this feature.

Currently, #[derive(Default)] on a generic enum never emits any bounds: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=efec38bf6b014835f97773e9cc30067c

Let's consider this

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

I guess there wouldn't be any bounds on U, otherwise that would be inconsistent with the existing behavior. But would there be a T: Default bound, or an Option<T>: Default bound? I think the latter could be justified with enums not supporting private fields, unlike structs.

Lokathor commented 3 months ago

There would have to be a bound on Option<T> being default, because the expansion is to have the code be Foo{ field: Default::default() }, and field has the Option<T> type. That's the minimum amount of bounding.

It's possible that additional bounds might end up in the mix, because we don't always have perfect derives, as mentioned in the Drawbacks section. For example, the "current behavior" could arguably only apply to unit variants of enums being the default, and field-having variants would (accidentally) drag in all the generics.

jplatte commented 3 months ago

There would have to be a bound on Option<T> being default, because the expansion is to have the code be Foo{ field: Default::default() }

I don't think there would have to be. The more "obvious" non-perfect-derive expansion would be T: Default, just like for structs.

Lokathor commented 3 months ago

Well, T being Default has nothing really to do with if Option<T>, the actual type of the field, is also Default.

Or to be more general, if we have a field type using a generic, F<T> say, then we only care that F<T> is Default, not T itself.

And the Proc Macro doesn't know which case it is when it generating the impl. So, just generating a bound based on just T itself is almost fundamentally incorrect, and will generate bad impls in extremely trivial cases.

jplatte commented 3 months ago

Or to be more general, if we have a field type using a generic, F<T> say, then we only care that F<T> is Default, not T itself.

(emph mine)

I know all about this, but the "we only care" statement does not apply to any derives in std currently. That's why I think this RFC should make an explicit choice as to whether the Default derive should care (for enums) and emit bounds like [T; 3]: Default when a field of type [T; 3] exists in the enum variant, breaking consistency with existing derive logic but likely being useful in more places.

Might also be worth discussing potential interaction with private enum fields (not sure whether there's an RFC), as IIRC one of the things perfect derive has been criticized for is that it adds more cases of internal changes (adding a private field to a struct) possibly leaking out into public interfaces (bounds).

estebank commented 3 months ago

One thing I realized while I was out is that https://github.com/rust-lang/rfcs/pull/3681 would allow us to side-step the lack of perfect derive. The only way of doing perfect derive is with type system access. The other two alternatives are to add a way to specify bounds with an attribute, or force the struct to specify the bounds, which would make them more restrictive than necessary. But with default field values, if a field is given an explicit default then Default::default() on it doesn't need to be called. Which means that we can determine, purely with the AST, which type parameters might need the Default bound. So you could write

#[derive(Default)]
struct S<T> {
    a: Option<T> = None,
}

and the expansion would need no bounds on T. So that RFC would reduce the pressure to get perfect derives for a bit. This of course doesn't need to be part of that RFC, it can be "future work" because it would purely make more code compile.

jplatte commented 3 months ago

The only way of doing perfect derive is with type system access.

How so? You could generate where Option<T>: Default as a bound if that's the field type, no?

But with default field values, if a field is given an explicit default then Default::default() on it doesn't need to be called

Unfortunately this doesn't seem true to me. I would actually argue that #3681 makes things more complicated: Imagine if Option had a const fn default_some() -> Self where T: Default associated fn that returns Some(T::default()). There would be no way for the Default derive to know that field: Option<T> = Option::default_some() requires the bound. I guess for the = None case, things are indeed simpler though because that's not a function call so it can't have any bounds of its own.

estebank commented 3 months ago

How so? You could generate where Option<T>: Default as a bound if that's the field type, no?

The only issue that I could think of when doing that blindly is the case of a recursive type, like the following case:

#[derive(Clone)]
struct A<T> {
    x: Option<Box<A<T>>,
}

Currently the expansion in that case is

impl<T: Clone> Clone for A<T> {
    fn clone(&self) -> A<T> {
        A {
            self.x.clone(),
        }
    }
}

The naïve expansion of this with an attempt at perfect derives could be

impl<T> Clone for A<T> where A<T>: Clone {
    fn clone(&self) -> A<T> {
        A {
            self.x.clone(),
        }
    }
}

which would be accepted by the compiler, but never allow anyone to call <A<T> as Clone>::clone for any T. A simple side-step might be to identify that A is used in any of the fields and then revert to the current behavior of imperfect derives. It would make #[derive(X)] imperfect for any recursive type (even if unneeded), but allow for more accurate derives in every other case.

Imagine if Option had a const fn default_some() -> Self where T: Default

FWIW, that still requires Default to be const, which it isn't now, but point taken. I need to think more about that case, but I believe that the obligations are actually correctly propagated: you couldn't have an expression that has an implicit transitive bound that references a type parameter without that type parameter having that bound in the item definition.

Lokathor commented 3 months ago

I'm not sure that the fact that the derive might do a weird thing in a small fraction of cases is even particularly a problem, because people are never required to use the derive for Default.

Slightly separately i also think less cases would be weird with Default than with the Clone example you give.

Lokathor commented 3 months ago

@ehuss I just noticed that you tagged this T-lang, but isn't this more like a T-libs-api concern?

ehuss commented 3 months ago

Attributes are handled by the lang team. I added t-libs-api (similar to #3107). In general, teams should feel welcome to add themselves or other teams (and remove themselves if they want). I also expect teams to exercise good judgement in case other teams should be involved.

estebank commented 2 months ago

After a quick conversation with niko, he mentioned that the general version of perfect derives is blocked on resolving the cycle case when it is indirect:

struct A<T> {
    x: Option<Box<B<T>>,
}
struct B<T> {
    x: Option<Box<A<T>>,
}
impl<T> Clone for A<T> where B<T>: Clone {
    fn clone(&self) -> A<T> {
        A {
            self.x.clone(),
        }
    }
}
impl<T> Clone for B<T> where A<T>: Clone {
    fn clone(&self) -> B<T> {
        B {
            self.x.clone(),
        }
    }
}

This case is not detectable purely from the token tree/AST. @compiler-errors you're looking at the trait resolution side of this, right?

compiler-errors commented 2 months ago

@estebank: Not sure why I got a ping directly lol, but this is blocked on coinduction which is blocked on the new trait solver.

matthieu-m commented 2 months ago

As mentioned by @Lokathor above, reducing the number of special-cases is generally a win with regard to complexity (for the end-user).

Following on this, I would suggest moving on with the imperfect derive, as a struct would. Or any other derive than Default on an enum. Specifically, I would expand:

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

to

impl<T: Default, U: Default> Default for MyEnum<T, U> {
    fn default() -> Self {
        Self::Foo { field: Default::default() }
    }
}

It's very much imperfect:

But it's consistent, and while consistency may be the hobgoblin of little minds, it does make it easier to learn and use the language.

It's simple enough to write the implementation manually in the few cases where it matters, and it's not like we all aren't used to it. A later RFC/feature can take care of handling smarter derives for all.

Deriving `Clone` on enums. ```rust #[derive(Clone)] pub enum Enum { Foo(PhantomData), } ``` Expands to: ```rust pub enum Enum { Foo(PhantomData), } #[automatically_derived] impl ::core::clone::Clone for Enum { #[inline] fn clone(&self) -> Enum { match self { Enum::Foo(__self_0) => Enum::Foo(::core::clone::Clone::clone(__self_0)), } } } ``` When `PhantomData` is cloneable regardless of whether its generic type is. That's how it is...
jplatte commented 2 months ago

@matthieu-m It wouldn't be consistent with how #[derive(Default)] works on generic enums right now. I think the least surprising / most consistent option would rather be to add Default bounds for all generic parameters that are mentioned within the #[default] variant's field types.

Lokathor commented 2 months ago

Just for reference, and to expand on what jplatte said, when you have a unit variant as the default you get a default impl that doesn't depend on all the generics being Default, because the generics aren't fields of the unit type. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=be6e8ae45184fddb99d916ec6a237a25

So yes, a default bound on all generics regardless of where they're used is definitely too much bounding, and would not be consistent.

But again, I don't think we need a bound on the generics used by the default variant, what we would need is a bound on the exact field types used in the variant. The way that the generic is wrapped in other types is important. i32 is Default, but &i32 is not, so if a field is &T then a bound on T: Default is the wrong bound.

matthieu-m commented 2 months ago

@matthieu-m It wouldn't be consistent with how #[derive(Default)] works on generic enums right now. I think the least surprising / most consistent option would rather be to add Default bounds for all generic parameters that are mentioned within the #[default] variant's field types.

It's another possibility yes.

It's inconsistent with the derivation of every other trait, but no other trait has a #[default] annotation on a specific variant, so maybe it's good enough.

RalfJung commented 2 months ago

@estebank the enum in the PR description has no #[derive(Default)]. ;)

RalfJung commented 2 months ago

But again, I don't think we need a bound on the generics used by the default variant, what we would need is a bound on the exact field types used in the variant. The way that the generic is wrapped in other types is important. i32 is Default, but &i32 is not, so if a field is &T then a bound on T: Default is the wrong bound.

That's true, but it's not how our derives work, and this was discussed above.

I would find it quite surprising if we had two fundamentally different ways to generate code for derives of "things with fields". If we have a way to generate better derives (adding bounds on the fields, not the generic types), we should do that consistently.

The "enum variant without field" case is fairly harmless, but once there are arbitrary fields this is just going to look like more special cases, that work better in some situations but worse in others.

PoignardAzur commented 1 month ago

But with default field values, if a field is given an explicit default then Default::default() on it doesn't need to be called. Which means that we can determine, purely with the AST, which type parameters might need the Default bound.

Note that we have a discussion right now in #3681 about whether this is even possible.

Also, I think people are assuming that only the fields in the default branch would generate a default bound, but existing derive macros have no mechanism for being that specific.

That is:

Let's consider this

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

I guess there wouldn't be any bounds on U, otherwise that would be inconsistent with the existing behavior. But would there be a T: Default bound, or an Option<T>: Default bound?

You guessed wrong. It wouldn't be either of those, it would be T: Default, U: Default. (See discussion in #3681 for details.)

I think this distinction should be made clear in the RFC, because I think a lot of people would make the same assumption as the one in the quote.