rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0
5.96k stars 1.57k forks source link

[RFC] Default field values #3681

Closed estebank closed 2 weeks ago

estebank commented 3 months ago

Allow struct definitions to provide default values for individual fields and thereby allowing those to be omitted from initializers. When deriving Default, the provided values will then be used. For example:

#[derive(Default)]
struct Pet {
    name: Option<String>, // impl Default for Pet will use Default::default() for name
    age: i128 = 42, // impl Default for Pet will use the literal 42 for age
}

Rendered

Tracking:

orangecms commented 3 months ago

This is excellent! I'm sometimes dealing with binary data structures, mostly vendor specific headers, and so far felt like defaults / fixed values caused quite some duplication. This approach reduces that, thank you! 👏

estebank commented 2 months ago

I believe I've addressed most of the comments (and have collapsed conversations to make way for new ones). Feel free to add additional comments for anything that might be missing.

As a recurring thing: I believe we need to target the minimal version of this feature that does not block future expansion.

Whether building Config { .. } where Config has private fields is allowed one that needs to be defined somewhat one way or another. If it is allowed by default, then the interaction with #[non_exhaustive] needs to be defined. If it is allowed as opt-in, then the conversation can be delayed.

I strongly believe that allowing non-const default values, at least at start, is not a good idea, while it can always be extended later. I'm also looking forward for ~const Default to be a thing in the std.

Whether enum struct variant support should be included in the first version or not is one I flip-flop on myself: currently we can only #[default] unit variants, so the either we add the literal Enum::Variant { .. } support without expanded #[derive(Default)] support, extend the #[derive(Default)]/#[default] support to work only if all fields are defaulted, extend the support to work with Default::default() when possible with imperfect derives, or we withhold support for Enum::Variant { .. } until this gets resolved. None of the options are ideal, as they will all end up with a special case for general rules, one way or another.

I have a pretty functional implementation of the feature (with some of the above questions answered in the way that made it easiest to implement quickly: only const expressions are allowed, private fields can't be constructed and with no opt-in, enum variants can be defaulted and built) in https://github.com/rust-lang/rust/pull/129514, and it is already more functional than I imagined it would be.

jhpratt commented 2 months ago

Obligatory link to pre-RFC from December 2021, which was itself taken (in part) from a draft RFC from December 2018.

There was a fair amount of discussion for the former, with a smaller amount on the latter. Naturally quite a bit has changed in the ensuing six years, but the discussions remain relevant given the drafting history here.

matthieu-m commented 2 months ago

I'm afraid the section on Imperfect derives is too optimistic.

I suppose it makes sense intuitively that None doesn't require T: Default, but what about an arbitrary expression?

This RFC mentions #[serde(default)] as something that could eventually be removed (yes!), however #[serde(default)] supports much more than literals & identifiers: it supports arbitrary function calls.

And arbitrary function calls may (1) infer their generic arguments, so they are invisible at the syntax level and (2) and may impose requirements on those generic arguments.

For a "regular" use of this feature, the fact that arbitrary invisible requirements exist is a non-issue:

However I'm afraid for derive it's a tad more difficult...

To make things concrete:

pub struct NonEmptyVec<T> { ... }

impl<T> NonEmptyVec<T> {
    pub const fn default_singleton() -> NonEmptyVec<T> 
    where
        T: ~const Default,
    { ... }
}

#[derive(Default)]
pub struct MyThing<T> {
    foos: Vec<T> = Vec::new(),
    bars: NonEmptyVec<T> = NonEmptyVec::default_singleton(),
}

I don't think that #[derive(Default)] can assume that Vec::new() and NonEmptyVec::default_singleton() DO NOT require T: Default. And in fact, the latter does.

FlixCoder commented 2 months ago

At first glance, I thought this is a bad idea: we suddenly have the possibility to specify field defaults, that are different from the the Default trait implementation. This can be quite confusing. So MyStruct { .. } != MyStruct { ..Default::default() }.

But there is a really big benefit: being able to require mandatory fields, while still providing defaults to the other fields. This removes a LOT of derived builders with their own duplicate default definition for actually just data records without any logic. So I ended up with huge compile time costs and less ergonomy, because builders check existence of required fields at runtime and return errors (unless using compile time builders, with even more compile time cost).

So overall I really like this proposal!

scottmcm commented 2 months ago

I've wanted something like this for ages. I took a fresh read through here after the updates, and other than what looks like a vestigial section, it looks good to go to me! Let's see what the rest of the team thinks.

@rfcbot fcp merge

rfcbot commented 2 months ago

Team member @scottmcm has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. See this document for info about what commands tagged team members can give me.

tmandry commented 2 months ago

Overall I'm a huge fan of this RFC. I would like to see more on the treatment of #[non_exhaustive] and private fields, so let me provide a use case that can act as a bit of motivation.

I don't think we can realistically "punt" here: Disallowing use of Foo { .. } outside a module is very important for typestate patterns, and therefore any way of doing that will need to continue being supported in the future.

Our use case: The repr(C) options struct

Let's say we want to have a struct like the following.

#[repr(C)]
#[non_exhaustive]
pub struct zx_packet_page_request_t {
    pub command: u16,
    pub flags: u16 = 0,
    reserved0: u32 = 0,
    pub offset: u64,
    pub length: u64,
    reserved1: u64 = 0,
}

Here we have a repr(C) struct whose layout cannot change, but which might add fields that use up the space in reserved0 and reserved1 in the future. We are constrained in the future evolution that setting the bytes that correspond to reserved0 and reserved1 to zero must retain the same meaning as today.

It would not be appropriate to specify defaults for all fields, but we can specify defaults for some of them. The struct is #[non_exhaustive] because we might add more field names in the future, and any places that list all field names will need to account for this with ... It also has private fields because the names reserved0 and reserved1 might disappear (or change types) in the future.

We would like users to be able to specify a value of this struct like this: zx_packet_page_request_t { command: ZX_PAGER_VMO_READ, offset: 0, length: 128, .. }. This would allow for evolving the struct without introducing any source breakage.

Option 0: Don't support .. on structs with #[non_exhaustive] or private fields

As the RFC states today, we can choose not to support this (outside the defining module).

If we stopped here it would be a disappointing outcome, but this is forward-compatible with several of the other options here.

Option 1: Opt-in for #[non_exhaustive], always allow for private fields

For reasons brought up by @scottmcm, we probably can't allow it directly for #[non_exhaustive], so we would need a way to opt in anyway with a #[defaultable] attribute or similar. But we could choose to always allow .. initialization for private fields.

The main disadvantage is that users who want to use private fields to prevent construction would not be able to if those private fields are defaulted. They would have to leave off the default or use bare #[non_exhaustive] (without the opt in). The first is an ergonomic loss (needing to list _private: () in each initialization). The second is better and works today, but I would argue it makes the intent less obvious when reading the code.

This option is not backwards compatible with Option 0 in my view, because we should never silently allow structs which where unconstructible outside a module to suddenly become constructible.

Option 2: Opt-in for #[non_exhaustive], but private fields are unsupported

We could say that the use of private fields in this instance would be nice, but ultimately we just don't want to accept that .. can be used to initialize private fields; it would be too out of line with privacy as it is today in Rust. Any use of the "defaultable" opt-in that also had private fields would result in a compiler error.

This restriction mainly impacts the repr(C) use case, which could still be supported by exposing a second builder struct that contained only the public subset of fields. This builder struct would be a defaultable #[non_exhaustive] struct, with a .build() method that creates the actual struct, including setting the field names we don't want to expose. This would continue to be an ergonomic win for users, who get to forward-compatibly use the struct initializer syntax with .., but not for the crate authors who need to create such builders (or use a macro).

This option is backwards compatible with Option 0.

Option 3: Opt-in for both #[non_exhaustive] and private fields

Make a #[defaultable] attribute that opts into allowing .. initialization outside the defining module for both#[non_exhaustive] and private fields. Any non-defaulted private fields would be a compiler error, and adding public non-defaulted fields would be a semver breaking change.

This option is backwards compatible with both Option 0 and Option 2.

Analysis

We have to decide whether to accept or reject Option 1 as part of this RFC. Options 2 and 3 are backward compatible with Option 0, so we could choose to stick with Option 0 as written and reject Option 1.

Recommendation

After writing all this out, I find myself in favor of Options 2 and 3. I think a #[defaultable] attribute should apply equally to private fields and #[non_exhaustive], since those are in many ways equivalent today. It should also error in cases where a struct is not actually defaultable outside the defining module.

Since Option 0 is forward compatible with these I'm okay with the normative RFC text as written, but I would like to see some discussion of this in the future possibilities section if possible (feel free to copy this comment).

tmandry commented 2 months ago

@rfcbot reviewed

I'm okay with const-only initializers to start because we can always choose to relax this restriction later,[^relax] and there are several mitigating possibilities (described below) that can make this feature work better with Default. The only thing that would make me hesitate to accept this RFC is if we decided to do none of these, which I think is unlikely.

[^relax]: I have to keep reminding myself why this is, so this is a note to my future self. If we do relax the const restriction we will allow more types to be used with .. initializers, but only in non-const contexts. This won't break any existing usages in const contexts, of course, since those types were always const to begin with. For struct authors it does present the possibility of accidentally breaking uses of .. by adding non-const initializer exprs, which we could mitigate with a lint or by requiring more explicit syntax like braces. \ \ In the cases where .. will invoke #[derive(Default)], we would of course have to allow it without a lint on the struct. This is fine; there's a slight chance of semver breakage in going from all = constexpr to #[derive(Default)] but that could be easily fixed with #[derive(const Default)]. We can allow using .. which invokes Default in non-const contexts, and optionally lint on uses which invoke a manual non-const impl, since as the argument goes, people won't expect Foo { .. } to have side effects.

There are several future possibilities I would like to see mentioned in the RFC. But this is a non-normative section and I don't want to block this RFC from going forward.

programmerjake commented 2 months ago

https://github.com/rust-lang/rfcs/pull/3681#issuecomment-2359685895

After writing all this out, I find myself in favor of Options 2 and 3.

I agree, though tbh I prefer Option 3 over Option 2.

scottmcm commented 2 months ago

@tmandry I agree I'd like to move to something like option 3 (in the future, not here), though I've been spelling it differently in terms of how to opt-in, since we've talked about other targeted opt-outs of other parts of non_exhaustive too.

RalfJung commented 2 months ago

For reasons brought up by @scottmcm, we probably can't allow it directly for #[non_exhaustive]

What are those reasons? A link would have been nice. :)

At first sight, it is not clear to me why "all private fields have a default" is not sufficient to allow Foo { fields, .. }, assuming all non-default fields are listed. #[non_exhaustive] would then promise that all fields added in the future have a default. Or is it that promise that the opt-in is about?

scottmcm commented 2 months ago

@RalfJung I'd guess that's referring to https://github.com/rust-lang/rfcs/pull/3681#discussion_r1728060796

traviscross commented 1 month ago

@rustbot labels +I-lang-nominated

I too want something like this. But I'm not sure yet whether I want exactly this. Mostly I have some reservations about the inequivalence between let x = Foo { .. } and let x = Foo { ..Foo::default() }, the implications of that inequivalence on how people would structure code, and how and whether we could ever close that gap.

Maybe it'd help to see some experimentation here. I'd be interested in trying this out on nightly under an experimental feature flag. I see that there's a draft PR over in:

Or maybe it'd be enough to talk this through. I've proposed:

estebank commented 1 month ago

I have some reservations about the inequivalence between let x = Foo { .. } and let x = Foo { ..Foo::default() }, the implications of that inequivalence on how people would structure code, and how and whether we could ever close that gap.

As part of stabilization I would want to have a lint warning against manual impl Default for Foo if there are default fields, which should mitigate the potential for divergence in behavior. I would expect that default fields will be used either to preclude the need of manual impl or for types with mandatory fields, where Default isn't suitable. I struggle to think of use-cases that don't fall in either of those buckets.

scottmcm commented 1 month ago

+1 to linting -- I think we can also do things like "hey, you put defaults for all the fields, so you should probably derive(Default) too" (at least in clippy/ra, if maybe not in rustc)

traviscross commented 1 month ago

Perhaps the most space saving would be to prevent any impl of Default (even derived) for a type with default fields but not defaults for all fields, and then to permit only (and suggest) a derived Default impl for types with defaults for all fields.

CraftSpider commented 1 month ago

That first part sounds very annoying, as a user - I often manually override Default just to change one or two fields, but now I have to add = 0 (or an even more complex expression) to every other field anyways. Which is noisy and, at least to me, feels like a very artificial restriction for a situation that has an incredibly obvious expected behavior. I'd expect at least deriving to work.

traviscross commented 1 month ago

...but now I have to add = 0... to every other field anyways. Which is noisy...

The trouble is that if you don't add those = 0s to all fields, then the Foo { .. } syntax in this RFC doesn't work, so regardless of any hypothetical restrictions, this RFC would push people to do that.

That's one of the things we're trying to sort out.

scottmcm commented 1 month ago

Spitballing: what if Foo { .. } default-initialized even the things that don't have defaults? (Basically, assuming we have a const Default of some sort, it puts in Foo { a: const { Default::default() }, … }.)

I guess the problem that has is that you can just put Options { .. } even if there's a "required" field, and there'd be nothing to nudge you back to actually setting those fields. (Because if you got a lint if they didn't have defaults then the library author would still need to add the = 0s.)

Unsure how I feel about it. Certainly seems kinda weird.

On the other hand, if you want the compiler to tell you what fields exist, you could just write Options {} instead and it'll tell you about stuff. And auto-inserting zeroes is not really any worse than if someone just writes in those values themselves, so the code needs to check for those values anyway.

And there's always the option of doing it with some opt-in subtrait of Default instead, which would be a way for types to say that while they're Default as a convenience, they're not really meaningfully default. One version of that could even be "well it's only Option that implements it", so that you don't need to = None; manually on those. (I guess also PhantomData and some others like that too, but a very restricted set that wouldn't include &str and such.)

traviscross commented 1 month ago

I guess the problem that has is that you can just put Options { .. } even if there's a "required" field

In that world we could have some way -- an attribute, syntax, something -- to indicate that a field is required to be set even if its type implements Default. That is, rather than that being implicit by it not having = 0, we could explicitly indicate requiredness.

mhnap commented 1 month ago

The trouble is that if you don't add those = 0s to all fields, then the Foo { .. } syntax in this RFC doesn't work, so regardless of any hypothetical restrictions, this RFC would push people to do that.

As a user, I find this to be expected behavior. It feels logical that Foo { .. } syntax can work only if there are provided default values for all fields. At the same time, Default can be used as always. So, something like this would be possible:

#[derive(Default)]
pub struct Foo {
    pub alpha: &'static str,
    pub beta: bool,
    pub gamma: i32 = 42,
}

// This impl would be generated from derive
impl Default for Foo {
    fn default() -> Self {
        Self { alpha: Default::default(), beta: Default::default(), gamma: 42 }
        // or
        Self { alpha: Default::default(), beta: Default::default(), .. } // gamma: 42
    }
}

let f = Foo { .. }; // rejected
let f = Foo { alpha: "", beta: false, .. }; // ok, gamma: 42
let f = Foo { ..Default::default() }; // ok
let f = Foo::default(); // ok
tmandry commented 1 month ago

The trouble is that if you don't add those = 0s to all fields, then the Foo { .. } syntax in this RFC doesn't work, so regardless of any hypothetical restrictions, this RFC would push people to do that.

That's one of the things we're trying to sort out.

I talk about this in my above comment too. I see a couple ways of unifying them that are backward-compatible with the RFC as written:

I think where I land is:

tmandry commented 1 month ago

Spitballing: what if Foo { .. } default-initialized even the things that don't have defaults? (Basically, assuming we have a const Default of some sort, it puts in Foo { a: const { Default::default() }, … }.)

I don't think I would want it to invoke the default constructor of the field type. What we are trying to unify here is Struct { .. } with Struct::default(). A struct without a default impl might lack that impl for a reason: While the individual field types might all be defaultable, it is entirely possible for their combined defaults to be an incoherent value of the struct.

In that world we could have some way -- an attribute, syntax, something -- to indicate that a field is required to be set even if its type implements Default. That is, rather than that being implicit by it not having = 0, we could explicitly indicate requiredness.

That is an interesting idea! But again, I don't think we should make all structs implicitly default-constructible just because their fields are all default-constructible. So to have an opt-out, we would also have to have an opt-in. Maybe something like:

struct Foo {
  x: i32,
  #[required] y: String,
  .. = Default::default()
}
traviscross commented 1 month ago

I see a couple ways of unifying them that are backward-compatible with the RFC as written:

  • ...
  • Relax the const restriction, and make Struct { .. } do the same thing as Struct { ..Default::default() } (modulo field privacy, which is an aspect of FRU I do not want to retain)

Unfortunately, I don't think this is backward-compatible.[^1] Under the RFC, S { .. } and S::default() can return arbitrarily different things. That's why I mused:

Perhaps the most space saving would be to prevent any impl of Default (even derived) for a type with default fields but not defaults for all fields, and then to permit only (and suggest) a derived Default impl for types with defaults for all fields.

...as that restriction would save this space in both directions by preventing S::default() from working when S { .. } does not and ensuring that S { .. } and S::default() return the same value when both work.

[^1]: I read the footnote in the comment above about why that should be true, but I can't reconcile it.

traviscross commented 1 month ago

Regarding the outstanding analysis that, @tmandry, you did here on the interaction of this with non_exhaustive and field privacy... I might prefer to see us at least try to solve this along with the rest of the design and while we have this loaded into cache. (Landing this in nightly under an experimental feature flag might help here too.)

Perhaps it'd also be useful for the RFC to integrate a description of the use case you mention and ensure it captures the substance of your analysis, though I note that it does already have its own extensive analysis.

tmandry commented 1 month ago

Unfortunately, I don't think this is backward-compatible.1 Under the RFC, S { .. } and S::default() can return arbitrarily different things.

I wasn't trying to solve that, so in my proposal a: i32 = 17 would take precedence over impl Default on the same struct when S { .. } is used. The Default impl would only be used for fields that have no explicit default.

I agree it's a bit messy to let these diverge, but I think a lint is an acceptable way of addressing that, possibly going so far as to deny-by-default when there are no valid use cases left.

While it would be nice not to allow divergence at all, I wouldn't hold back a substantial part of the benefit of this RFC for that.

lolbinarycat commented 1 month ago

why not just forbid manually implementing Default on structs with field defaults? that gets around the nasty ..Default::default() parity issue, and is a restriction that can be relaxed in the future.

also, an advantage the rfc does not list is that these defaults will presumably show up in rustdoc-generated documentation, while a manual implementation of Default will not, often requiring repetition of those defaults in doc comments, and those comments are liable to get out of sync with the actual default.

lolbinarycat commented 1 month ago

I wasn't trying to solve that, so in my proposal a: i32 = 17 would take precedence over impl Default on the same struct when S { .. } is used. The Default impl would only be used for fields that have no explicit default.

what happens in this case:

struct Weird {
  a: u8 = 1,
  b: u8,
}

impl Default for Weird {
  fn default() -> Self {
    Self {
      a: 2,
      b: 3,
    }
  }
}

what does dbg!(Weird{ .. }) print? does it use the u8 Default or the Weird one?

skogseth commented 1 month ago

I wasn't trying to solve that, so in my proposal a: i32 = 17 would take precedence over impl Default on the same struct when S { .. } is used. The Default impl would only be used for fields that have no explicit default.

what happens in this case:

struct Weird {
  a: u8 = 1,
  b: u8,
}

impl Default for Weird {
  fn default() -> Self {
    Self {
      a: 2,
      b: 3,
    }
  }
}

what does dbg!(Weird{ .. }) print? does it use the u8 Default or the Weird one?

This would give a compilation error saying that you need to specify the value for b, as it doesn't have a default field value specified. Nevermind, I think I misunderstood what you were responding to.

estebank commented 1 month ago

what does dbg!(Weird{ .. }) print? does it use the u8 Default or the Weird one?

@lolbinarycat under the proposal that you're replying to, it would be using the impl Default for u8.

I do not think that allowing Weird { .. } should be part of this RFC, at most an opt-in future change.

PoignardAzur commented 1 month ago

Something I haven't seen addressed: does the following code work?

#[derive(Default)]
struct Foobar<T> {
    foo: [T; 10] = Default::default(),
    bar: Option<T> = Default::default(),
}

Because if so, that paragraph from the RFC doesn't really hold:

One thing to notice, is that taking default values into consideration during the desugaring of #[derive(Default)] would allow to side-step the issue of our lack of perfect derives, by making the desugaring syntactically check which type parameters correspond to fields that don't have a default field, as in the expansion they will use the default value instead of Default::default().

traviscross commented 1 month ago

Perhaps we're trying to fill out a table that looks something like this:

On ADT: no Default impl #[derive(Default)] impl Default for
= val on all fields
= val on some fields[^1]
= val on some fields[^2]
= val on some fields[^3]
= val on some fields[^4]
= val on some fields[^5]
= val on no fields
#[required] on some fields[^0]

In each square are the answer to questions such as:

[^0]: If that were a thing. [^1]: All fields without = val are of a type that itself can be constructed with { .. }. [^2]: All fields without = val are of a type that itself can be constructed with <_ as const Default>::default()[^0]. [^3]: All fields without = val are of a type that itself can be constructed with <_ as Default>::default(). [^4]: All fields without = val are of a type that itself can be constructed with all or some non-trivial subset of the above. [^5]: All fields without = val are of a type that itself does not have any kind of default.

estebank commented 1 month ago

@PoignardAzur because of the way the expansion would look, you would get the following error on the struct definition:

error[E0277]: the trait bound `T: Default` is not satisfied
 --> src/lib.rs:9:18
  |
9 |             foo: Default::default(),
  |                  ^^^^^^^^^^^^^^^^^^ the trait `Default` is not implemented for `T`, which is required by `[T; 10]: Default`
  |
  = note: required for `[T; 10]` to implement `Default`
help: consider restricting type parameter `T`
  |
6 | impl<T: std::default::Default> Default for Foobar<T> {
  |       +++++++++++++++++++++++

forcing you to write

#[derive(Default)]
struct Foobar<T: Default> {
    foo: [T; 10] = Default::default(),
    bar: Option<T> = Default::default(),
}

This is predicated, of course, on const Default becoming a thing, but it would still apply if you had a const fn option_default<T>() -> Option<T> { .. }.

PoignardAzur commented 1 month ago

Actually, thinking about it, you would probably get an error even if Default isn't derived, right? Otherwise Foobar { .. } would resolve to an invalid expression if T doesn't implement Default.

Which means every default value must be valid with the type's default bounds, which means a field with a default value cannot possibly introduce new Default bounds, which means ignoring it when generating where clauses in #[derive(Default)] is fine.

tmandry commented 1 month ago

why not just forbid manually implementing Default on structs with field defaults? that gets around the nasty ..Default::default() parity issue, and is a restriction that can be relaxed in the future.

Erm, yes, obviously we should do that @lolbinarycat :). I like it.

Regarding the outstanding analysis that, @tmandry, you did here on the interaction of this with non_exhaustive and field privacy... I might prefer to see us at least try to solve this along with the rest of the design and while we have this loaded into cache. (Landing this in nightly under an experimental feature flag might help here too.)

I tend to agree that we should settle on Option 2 and probably Option 3, and that these can be stabilized one at a time and after the rest of the RFC, if desired. Do you have reservations about this @estebank? I am mindful of scope creep but my feeling is that the interaction is quite important and within the scope of the RFC.

The remaining question is what to call the opt-in, which is less important than settling on the semantics we're going for. If the lang team meets about this RFC we can accelerate the bikeshed painting, or we can leave it as an open question in the RFC.

scottmcm commented 1 month ago

On Foo { .. } vs Foo::default():

I think it's fine if they're not forced to be the same so long as the easy ways to do things are all consistent. For example, so long as it's easier than getting Hash and PartialEq consistent, I think it's fine. (We can even add a "MUST" to the Default trait's docs about that; it just won't be an unsafe promise. Just like PartialEq and PartialOrd consistency.)

So realistically, what does it need to get things inconsistent? It needs all of

  1. A field with a default in the type
  2. A manually-implemented Default
  3. Specifying the field in the initializer unnecessarily

I don't think that banning (2) is acceptable (cc @tmandry because of their post above), because we don't have perfect derive. I really don't want to have the outcome here be that you're not allowed to write something like the following:

struct Foo<T> { a: u32 = 4, b: Option<T> = None }
impl<T> Default for Foo<T> { fn default() -> Self { Self { .. } }

because that seems entirely reasonable and useful.

So what if we target (3) instead? A lint for "hey, in the Default::default implementation you're specifying a field that has a default in the struct definition (and which doesn't match) -- you probably don't want to do that".

I don't think we need to ban Weird as a hard error. A pit of success is more than enough.

(With the lint described, this is already substantially easier to get right than PartialEq/PartialOrd, as seen from threads like https://users.rust-lang.org/t/panic-on-sort-of-f32-s-with-f32-total-cmp-in-1-81-0/117675?u=scottmcm.)

traviscross commented 1 month ago

I don't think that banning (2) is acceptable... because we don't have perfect derive. I really don't want to have the outcome here be that you're not allowed to write something like the following:

struct Foo<T> { a: u32 = 4, b: Option<T> = None }
impl<T> Default for Foo<T> { fn default() -> Self { Self { .. } }

because that seems entirely reasonable and useful.

Maybe that's what #[derive(Default)] should mean when all fields have syntactic defaults? The RFC suggests as much in the motivation here:

One thing to notice, is that taking default values into consideration during the desugaring of #[derive(Default)] would allow to side-step the issue of our lack of [perfect derives], by making the desugaring syntactically check which type parameters correspond to fields that don't have a default field, as in the expansion they will use the default value instead of Default::default(). By doing this a user can side-step the introduction of unnecessary bounds by specifying a default value of the same return value of Default::default().

It specifically suggests the expansion of...

#[derive(Default)]
struct Foo<T> {
    bar: Option<T> = None,
}

...to:

struct Foo<T> {
    bar: Option<T>,
}
impl<T> Default for Foo<T> {
    fn default() -> Foo<T> {
        Foo {
            bar: None,
        }
    }
}

(That would of course have some SemVer-relevant effects worth considering.)

scottmcm commented 1 month ago

by making the desugaring syntactically check which type parameters correspond to fields that don't have a default field

It's really non-obvious to me whether that's even possible. At the very least it's got to be quite nuanced in cases where there are multiple type parameters with potential trait bounds between them, such that you can still get to a type parameter without ever actually mentioning it in that type or that initialization expression.

And it's hard to say whether a _phantom: PhantomData<T> = PhantomData ought to suppress T: Default. Being inconsistent with every other kind of derive might even be worse than being smarter, if it ends up resulting in extra confusion.

PoignardAzur commented 1 month ago

And it's hard to say whether a _phantom: PhantomData<T> = PhantomData ought to suppress T: Default. Being inconsistent with every other kind of derive might even be worse than being smarter, if it ends up resulting in extra confusion.

The solution for that could be to add opt-out annotations to other traits as well.

You could add a #[marker] annotation to fields, that would let macros know that the field is a ZST. So PartialEq, Eq, PartialOrd, Ord, Hash, and Debug would know to ignore the field. Possibly Clone and Copy too, depending on the semantics of that annotation.

PoignardAzur commented 1 month ago

It's really non-obvious to me whether that's even possible. At the very least it's got to be quite nuanced in cases where there are multiple type parameters with potential trait bounds between them, such that you can still get to a type parameter without ever actually mentioning it in that type or that initialization expression.

This is probably something you should built team consensus on, because it impacts multiple RFCs. For example, we were talking about having a macro generate your where bounds in the context of https://github.com/rust-lang/rfcs/pull/3698.

It would be nice to have a canonical "If your type has these type parameters and these fields, it should generate these bounds" guideline.

scottmcm commented 1 month ago

@PoignardAzur The two things that work well are the "bound all the type parameters" for semver that all the built-in ones use, and the "bound all the field types" approach of so-called perfect derive.

Trying to use tokens to figure out trait usage will always be sketchy at best.

tmandry commented 1 month ago

I was assuming we could make #[derive(Default)] work by suppressing bounds for fields that specify a default, like the RFC says. @scottmcm raises a few issues that are compelling enough to convince me that this isn't trivial (I also dug up this post on the semver hazards associated with perfect derive), and it probably isn't a can of worms we want to chew on in this RFC.

If we can't do that and make manual derives an error with ~no cost, I agree we should use a lint and tailor it as much as possible to make sure no one ever creates divergence accidentally. (I like Scott's suggestion to lint on already-defaulted fields set in the Default impl.)

With that kind of checking it won't be much different from the fact that you can do side-effecting things in a Deref or PartialOrd impl or write bugs there.. you can, but it's obvious that you shouldn't, and in practice no one will other than for weird party tricks.

tmccombs commented 1 month ago

What about the case all fields have a default value, would it be reasonable to allow deriving Default in that case, even if we don't allow it if only some of the fields have defaults.

SOF3 commented 1 month ago

@tmccombs do you mean allowing #[derive(Default)] if fields.all(|field| field.type.implements(Default) || field.default_value.is_some())? I don't understand how that's different from the above

PoignardAzur commented 1 month ago

I'd like to fight back a bit on the "inferring bounds from tokens is intractable" claim, because I do think it's worth considering.

You mentioned associated types in another post. Here's an example of a "tricky" struct using them:

#[derive(Default)]
struct MyStruct<T, U>
where
    T: SomeTrait<U>,
{
    t: T,
    u: T::Assoc,
}

trait SomeTrait<T> {
    type Assoc;
}

impl<T, U> SomeTrait<U> for T {
    type Assoc = U;
}

On the one hand, yes, the U type is "smuggled" in without appearing as a token in the fields. On the other hand, here's what the #[derive(Default)] macro currently expands to:

impl<T: ::core::default::Default, U: ::core::default::Default> ::core::default::Default
    for MyStruct<T, U>
where
    T: SomeTrait<U>,
    T::Assoc: ::core::default::Default,
{
    #[inline]
    fn default() -> MyStruct<T, U> {
        MyStruct {
            t: ::core::default::Default::default(),
            u: ::core::default::Default::default(),
        }
    }
}

The macro already needs to bind on T::Assoc: Default. There's already some non-trivial token parsing going on; this parsing is already somewhat unprincipled (it will generate as bound for T::Assoc, but not for <T as MyTrait>::Assoc), and you can already come up with ways to defeat it today with TAITs and the like.

So the question isn't "Should derive macros generate one bound per argument or should they consider tokens in fields", because they already do the latter. Rather, it's "How complex should the token parsing be?".

Now, granted, existing derive macros do this parsing on top of generating one bound per argument, and I'm suggesting replacing the one-bound-per-argument rule entirely. That does add potential for brittleness.

But I'd argue the difference isn't quite as cut and dry as you think. It's worth examining the possibility further.


Note that this is also relevant to #3683. If you write:

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

You'd like your derive to assume T: Default, not T: Default, U: Default.

tmccombs commented 1 month ago

@SOF3 no I mean deriving Default would be allowed if:

fields.all(|field| field.default_value.is_some()) || (fields.all(|field| field.type.implements(Default)) && fields.all(|field| field.default_value.is_none()))

So if all fields have defaults, the Default impl would use all the default values. If no fields have default values, the implementation of Default would use the implementation of Default for all the fields.

scottmcm commented 1 month ago

You mentioned associated types in another post. Here's an example of a "tricky" struct using them:

That's just doing both the bound-the-parameter and the bound-the-field cases I mentioned. Bound the whole field certainly works, though I'm surprised it bothers doing that. I guess it wanted more "well it just doesn't exist" cases and fewer "it fails to compile" cases. (I find that odd, because a field of type Foo<T> might be just a much not-default as a field of type <T as Blah>::Bar, so I don't see why it'd bound one but not the other.)

Note that this is also relevant to #3683. If you write: […] You'd like your derive to assume T: Default, not T: Default, U: Default.

Wouldn't you want Option<T>: Default?

tmandry commented 1 month ago

@PoignardAzur

I agree we can probably come up with a rule that works based on parsing the struct. The real concern is that it can introduce semver hazards to be too clever about when a derive is implemented.

The expansion you posted is interesting; it demonstrates that we already do this to some extent. That lends some credibility to the idea that it will "probably be fine" to do this in practice, but it still makes me nervous. I would rather not increasingly require library maintainers to rely on tooling to catch subtle instances of semver breakage like this.

As a sketch, I think if we had some combination of

#[derive(Default where T: Default)]

for adding bounds, and a semver_hazards lint that fires when your derives miss bounds in a way that creates hazards, we could get away with always doing perfect derive (probably over an edition). But that's a lot of work that needs actual semantic analysis and pretty out of scope for this RFC.

crumblingstatue commented 1 month ago

My personal problem with assigning default values at struct-definition site is that it conflates data with behavior.

There is no reason why a struct couldn't have multiple sets of defaults. An example would be something like egui::Visuals, which has "dark" and "light" defaults.

In my opinion, when exploring alternatives to this RFC, we should strongly consider ones that detach the concept of a "set of defaults" from the struct definition itself. In my opinion the place for defining a set of defaults should be in an impl block.

For those who would bring up #[derive(Default)] on structs (and helper proc macro crates for smart-derives that help define values), I would like to remind you that it's just syntax sugar for implementing a trait.

Functions already provide a decent way to define a set of defaults, and they can be used with the struct update syntax (..default()), as acknowledged by the RFC.

One of the main problems is not being able to "delegate" the defaults for fields you don't want to explicitly define. This is most often because the types of the fields themselves have sensible defaults, e.g., they implement the Default trait. But there is currently no way to "delegate" the default values to the Default impl of each field.

impl MyStruct {
    const fn my_defaults() -> Self {
        Self {
            field1: Default::default(),
            field2: Default::default(),
            field3: Default::default(),
            field4: Default::default(),
            field5: Default::default(),
            field6: Default::default(),
            field7: Default::default(),
            field8: Default::default(),
            // I only care about explicitly defining this
            field9: 42,
            field10: Default::default(),
            field11: Default::default(),
            field12: Default::default(),
            // And this
            field13: "Explicitly set",
            field14: Default::default(),
            field15: Default::default(),
            field16: Default::default(),
            field17: Default::default(),
            field18: Default::default(),
            field19: Default::default(),   
        }
    }
}

If there was a way to delegate fields you're not interested in explicitly setting to a trait method, that would solve a large chunk of the motivation of this RFC.

impl MyStruct {
    const fn my_defaults() -> Self {
        Self {
            // I only care about explicitly defining this
            field9: 42,
            // And this
            field13: "Explicitly set",
            // Pseudo syntax for delegating defaults
            for .. use Default::default
        }
    }
}

I understand that part of the motivation of this RFC is usability in const contexts, which would make my proposal above depend on const trait impls (Like being able to constly impl Default::default), but in my opinion it would be worth waiting for this feature.

There are other possible alternatives for defining default-sets for a type, without having to tie the defaults to the definition of said type. I strongly suggest exploring such alternatives.