JelteF / derive_more

Some more derive(Trait) options
MIT License
1.71k stars 121 forks source link

`affix` is confusing with outer-level `enum` display formatting #142

Closed ErichDonGubler closed 2 months ago

ErichDonGubler commented 4 years ago

Currently, when one uses a top-level display attribute for the Display derive on an enum, it switches to an affix mode that defines the enum's overall format and expects, at most, a single placeholder where inner variant display logic can print into:

#[derive(Display)]
#[display(fmt = "Prefix {} Suffix")] // at most 1 placeholder allowed here
enum Enum {
    #[display(fmt ="value")] // prints "Prefix value Suffix"
    Variant1,
    // ...
}

However, there's a few problems with this current behavior:

I think that we can do better here. There are enough surprises with the current behavior that I think exploring alternatives is warranted. Here's what we stand to gain:

So, here's my proposal:

To concretely translate my above suggestions into code:

use derive_more::Display;

// Using affixes at the top level of an `enum` is the same as before, except we
// use an `affix` key in the `display` attribute instead of `fmt`.
#[derive(Display)]
#[display(affix = "<{}>")]
enum UsesAffix {
    #[display(fmt = "a")]
    A, // prints "<a>"
    #[display(fmt = "b")]
    B, // prints "<b>"

    // This makes no sense and is still disallowed.
    // #[display(affix = ...)]
    // C,

    // ...
}

// A top-level `display(fmt = ..., ...)` now is:
//
// * Just like variant-level `display` attributes in that it uses formatting
//     args as usual. Only `self` is bound in arguments' context, of course.
// * Overridable by a variant-level `display` attribute.
//
// One motivating example for this is to just use `Debug` printing
// by default unless we happen to have something better.
#[derive(Display)]
#[format(fmt = "{:?}", self)]
enum DisplayDefaultsToDebug {
    A, // prints "A"
    #[display(fmt = "Display B")]
    B, // prints "Display B"
}

I'm working on a draft PR for this here. I'm speculatively doing work without getting validation in this issue first, but I don't mind. :)

JelteF commented 1 year ago

(sorry for not responding earlier to this)

@ilslv @tyranron @ModProg Given we're working towards 1.0 and we're planning to change some of the Display API in breaking ways with #217 anyway, I feel like we should consider and fix this weird edge case too. My notes on the proposal from @ErichDonGubler above are as follows:

  1. I don't really like the word affix, and I think if we do it this way we should also error very clearly if there's more than one {} part in the string. One way to solve that would be to add support for #[display(suffix("something ")) and #[display(prefix(" something more"))] instead of a shared "affix" string.
  2. The default fmt string makes sense, but given the discussion in #217 it shouldn't use fmt anymore but should be the plain string.
  3. Does the default fmt string handle interpolation at all? I would think it would be best to disallow it (at least for the first version of this behaviour).
ModProg commented 1 year ago

We could also make the specific variant format availble via some variable name so you can use it via {variant_fmt} that would make the behavior clear and also support e.g. using it twice.

ModProg commented 1 year ago

I would prefere using an affix style fmt as it is more flexible than just prefix suffix and also more readable conserning white space etc., but I'd be open on the naming.

ModProg commented 1 year ago

Would you be able to access the enum variants values via e.g. _0 - _1? Could be useful for enums very every variant has the same format

tyranron commented 1 year ago

@JelteF @ilslv

My first reaction to this was to remove affix functionality completely, just because it's so rare feature, that I've realized it exists just now, despite the fact I was improving Display macro implementation earlier 😄

However, given it a thought, I do see a lot of usefulness having it, and even see places in my codebases where it may simplify things or allow removing manual impl Display. So, I do prefer to keep it and give it a solid design.

I also do think, that derive(thiserror::Error) did a poor design choice by making top-level enum attribute acting as a default for variants:

use thiserror::Error;
#[derive(Debug, Error)]
#[error("error")]
enum Enum {
    A, // displays "error"
    #[error("b")]
    B, // displays "b"
}

The initial design choice of derive_more::Display where the top-level enum attribute provides top-level formatting seems to be more obvious and consistent with struct case:

use derive_more::Display;
#[derive(Display)]
#[display(fmt = "<{}>")]
enum UsesAffix {
    #[display(fmt = "a")]
    A, // prints "<a>"
    #[display(fmt = "b")]
    B, // prints "<b>"
}
#[derive(Display)]
#[display(fmt = "<{_0}>")]
struct Foo(u8);

While I do agree with this claim:

  • It's REALLY surprising to discover that fmt works rather differently than anywhere else here. It's not obvious why formatting should be different here, and there's no "obvious" signal that behavior is different. I feel qualified to say this since I implemented this behavior myself a year ago, and still was confused after reading diagnostics until I remembered what I had implemented.

That the form #[display(fmt = "<{}>")] doesn't look like valid Rust, so shouldn't exist.

Returning to derive(thiserror::Error) I think that a better design option would be something like this:

use derive_more::Display;
#[derive(Display)]
#[display(default("error"))]
enum Foo {
    A, // prints "error"
    #[display("b")]
    B, // prints "b"
}

Another point here, that I do see defaulting and top-level affixing being quite orthogonal. It seems fully OK (and even desired) to have both at the same time:

use derive_more::Display;
#[derive(Display)]
// Look, ma! No weird syntax anymore!
#[display("Foorror: {self}! And {}", "arbitrary_expression".to_owned())]
#[display(default("error"))]
enum Foo {
    A, // prints "Foorror: error! And arbitrary_expression"
    #[display("b")]
    B, // prints "Foorror: b! And arbitrary_expression"
}

The question is: do we want to be semantically compatible with derive(thiserror::Error)?

If yes, then for compatibility, we only need to invert attribute markers. If we don't like affix naming (I don't like it too), we may use simpler wrap, for example:

use derive_more::Display;
#[derive(Display)]
#[display(wrap("Foorror: {self}! And {}", "arbitrary_expression".to_owned()))] 
#[display("error")]
enum Foo {
    A, // prints "Foorror: error! And arbitrary_expression"
    #[display("b")]
    B, // prints "Foorror: b! And arbitrary_expression"
}

As for me, I'd like to use default despite being incompatible with derive(thiserror::Error), because it seems to me less subtle and more clever/consistent.


@ModProg

We could also make the specific variant format availble via some variable name so you can use it via {variant_fmt} that would make the behavior clear and also support e.g. using it twice.

Just using self should be totally enough and obvious for rustaceans.

I would prefere using an affix style fmt as it is more flexible than just prefix suffix and also more readable conserning white space etc., but I'd be open on the naming.

I do agree here. Prefix/suffix is just not that powerful, because we may want to repeat the inner twice, for example.

Would you be able to access the enum variants values via e.g. _0 - _1? Could be useful for enums very every variant has the same format

Yup, I'd propose it to work in the most obvious and clever way. So _0/_1 or named fields are allowed in both default and wrapping top-level attributes. If some variant doesn't have it - we always may throw an error (and even if we don't, the compiler will do that for us).

ModProg commented 1 year ago

@tyranron I agree with everything you said here.

I'd also prefer default.

One helpful error could be when you specify a non default fmt string and neither specify self nor access self in any of the format arguments. Like telling you to use default to specify the default formatting instead.

tyranron commented 1 year ago

@ModProg

One helpful error could be when you specify a non default fmt string and neither specify self nor access self in any of the format arguments. Like telling you to use default to specify the default formatting instead.

I think this would be too limiting, disallowing to express patterns like this:

#[derive(Display)]
#[display("My app errored: {_0}")]
enum MyError {
    Io(io::Error, PathBuf),
    Serde(serde_json::Error, String),
}

While I understand the possible caveats like this:

#[derive(Display)]
#[display("My app errored: {_0}")]
enum MyError {
    Io(io::Error, PathBuf),
    #[display("serde failed on input `{_1}`: {_0}")]    // won't be displayed
    Serde(serde_json::Error, String),
}

Maybe it makes sense to throw such error when both self is not used and at least one variant has custom formatting? 🤔

ModProg commented 1 year ago

@ModProg

One helpful error could be when you specify a non default fmt string and neither specify self nor access self in any of the format arguments. Like telling you to use default to specify the default formatting instead.

I think this would be too limiting, disallowing to express patterns like this:

#[derive(Display)]
#[display("My app errored: {_0}")]
enum MyError {
    Io(io::Error, PathBuf),
    Serde(serde_json::Error, String),
}

While I understand the possible caveats like this:

#[derive(Display)]
#[display("My app errored: {_0}")]
enum MyError {
    Io(io::Error, PathBuf),
    #[display("serde failed on input `{_1}`: {_0}")]    // won't be displayed
    Serde(serde_json::Error, String),
}

Maybe it makes sense to throw such error when both self is not used and at least one variant has custom formatting? thinking

But why wouldn't you use default in that case?

#[derive(Display)]
#[display(default("My app errored: {_0}"))]
enum MyError {
    Io(io::Error, PathBuf),
    Serde(serde_json::Error, String),
}
tyranron commented 1 year ago

@ModProg

But why wouldn't you use default in that case?

#[derive(Display)]
#[display(default("My app errored: {_0}"))]
enum MyError {
    Io(io::Error, PathBuf),
    Serde(serde_json::Error, String),
}

meme_firefly_wait

I cannot give here any meaningful reason, except "because I don't care and only want to specify formatting for the whole thing?". For the inexperienced user, it seems meaningful to provide default only when exceptions appear, and when there are none he wouldn't even think about default?

Is this a good motivator to prefer inverted wrap marker and derive(thiserror::Error) compatibility? 🤔

(I still do tend to use default + error on non-self + variant)

ModProg commented 1 year ago

For the inexperienced user, it seems meaningful to provide default only when exceptions appear, and when there are none he wouldn't even think about default?

That makes sense. And there is really no difference to a wrap without a reference to self and a default (as long as there are no variant specific formats).

Technically, we could also make a format the default, as long as it does not reference self. And then it becomes wrapping... and you need to specify the default for the variants not specified as default.

i.e.

// Is the default
#[derive(Display)]
#[display("My app errored: {_0}")] //  item level attribute (default)
enum MyError {
    Io(io::Error, PathBuf),
    #[display("serde failed on input `{_1}`: {_0}")]    // will be displayed instead of item level attribute
    Serde(serde_json::Error, String),
}

// References self therefor is nolonger the default
#[derive(Display)]
#[display("My app errored: {self}")] //  item level attribute (wrap)
#[display(default("{_0}"))]  // will be displayed inside of item level attribute
enum MyError {
    Io(io::Error, PathBuf),
    #[display("serde failed on input `{_1}`: {_0}")]    // will be displayed inside of item level attribute
    Serde(serde_json::Error, String),
}

Not 100% sure on that though as it feels a bit magically, but as we need to detect and replace self anyway there is no additional risk associated with this (I think™).

ilslv commented 1 year ago

@tyranron @JelteF I think my argument about discoverability applies here as well. If we support following:

// derive_more::From fits nicely into this pattern as well
#[derive(Debug, Display, Error, From)]
enum CompoundError {
    Simple,
    WithSource {
        source: Simple,
    },
    #[from(ignore)]
    WithBacktraceFromSource {
        #[error(backtrace)]
        source: Simple,
    },
    #[display(fmt = "{source}")]
    WithDifferentBacktrace {
        source: Simple,
        backtrace: Backtrace,
    },
    WithExplicitSource {
        #[error(source)]
        explicit_source: WithSource,
    },
    #[from(ignore)]
    WithBacktraceFromExplicitSource {
        #[error(backtrace, source)]
        explicit_source: WithSource,
    },
    Tuple(WithExplicitSource),
    WithoutSource(#[error(not(source))] Tuple),
}

I don't see why we shouldn't support this:

// derive_more::From fits nicely into this pattern as well
#[derive(Debug, Display, Error, From)]
#[display("Compound error: {}")]
enum CompoundError {
    Simple,
    WithSource {
        source: Simple,
    },
    #[from(ignore)]
    WithBacktraceFromSource {
        #[error(backtrace)]
        source: Simple,
    },
    #[display(fmt = "{source}")]
    WithDifferentBacktrace {
        source: Simple,
        backtrace: Backtrace,
    },
    WithExplicitSource {
        #[error(source)]
        explicit_source: WithSource,
    },
    #[from(ignore)]
    WithBacktraceFromExplicitSource {
        #[error(backtrace, source)]
        explicit_source: WithSource,
    },
    Tuple(WithExplicitSource),
    WithoutSource(#[error(not(source))] Tuple),
}

And I think that single empty placeholder {} is clearer than {self}, because it would break our "exactly the same formatting, as std::fmt" assumption.

ModProg commented 1 year ago

And I think that single empty placeholder {} is clearer than {self}, because it would break our "exactly the same formatting, as std::fmt" assumption.

But I'd argue that #[display("Compound error: {}, {self}", some_expression())] should at least be supported on top.

And is {} really "exactly the same as std::fmt", as in normal fmt {} always requires an argument to be present.

I wouldn't be opposed to have the behavior of: "If there are no additional arguments passed to display handle {} as {self}", but I'm not convinced it really matches the std::fmt behavior better.

ilslv commented 1 year ago

@ModProg

But I'd argue that #[display("Compound error: {}, {self}", some_expression())] should at least be supported on top.

Isn't #[display("{self}")] a recursive call?

ModProg commented 1 year ago

Isn't #[display("{self}")] a recursive call?

True. So maybe variant? But that would collide with a field called variant.

Is there any use for a recursive Display? Atleast in the usecase of this macro?

ilslv commented 1 year ago

@ModProg

Is there any use for a recursive Display?

No, I don't think so.

True. So maybe variant?

As I mentioned before I'm against "magical" names.

ModProg commented 1 year ago

@ilslv So I'd say self is our best choice then. As, the recursive usage doesn't make any sense.

I am not opposed to have {} work as well. But replacing self with the display of the variant is a feature allows a bit more flexibility here.

JelteF commented 1 year ago

I think there's been enough time to think on this. What is everyone's opinion at this time? I definitely think we should implement this for the 1.0 release, since changing it later would be a breaking change.

ModProg commented 1 year ago

My position hasn't really changed. I still think we should allow {self} to reference the variants formatting. I am not opposed to supporting {} for that as well, as long as we support #[display("Compound error: {}, {self}", some_expression())].

We should have a way of specifying both a default and a wrap, but I'd be fine with making a format that doesn't reference the variants formatting (either through {} or {self}) the default. (https://github.com/JelteF/derive_more/issues/142#issuecomment-1319024057)

ErichDonGubler commented 1 year ago

After some reflection, some real-world use, and considering @ModProg's comment, I'm concerned that all prior proposals actually don't go far enough in breaking from current and OP-proposed syntax, which has problems. Even the already implemented design space is too complex to rely on a single fmt tag. Some reactions to my OP and others' commentary here, two years later:

  1. I agree with @ModProg's comment: an implicit {} group at the item level (fmt = "...{}...") that switches between providing a default or providing an affix based on the lack of a positional arg is confusing. [std::fmt] would ordinarily fail to compile with {} and no args. We should take an opportunity to move away from it, if we have it.
  2. All previous placeholder suggestions ({}, {variant}, {self}) are unfortunately confusable with a "real" [std::fmt] argument.
  3. Experience report:anyhow`'s prior art has generally been confusing both to myself and to teammates on projects, even over time after having written the code ourselves. Examining the current affix proposal, its behavior changes pretty dramatically based on the content of the formatting args. In other words, I'm swayed by @tyranron's arguments.
  4. Using derive_more::Display as a dependency tends to initially be a personal choice, I've noticed. When derive_more feels approachable, it's a lot easier to adopt as an extra dep. Optimizing the syntax for clarity in reading seems like the best way to avoid ecosystem friction overall.

Will post another comment here with details about a solution I think will solve the problems we're debating about here. 🙂

ErichDonGubler commented 1 year ago

Designing from scratch

Let's enumerate the behaviors we have expressed that we want:

  1. Item level: at most one of each:
    1. Default variant formatting (currently exposed via fmt = ...)
    2. Affix formatting (currently exposed as {})
  2. Variant level: at most one of any:
    1. Non-overriding (currently exposed as fmt = ...)
    2. Overriding (not exposed anywhere currently, but desirable)

Display's implementation from before I added affix functionality only had (1i) and (2ii). There was only a single behavior per level of fmt = ... prior to that PR. The addition of item-level affix backwards-compatibly (originally because @JelteF wanted backwards compatibility) has caused some confusion. Two years later, I think it's safe to say that our current design is holding us back more than what backwards compatibility might gain us.

I think the best way to offer the above behaviors going forward would involve new syntax that fulfills these requirements:

  1. Use an affix placeholder that:
    1. Clearly does not compile with Rust's vanilla fmt syntax, signaling that it's an affordance of this macro, and not anything from fmt (this resolving concerns from @ilslv about "magical" names)
    2. Is obviously not associated with any positional formatting arg (resolving concerns from @ModProg1 and myself
    3. Communicates that it will delegate formatting to variants
  2. Make it clear which item-level behavior is being used
  3. Make it clear which variant-level behavior is being used
  4. Avoid problems in a migration from old syntax (i.e., everybody uses fmt = … right now, let's not risk changing its meaning and let old code compile)
  5. Allow variant-level overriding of item-level affix, per my OP.
  6. Makes it easy to copy/paste from existing {write,format,println,…}!(…) by using parentheses around format specs, instead of the equals syntax we've historically used. This one is optional but I thought it'd be a nice-to-have.

Each of the current proposals fails to fulfill these in one way or another. Let me prove it to you with a table (and, *ahem*, some new contenders):

Proposals\Requirements 1i: affix placeholder clearly not part of std::fmt syntax 1ii: affix placeholder clearly not a positional arg 1iii: affix placeholder clearly delegates to variants 2: item-level behavior is clear 3: variant-level behavior is clear 4: avoids migration problems 5: easy to adjust copy/paste vanilla std::fmt macro usage
@erichdongubler's OP proposal:
  • Item:
    • Default: fmt = "…", … (no unqualified positional, a la `anyhow`)
    • Affix: affix = "…{}…", … (single unqualified positional)
  • Variant:
    • Overriding: does not exist
    • Non-overriding: fmt = …
@ModProg's proposal:
  • Item:
    • Default: fmt = "…", …
    • Affix: fmt = "…{self}…", …
  • Variant:
    • Overriding: does not exist
    • Non-overriding: fmt = "…", …
⚠️ (self's clarity is up for debate)
@erichdongubler's new proposal, Equals Edition™ (spoiler alert!):
  • Item:
    • Default: default = "…", …
    • Affix: item = "…{@variant}…", …
  • Variant:
    • Overriding: override = "…", …
    • Non-overriding: variant = "…", …
@erichdongubler's new proposal, Parens Edition™ (spoiler alert!):
  • Item:
    • Default: default("…", …)
    • Affix: item("…{@variant}…", …)
  • Variant:
    • Overriding: override("…", …)
    • Non-overriding: variant("…", …)

You may have noticed that I snuck in a couple of new solutions at the end! Yes, I believe that we can fulfill all of the above requirements. 🤪 I propose we move to have distinct tags for each of the numerous flavors of format spec that make behavior unambiguous in all cases, and therefore easier to use overall.[^1] Let me give an applied example, rewriting last example in the original OP in the Parens Edition™ syntax:

#[derive(Display)]
#[display(item("<{@variant}>"))] // 1ii: affix formatting
#[display(default("unknown"))] // 1i: default variant formatting
enum UsesAffix {
    #[display(variant("a"))] // 2i: non-overriding formatting
    A, // prints "<a>"
    B, // prints "<unknown>"
    #[display(override("{c}"))] // 2ii: overriding formatting
    c, // prints "{C}"
}

...or, with the Equals Edition™:

#[derive(Display)]
#[display(item = "<{@variant}>")] // 1ii: affix formatting
#[display(default = "unknown")] // 1i: default variant formatting
enum UsesAffix {
    #[display(variant = "a")] // 2i: non-overriding formatting
    A, // prints "<a>"
    B, // prints "<unknown>"
    #[display(override = "{c}")] // 2ii: overriding formatting
    c, // prints "{C}"
}

Cases needing errors or warnings become much more clear, too; using Parens Edition™ notation:

#[derive(Display)]
#[display(item("\\{@variant}/"))]
enum NoDefaultProvided {
    #[display(variant("1"))]
    One, // prints `\1/`
    Two, // error: no `default` provided by item level, no `variant` specified here
}
#[derive(Display)]
enum NoDefault {
    #[display(variant("Left"))]
    Left, // prints `Left`
    Right, // error: no `default` provided by item level, no `variant` specified here
}
#[derive(Display)]
#[display(item("oops, I meant to write a default"))] // maybe throw an error: no `@variant` formatting arg was used
enum ShouldaDoneDefault {
    #[display(variant("Red"))] // otherwise, warning: this is ignored because `item` has no `@variant` placeholder, which probably wasn't intended! Maybe suggest switching `item` to `default`, or using the placeholder.
    Red, // prints `oops, I meant to write a default`
}
#[derive(Display)]
enum PersonKind {
    #[display(variant("dog"))]
    Dog, // prints `dog`
    #[display(override("cat"))] // warning: trying to `override` an `item` that doesn't exist
    Cat,
}

EDIT: Added the PersonKind/override warning diagnostic case.

[^1]: Actual identifiers associated with the tags proposed here are subject to bikeshedding.

ModProg commented 1 year ago

I'd prefer the paren syntax. I agree that something like @variant cannot be confused with meaning something else (we could also use something like just @ not sure how much sense that makes).

Some proposed changes to paren:

  1. @ instead of @variant, (as long as it is the only @ reference, I know I break this my self with 2.)
  2. add @name for default, variant and override (I don't think it makes much sense for item, and it might be ambiguous for default and item)
  3. Make remove inner "call" for variant and item
#[derive(Display)]
#[display("affix: {@}")] // errors if it doesnt contain an `@`
#[display(default("the {@name} {_0}"))] // could error if every variant has a formatting
enum Enum {
  Default(usize),
  #[display("{@name} formatting")]
  Custom.
  #[display(override("Likely only very seldom used")] // could error if no "affix" exists
  Override
}

We also need to decide how this all plays together with the default display for Unit and Newtype and with the proposed rename_all #216.

ErichDonGubler commented 1 year ago

Reexamining my proposal(s) above, I don't think I've given @tyranron enough credit; their solution was, at least, a partial inspiration for the Parens Edition™ version of my above proposal. Kudos!

ErichDonGubler commented 1 year ago

@ModProg: RE: your suggestions:

ModProg commented 1 year ago

Can you explain why you think those forms might be better?

Just personal taste, as I felt like those are the ones that you most likely want. Similar to how {} is just a shorthand for {idx} in rust's fmt.

  • (2) seems like a separate feature request, which I think would best be served by being a separate issue (CC @JelteF). It's interesting that this sort of design space opens up with a sigil like @, though!

yeah, I just wanted to mention it, so we can build the rest in a way that doesn't prohibit such features.

erratic-pattern commented 1 year ago

Just wanted to add my use case for consideration since it currently doesn't seem to be supported.

I have a lot of simple enums that I serialize to JSON strings. I would like to derive Display implementations directly from the Serialize implementation so that the seride(rename_all = ...) attribute that I use is preserved when writing with Display. I was hoping to be able to write something like this, usingserde_plain::to_string to generate plaintext names:

#[derive(Deserialize, Serialize, Eq, PartialEq, PartialOrd, Ord, Clone, Copy, Debug, Display)]
#[serde(rename_all = "lowercase")]
#[display(fmt = "{}", "serde_plain::to_string(&self).unwrap()")]
pub enum TestOs {
    Alpine,
    RockyLinux,
    Debian,
    Ubuntu,
    Macos,
    Windows,
}

However the display(fmt = "...", args...) form is not supported at the top-level of an enum with variants. Currently I see no way to use derive_more for this and so I'm using the derive_display_from_serialize! function macro from serde_plain. I would prefer a more idiomatic derive macro for this purpose rather than a function macro, but haven't found any good alternatives so far.

JelteF commented 1 year ago

I agree with @ModProg that I think we should not require (but maybe allow) the variant(...) "call". The variant call seems quite verbose for what seems to be the most common usecase. Especially since I would not want to require a variant or override call when there's no item call at the top.

The item call I feel much less strong about removing. But I do think it's important to throw an error if there's no @variant in the item call, and probably hint to the existince of the default call. If that's what we'll do I don't see a strong need for having an item call, as the pressence of @variant would already show that it's not the default. But like I said I'm also fine with having an item call if others disagree.

JelteF commented 1 year ago

Also I like the @ sigil for this. But I think we should always have a name after it for clarity, so no bare @ but @variant.

@name seems like a separate feature indeed. And I would like to keep it's discussion out of this issue, because this is one of the issues that's currently blocking 1.0. And adding extra sigils wouldn't be breaking changes.

tyranron commented 1 year ago

@JelteF @ilslv @ErichDonGubler @ModProg

First of all, I do like an idea of having override very much, but see this more an addition, rather than a requirement, and so, can be implemented post 1.0, as is not breaking in any way.

3. Make remove inner "call" for variant and item

Removing the explicit tags optimizes for writing, but causes clarity to suffer, IMO. Can you explain why you think those forms might be better?

I think we should not require (but maybe allow) the variant(...) "call". The variant call seems quite verbose for what seems to be the most common usecase. Especially since I would not want to require a variant or override call when there's no item call at the top.

I agree with removing them too. Having too much of specifiers makes discoverability considerably worse (point often protected by @ilslv). It's hard to remember them, and it's unlikely to write something meaningful without advising with documentation. It should be easy to write stuff out-of-the-box in the intuitive way.

Like I want to format a variant or a struct, I just write #[display("my format {field}")] like I do in a plain code format!("my format {field}"), without thinking about affixes, overriding, defaulting as long as I don't need it. Once I need some decoration on top of enum variants, it's natural for me to try top-level attribute for that #[display("wrapped <{self}>")]. Then, realizing that I don't want to have this for some variant, it would be quite obvious to specify #[display(override("no decoration"))]. All this follows natural thinking patterns and may be reasonably discovered without diving into docs, and so, also is quite easy to be remembered.

Also I like the @ sigil for this. But I think we should always have a name after it for clarity, so no bare @ but @variant.

This is something I want to debate quite much about.

I don't really like the idea to introduce something new to std::fmt syntax, because:

I still do think that the self is the way to go here. Let's have a closer look at it:

But I do think it's important to throw an error if there's no @variant in the item call, and probably hint to the existince of the default call.

I don't think that's worth it. From the user's perspective, I don't see much motivation why I should write something like this (referring to @kallisti-dev example):

#[serde(rename_all = "lowercase")]
#[display(default("{}", serde_plain::to_string(&self).unwrap()))]
pub enum TestOs {

rather than just this:

#[serde(rename_all = "lowercase")]
#[display("{}", serde_plain::to_string(&self).unwrap())]
pub enum TestOs {

It's quite unlikely I would think in this situation about default, and introducing it seems strange for a reader.

Even more, potentially we can support not only pure {self}, but self in arbitrary expressions like #[display("<{}>", MySuperFormat(self))], without introducing a recursive call just by doing something like this in macro expansion:

fn fmt(&self, f: fmt::Writer<'_>) -> fmt::Result {
    let __derive_more_self = format_args!(/* non-decorated formatted stuff */);
    write!(f, "<{}>", MySuperFormat(__derive_more_self))
}

The only real issue here with self I see, that when it comes to arbitrary expressions, there is no way to distinguish between self as "pure &self itself" and self as "not-yet-decorated formatted stuff", because substituting self in an expression like this #[display("{}", serde_plain::to_string(&self).unwrap())] would give not what we expect. This returns us back to the @ sigil question, proving that @variant and self semantics are not quite the same in edge cases.

Fine then... Rather than introducing a new @ sigil formatting syntax, we better reserve a concrete identifier for that, like __self (or __variant, or __preformatted, or __formatted, or __fmt, or __output, etc). This preserves us from extending fmt grammar (and consequences of doing so), while allows the desired semantics separation with self. However, still obscures the feature, making it undiscoverable.

Let's sketch the final design I propose in examples:

#[derive(Display)]
#[display("what {self}?"]
struct AnyStruct1;  // complile error: recursive call

#[derive(Debug, Display)]
#[display("what {self:?}?"]
struct AnyStruct2;  // prints: what AnyStruct2?

#[derive(Debug, Display)]
#[display(default("what {self:?}?")]
struct AnyStruct3;  // compile error: `default` is not allowed here, remove it

#[derive(Debug, Display)]
#[display(override("what {self:?}?")]
struct AnyStruct4;  // compile error: `override` is not allowed here, remove it

#[derive(Display)]
#[display("what {__self}?"]
struct AnyStruct5;  // compile error: reserved identifier `__self` is not allowed here

#[derive(Display)]
enum Enum1 {
    Foo,  // prints: Foo
    #[display("Baz")]
    Bar,  // prints: Baz
}

#[derive(Display)]
#[display(default("default"))]
enum Enum2 {
    Foo,  // prints: default
    #[display("Baz")]
    Bar,  // prints: Baz
}

#[derive(Display)]
#[display("decorate <{__self}>")]
#[display(default("default"))]
enum Enum3 {
    Foo,  // prints: decorate<default>
    #[display("Baz")]
    Bar,  // prints: decorate<Baz>
}

#[derive(Display)]
#[display("decorate <{__self}>")]
enum Enum4 {
    Foo,  // prints: decorate<Foo>
    #[display(override("Baz"))]
    Bar,  // prints: Baz
}

#[derive(Display)]
#[display("whatever")]
enum Enum5 {
    Foo,  // prints: whatever
    Bar,  // prints: whatever
}

#[derive(Display)]
#[display("whatever {__self}")]
enum Enum6 {
    Foo,  // prints: whatever Foo
    Bar,  // prints: whatever Bar
}

#[derive(Display)]
#[display("whatever {}", append_bang(__self))]
enum Enum6 {
    Foo,  // prints: whatever Foo!
    Bar,  // prints: whatever Bar!
}

#[derive(Display)]
#[display("whatever {self}")]
enum Enum7 {  // complile error: recursive call
    Foo,
    Bar,
}

#[derive(Debug, Display)]
#[display("whatever {self:?}")]
enum Enum8 {
    Foo,  // prints: whatever Enum8::Foo
    Bar,  // prints: whatever Enum8::Bar
}

#[derive(Display, Serialize)]
#[display("whatever {}", serde_json::to_string(self).unwrap())]
#[serde(rename_all = "lowercase")]
enum Enum9 {
    Foo,  // prints: whatever "foo"
    Bar,  // prints: whatever "bar"
}

#[derive(Display)]
#[display("whatever {}", serde_json::to_string(&__self).unwrap())]
enum Enum10 {
    Foo,  // prints: whatever "Foo"
    #[display("Baz")]
    Bar,  // prints: whatever "Baz"
}

This feels quite intuitive and natural for me, except the __self naming of the reserved identifier. I don't have better naming at the moment, but it should be much more intuitive and clearer, and even if it can collide with some existing identifiers, we always may disambiguate it like this:

#[derive(Display)]
#[display("field: {}, full: {__self}", self.__self)]
enum Enum10 {
    #[display("some foo")]
    Foo { __self: u8 },  // prints: field: <u8>, full: some foo
    #[display("bar {__self}")]
    Bar { __self: String },  // prints: field: <String>, full: bar <String>
}
ModProg commented 1 year ago

This feels quite intuitive and natural for me, except the __self naming of the reserved identifier.

I agree, that doesn't really feel amazing, but it would be inline with e.g _0 for tuple structs, (though I think we can reduce to a single underscore).

and even if it can collide with some existing identifiers, we always may disambiguate it like this:

Another example for a (for whatever reason named constant)

#[display("{}", ::module::__self]

I agree that we should keep the plausible scope of this derive macro in mind. If the expression gets too complex, it is probably a good idea to implement Display manually.

Therefor, I would also be fine with saying {self} inside the format string is the variant's Display, and self inside the arguments is the actual self. Though, that either removes the possibility to access the variant's format or would create both _self and self which is probably undesirable.

I don't really like the idea to introduce something new to std::fmt syntax, because:

* We don't have control over it. It easily may be changed in future in the way conflicting with our additions semantically, causing us not only to do redesign, but also causing existing library users' code to become semantically incompatible.

I find that highly unlikely, (though it wouldn't be a breaking change for std, as they currently only allow identifiers, so it makes sense to be rather safe than sorry here).

erratic-pattern commented 1 year ago

I agree here with many of the points that @tyranron makes.

The default(...) specifier feels odd in the case where you're only specifying a top-level attribute. I think in cases where there are per-variant attributes, it makes more sense, such as in:

#[derive(Display)]
#[display("decorate <{__self}>")]
#[display(default("default"))]
enum Enum3 {
    Foo,  // prints: decorate<default>
    #[display("Baz")]
    Bar,  // prints: decorate<Baz>
}

Also +1 to keeping the syntax as close as possible to existing formatting conventions, rather than introducing new symbols.

As far as default vs variant naming, I'm not sure which is more readable, as I can see arguments for each.

@ModProg :

I agree that we should keep the plausible scope of this derive macro in mind. If the expression gets too complex, it is probably a good idea to implement Display manually.

I admit my particular use case is not much different from a manual Display implementation, only saving a few lines of code really, and so may not be worth consideration here if it complicates the formatting syntax.

The main motivation in my case for wanting to derive Display in this way is that I have many of these enums which follow the same pattern, and I'd like to reduce the boilerplate as much as possible to a simple line of code that says "invoke this expression on my enum to get the Display representation".

I think in code that works with HTTP APIs this is probably a common scenario, where you already have serde attributes that describe a textual representation and you'd like to have a Display that's consistent with that representation. I would say with the ubiquity of serde this is probably an integration worth considering in itself (behind a feature gate).

if there's a unifying and elegant design here, I say go for it, but otherwise I don't think it's worth it if it introduces inconsistencies or confusing semantics that clash with existing usage patterns. The examples proposed by @tyranron seem to be the most consistent/clear so far, with the exception of the _self problem.

It might be worth considering having a completely separate form for this type of derive. In fact, I really don't need a format string at all to express what I'm after here. Something like this would work just as well:

#[display(from(serde_plain::to_string(&self).unwrap()))]

This would have the benefit of not requiring special handling of _self in argument contexts, as I think it would very confusing for the self identifier to have different meanings in different contexts.

JelteF commented 1 year ago

@tyranron thanks for that great argument. After reading I agree with pretty much all you said. The only thing I disagree with is the requiring of default. I think @kallisti-dev made a correct characterisation of the problem for that one (and that made me change my stance):

The default(...) specifier feels odd in the case where you're only specifying a top-level attribute. I think in cases where there are per-variant attributes, it makes more sense, such as in:

I think the default call should only be required if there are #[display] attributes on any of the variants (and _self is not used). Because in that case it becomes really hard to understand if the top level #[display] attribute is meant as a prefix or a default.

I also agree with @ModProg that single underscore would be better than two underscores. To stay in line with _0

tyranron commented 1 year ago

@JelteF @ModProg

I also agree with @ModProg that single underscore would be better than two underscores. To stay in line with _0

After thinking about it a few days, it seems that reserving _inner identifier would be the most clear and intuitive option.

#[derive(Display)]
#[display("decorate <{_inner}>")]
#[display(default("default"))]
enum Enum3 {
    Foo,  // prints: decorate<default>
    #[display("Baz")]
    Bar,  // prints: decorate<Baz>
}

I think the default call should only be required if there are #[display] attributes on any of the variants (and _self is not used). Because in that case it becomes really hard to understand if the top level #[display] attribute is meant as a prefix or a default.

Not sure that I've understood you well. Could you demonstrate the idea with few examples?

Initially, I didn't mean to require default in any way. It's always optional. There is a strict separation in semantics between the usual top-level formatting attribute and the default top-level formatting attribute. First means always "wrapping", second means always "default". And due to them being not fully orthogonal by nature, we may naturally and intuitively express "defaulting" via "wrapping without _inner" in some cases (like no variant-level attributes specified). However, there are still cases where we want both "wrapping" and "defaulting".

ModProg commented 1 year ago
#[derive(Display)]
#[display("decorate <{_inner}>")]
#[display(default("default"))]
enum Enum3 {
    Foo,  // prints: decorate<default>
    #[display("Baz")]
    Bar,  // prints: decorate<Baz>
}

This looks reasonable to me, and we would error if:

JelteF commented 1 year ago

@tyranron

Not sure that I've understood you well. Could you demonstrate the idea with few examples?

Basically this would error:

#[derive(Display)]
#[display("default")]
enum Enum3 {
    Foo,
    #[display("Baz")]
    Bar, 
}

But this would be allowed:

#[derive(Display)]
#[display("default")]
enum Enum3 {
    Foo,  // prints: default
    Bar,  // prints: default
}

And this would also be allowed:

#[derive(Display)]
#[display(default("default"))]
enum Enum3 {
    Foo,  // prints: default
    #[display("Baz")]
    Bar,  // prints: Baz
}

This last one could then be combined with a non default call:

#[derive(Display)]
#[display("decorate <{_inner}>")]
#[display(default("default"))]
enum Enum3 {
    Foo,  // prints: decorate<default>
    #[display("Baz")]
    Bar,  // prints: decorate<Baz>
}
JelteF commented 1 year ago

I'm fine with _inner btw

ilslv commented 1 year ago

@tyranron @JelteF @ErichDonGubler @ModProg

I still feel like we are trying to stuff too much into a single macro. This functionality doesn't bring anything new to the table, so basically only a syntactic sugar to reduce boilerplate. I really like how non-sugary the Rust is and I think this is part of it's idiomatic approach.

Is affix functionality useful in terms of reducing a lot of boilerplate? I would argue that yes. Especially for error types:

// mod io
#[display("IO error: {}")]
enum Error {
    #[display("entity not found")]
    NotFound,
    #[display("permission denied")]
    PermissionDenied,
    // and 38 more variants ...
}

Is default or override useful in terms of reducing a lot of boilerplate? I would argue that no. I don't see any clear real-world use case for it where you wouldn't be better off with just manually specifying format for each variant. Also adding support for them isn't a breaking change, so we can reconsider this decision in the future.


About placeholder I'm still convinced that simple {} is the way to go.

{self} is no-go to me, because it can mean different things in different fmt derives:

#[derive(Display, Pointer)]
#[display("addr: {self}")]
#[pointer("{self}")]
enum Pointer<T> {
    #[display("nullptr")]
    Null,
    NonNull(*const T)
}

In the example above #[display("addr: {self}")] means affix formatting, while #[pointer("{self}")] is intended to simply use Display implementation. And imagine this example without #[display("addr: {self}")], very confusing to me.

All special names are frustrating friction point to me as I struggle to remember them, so _inner/__self/@variant are no-go for me.

The only argument against simple {} was about using it somewhere in between (very strange use-case imho), but format syntax covers it flawlessly:

#[derive(Display)]
#[display("{before} | {} | {after}", before = any_expr, after = any_expr)]
struct Affix {
    // ...
}

So I'm still arguing for solution proposed in https://github.com/JelteF/derive_more/issues/142#issuecomment-1323527179

ilslv commented 1 year ago

@ModProg I don't understand what you are trying to argue about

ModProg commented 1 year ago

@ModProg I don't understand what you are trying to argue about

sorry i think I just misunderstood your comment.

But yeah, I guess if we remove support for positional arguments we can fo it the way you proposed. Or we only make "{0}" the variant format. and allow additional positional arguments, though I'm a bit unsure that that it might not create confusion either way.

JelteF commented 1 year ago

@ilslv

About placeholder I'm still convinced that simple {} is the way to go. All special names are frustrating friction point to me as I struggle to remember them, so _inner/__self/@variant are no-go for me.

While I agree that remembering the special names is a friction point when writing the derive. I think optimizing for readability is probably better in this case. With a special name it's instantly clear what it is that is being inserted there. It also gives you something to google for if you are not familiar with the feature

For instance someone not familiar with the feature might be confused by the this derive

#[derive(Display)]
#[display("My error: {}")]
enum Error {
    #[display("not found error: {source}")]
    NotFound{source: HTTPError}
    OtherHTTPError(HTTPError),
    OtherDNSError(DNSError),
}

I think that syntax can reasonably be interpreted as:

#[derive(Display)]
#[display(default("My error: {_0}"))]
enum Error {
    #[display("not found error: {source}")]
    NotFound{source: HTTPError}
    OtherHTTPError(HTTPError),
    OtherDNSError(DNSError),
}

Instead of its actual meaning:

#[derive(Display)]
#[display("My error: {_inner}")]
enum Error {
    #[display("not found error: {source}")]
    NotFound{source: HTTPError},
    OtherHTTPError(HTTPError),
    OtherDNSError(DNSError),
}

Having the _inner special name makes understanding the derive much easier IMHO. And I think it's probably worth the added cost when writing the derive.

I agree that removing support for positional args would solve the confusion of the syntax for the parser, but I don't think it would help this human confusion.

JelteF commented 1 year ago

@ilslv @tyranron what do you think? I feel like this is probably the final change (together with #248) that we want to do before creating 1.0.0

ilslv commented 1 year ago

@JelteF speaking of 1.0.0, I'm not really comfortable releasing so many changes to so many people without the ability to release breaking changes that easily. I would love to see 1.0 as the last breaking release for a substantial amount of time. And as release candidates mechanism isn't that useful in rust ecosystem, I would like to release 0.100 and maybe even couple of more breaking versions before committing to the 1.0.

JelteF commented 1 year ago

Regarding 1.0.0 vs 0.100.0 The main thing I would want to do is release 1.0.0 soon-ish, because I think derive_more is in a stable state. The first 0.99 release was 3 years ago. My plan at that time was to release 1.0.0 soon after, but I never did. Now we have a ton of breaking changes, which make the syntax more consistent and bring it in line with current Rust versions. After these changes I think we just shouldn't break things for a while again.

All this new code also has a lot of tests (a lot of those are thanks to you). So I'm pretty confident that more breaking changes are not necessary for a while. And if I'm wrong, I don't think it's a big deal to release 2.0 quickly after 1.0 if the amount of breakage is tiny and we break for a good reason (i.e. some crazy bug, not some cosmetic cleanup).

ilslv commented 1 year ago

For me releasing 2.0 soon after 1.0 for such popular and foundation library is a big deal. Ecosystem is still fragmented after the release of syn 2.0 and I think it will be like that for a long time.

JelteF commented 1 year ago

There's no difference in amount of split between doing 1.0 and then 2.0 or doing 0.100 and then doing 1.0. You get this split because there's two major versions. We want to release a 1.0, so by releasing 1.0 directly in the happy path there's only 1 major version, and in the bad case if there's some crazy bug then we have two major versions. But if we start with 0.100 and we still want to release 1.0, then even if we don't need to do any more breaking changes we need to release a new major version (or stay at a 0.x release again for 3 years). So I say, let's just bite the bullet and release 1.0 once we have no obvious breaking changes that we want to do.

ilslv commented 1 year ago

There's no difference in amount of split between doing 1.0 and then 2.0 or doing 0.100 and then doing 1.0

I don't agree with that. People have certain expectations about stability of 1.0. By releasing 0.100 I expect us getting bug reports/ideas from early adopters, that may be very helpful. And after 1.0 I expect slow migration of the rest of the ecosystem.

But if we start with 0.100 and we still want to release 1.0, then even if we don't need to do any more breaking changes we need to release a new major version

I don't see any problems with that.

JelteF commented 1 year ago

There's no difference in amount of split between doing 1.0 and then 2.0 or doing 0.100 and then doing 1.0

I don't agree with that. People have certain expectations about stability of 1.0. By releasing 0.100 I expect us getting bug reports/ideas from early adopters, that may be very helpful. And after 1.0 I expect slow migration of the rest of the ecosystem.

People already have certain expectations about derive_more. It's been stable for 3 years. There's 110k repos on github using derive_more. Suddenly starting to do multiple breaking releases in a row is something I think we should do. And thus I don't think we should plan our versioning to that. Our goal with the next release should be: This is a stable version that we don't plan on changing. It does have new features. But they are tested well. Any bugs that don't require breaking changes we will fix soon. Once that require breakage, will probably have to wait for another 3 years to be fixed.

And in the extremely unlikely case where something is completely broken, and requires a new major version to fix. Then we'll likely know within a few hours/days, which makes the ecosystem split problem pretty much a non-problem.

even if we don't need to do any more breaking changes we need to release a new major version

I don't see any problems with that.

Even if there's no actual breaking changes, cargo will still consider it incompatible versions. Thus doing that will split the ecosystem for no good reason.

glandium commented 1 year ago

You /could/ release a 0.100, wait a few weeks/months to see if there are breakages, and if there aren't any, then release a 1.0 and a 0.100.n that depends on 1.0 and reexports everything from it. Not saying you must, but that would be one way to deal with the situation without having to rush a 2.0 to fix problems that require a breaking change while avoiding an ecosystem split.

(I'll add that a 0.99.18 with syn2 changes only would be very welcome too)

JelteF commented 1 year ago

@ilslv

And as release candidates mechanism isn't that useful in rust ecosystem

Why did you say this. After thinking a bit more I agree that it would be good to have some more testing before actually releasing 1.0.0. But I think the nicest way to do that is to release a 1.0.0-beta.1 version.I think beta.1 signals even better that this is a release on which we want some feedback than a 0.100 would. It also shows up nicely in crates.io, with the version being yellow and defaulting to showing the readme for the last actual release (example: https://crates.io/crates/pgrx/versions)

And I think we'll get enough feedback from people since quite a few people seem to want a release for the syn 2.0 support. What do you think?

JelteF commented 1 year ago

@ilslv Also, what do you think of my previous argument for the _inner special name? Apart from the publishing situation, I think we should make a decision on the original issue from this thread:

While I agree that remembering the special names is a friction point when writing the derive. I think optimizing for readability is probably better in this case. With a special name it's instantly clear what it is that is being inserted there. It also gives you something to google for if you are not familiar with the feature

For instance someone not familiar with the feature might be confused by the this derive

#[derive(Display)]
#[display("My error: {}")]
enum Error {
    #[display("not found error: {source}")]
    NotFound{source: HTTPError}
    OtherHTTPError(HTTPError),
    OtherDNSError(DNSError),
}

I think that syntax can reasonably be interpreted as:

#[derive(Display)]
#[display(default("My error: {_0}"))]
enum Error {
    #[display("not found error: {source}")]
    NotFound{source: HTTPError}
    OtherHTTPError(HTTPError),
    OtherDNSError(DNSError),
}

Instead of its actual meaning:

#[derive(Display)]
#[display("My error: {_inner}")]
enum Error {
    #[display("not found error: {source}")]
    NotFound{source: HTTPError},
    OtherHTTPError(HTTPError),
    OtherDNSError(DNSError),
}
ilslv commented 1 year ago

@JelteF unfortunately very few people upgrade to alpha/beta versions. They aren't suggested on cargo update and aren't integrated with other tooling. People to upgrade are usually somehow involved and very invested in the project itself, and I think most people are just getting this crate as a transitive dependency and have no idea of its existence. The project also wasn't really active in a past few years, so most of the people brushed it alway as a "stable" version that isn't subject to any major upgrades.

This problem isn't really unique to this crate, I remember even tokio having very low percentage of beta consumers on 1.0 stabilization.