Peternator7 / strum

A small rust library for adding custom derives to enums
https://crates.io/crates/strum
MIT License
1.8k stars 152 forks source link

Add `EnumDeref` trait and derive #315

Closed Syndelis closed 10 months ago

Syndelis commented 1 year ago

About this PR

This PR introduces a new trait and derive EnumDeref for automatically implementing Deref and DerefMut in enumerators composed of new-type-style variants.

Motivation

The primary motivation for this feature is included as one of the tests and doc-tests: "tagged boxes". With this trait, it's possible to avoid the usage of Box<dyn Trait> for grouping structs based on behavior. The advantages of using an enum instead of the former are clear as day:

Peternator7 commented 10 months ago

Hey @Syndelis, I'm working through the backlog of PR's. I see the benefit of using enum's instead of boxes, but there is already a crate that does this (albeit slightly differently) that's fairly popular, enum_dispatch so I'm not sure the benefit of putting something similar into strum

Syndelis commented 10 months ago

Hey @Syndelis, I'm working through the backlog of PR's. I see the benefit of using enum's instead of boxes, but there is already a crate that does this (albeit slightly differently) that's fairly popular, enum_dispatch so I'm not sure the benefit of putting something similar into strum

Maybe I didn't give enough time to try enum_dispatch, but I couldn't reproduce any of the tests implemented in this PR with that crate. For example, could you figure out how to re-write this test?

image

It feels like this crate assumes you not only "own" the enum implementation, but also the trait declaration.

Peternator7 commented 10 months ago

I believe you would implement Walker like this:

#[enum_dispatch]
enum AntiBox {
    Person,
    Ant,
}

#[enum_dispatch(AntiBox)]
trait Walker {
    fn distance_walked(&self) -> f32;
    fn walk(&mut self);
}

let a: AntiBox = Person::new().into();

a.distance_walked(); 

I don't think it supports Deref Coercions. Since deref's are typically for pointer types anyway (of which there are few), manually implementing Deref for the occasional type that needs it seems okay to me. I don't think enum_dispatch supports implementing methods on traits from other crates at the moment, but maybe that could be a feature addition to that crate.

Syndelis commented 10 months ago

I don't think it supports Deref Coercions. [...]. I don't think enum_dispatch supports implementing methods on traits from other crates at the moment, but maybe that could be a feature addition to that crate.

So it seems like the use cases of enum_dispatch and what's implemented in this PR don't overlap, isn't it?

manually implementing Deref for the occasional type that needs it seems okay to me

I understand it's not a feature for everyone, but if, in order to group 3 structs by behavior, I was given the option to either manually implement Deref and DerefMut 3 times or use a macro, I'd definitely opt for using the macro

Peternator7 commented 10 months ago

I guess that's what I'm trying to get at. IMo, if you're trying to group structs by behavior, the other crate is probably a better approach. It feels cleaner to implement a trait directly on the enum rather than Deref to a dyn type.

Syndelis commented 10 months ago

I guess that's what I'm trying to get at. IMo, if you're trying to group structs by behavior, the other crate is probably a better approach. It feels cleaner to implement a trait directly on the enum rather than Deref to a dyn type.

It is cleaner indeed, but only if you have access to the trait's declaration. If the trait is foreign, you would maybe need to make a supertrait + blanket trait impl, and that honestly sounds worse than what we have with this PR, wouldn't you agree?

Peternator7 commented 10 months ago

I don't think that's the true alternative, if you don't have access to the trait declaration, your options are:

  1. Manually implement the trait on your enum.
  2. Use this macro to implement a deref coercion which is not the same thing semantically as 1, but will function about the same for most code.

Ignoring the manual vs procedural implementation aspect, I think most of the community would say that the proper way to handle this is to implement the trait on the enum rather than to deref to the variants as a dyn Trait. I think there are situations where that might be the best solution, but my chief concern is that this makes it easier for people to do the wrong thing, and I want to avoid that. Then there's also the stylistic question of is this really best done with Deref or would an inherent method be better for a readability perspective?

my_enum.as_walker() -> &dyn Walker

Basically, there might be value in this feature in the rust ecosystem, but I don't think strum is the right home for it.

Syndelis commented 10 months ago

I understand I'm not gonna be able to convince you to merge this since you've already deemed it "the wrong thing" while seemingly agreeing that "there are situations where that might be the best solution", but I wanted to address some of the points you've made regardless.

I think most of the community would say that the proper way to handle this is to implement the trait on the enum rather than to deref to the variants as a dyn Trait

I'm finding it hard to see how that would be any different from the user's perspective? Wouldn't you be able to call my_enum.method() regardless? And besides, both solutions would need to match on all variants eitherway to only then call in the method. Basically, this:

match my_enum {
  Var1(v1) => v1.method(),
  Var2(v2) => v2.method(),
}

Is equivalent to:

match my_enum {
  Var1(v1) => *v1,
  Var2(v2) => *v2,
}.method()

So the only real difference is the amount of code this macro hides from the user, is it not? In fact, it may be impractical to manually implement the trait on the enum if the number of methods is significative. Just imagine a trait with 20 methods: do you think writing the same match 20 times is the better solution here?

Then there's also the stylistic question of is this really best done with Deref or would an inherent method be better for a readability perspective? my_enum.as_walker() -> &dyn Walker

Would you have merged this PR if, instead of implementing Deref, it created and implemented a pseudo-randomly-named trait that included the method fn as_<deref-target-name-normalized>(&self) -> <deref-target-name>? Would that be better than using the standard library's trait?

Ignoring the manual vs procedural implementation aspect [...]

Well, then you're ignoring the whole feature. As I understand, up to this point you've suggested another crate that does not cover the same use cases that this macro does and now suggested that the workaround should be manually implementing what arguably might as well be an impossible task (depending on the number of methods). For a crate whose sole purpose is to provide macros to prevent the user from having to manually implement a lot of stuff on enums, it feels weird to bring up this as an argument against merging a PR.

I think you're unfairly judging the usefulness of this implementation because you haven't had to deal with the problem it solves. Believe me, I didn't make this simply because I had a spare sunday; I had a problem that I couldn't solve with whatever tools were available.

Peternator7 commented 10 months ago

Hey @Syndelis, maybe I could've been clearer about how they're different. Here's a rust-playground link that shows how rustc generates different code between the 2 approaches:

enum FavoriteNumberWrapper {
    U(usize),
    S(String),
    B(bool)
}

#[inline(never)]
fn do_work_dyn(f: &FavoriteNumberWrapper) -> usize {
    # Deref Approach
    f.as_favorite_number().lookup_favorite_number()
}

#[inline(never)]
fn do_work_impl(f: &FavoriteNumberWrapper) -> usize {
    # Match Arms / Direct Impl
    f.lookup_favorite_number()
}

And the generated assembly copied out of rust playground. For each function, there's 3 branches, one for each variant as expected. The difference is that if you deref/upcast to a dyn object, the processor needs to spend time looking up the vtable before jumping again to the function body for the specific type. The version that calls the function on each variant directly is easier for the inliner to understand and as a result, each branch gets inlined which will generally be faster.

playground::do_work_dyn:
    movzx   eax, byte ptr [rdi]
    test    rax, rax // Check if it's a FavoriteNumberWrapper::U
    je  .LBB11_5
    cmp eax, 1
    jne .LBB11_3
    add rdi, 8
    lea rax, [rip + .L__unnamed_2] // load vtable
    jmp qword ptr [rax + 24]   // tail call

.LBB11_5:
    add rdi, 8
    lea rax, [rip + .L__unnamed_3] // load vtable
    jmp qword ptr [rax + 24] // tail call

.LBB11_3:
    inc rdi
    lea rax, [rip + .L__unnamed_4] // load vtable
    jmp qword ptr [rax + 24] // tail call

playground::do_work_impl:
    movzx   eax, byte ptr [rdi]
    test    rax, rax 
    je  .LBB12_5 // Handle FavoriteNumberWrapper::U
    cmp eax, 1
    jne .LBB12_3 // Handle FavoriteNumberWrapper::B
    mov rax, qword ptr [rdi + 24] // Handle FavoriteNumberWrapper::S
    ret // No tail call because code was inlined

.LBB12_5:
    mov rax, qword ptr [rdi + 8] 
    ret // No tail call, function was inlined

.LBB12_3:
    movzx   eax, byte ptr [rdi + 1]
    ret // No tail call, function was inlined

I appreciate the discussion, and I'm sorry it doesn't look like we're going to reach a consensus. Because this crate is reasonably popular in the community, I do think it's important to steer people towards what I believe is broadly the best practice. That's not to say using dyn dispatch is bad or that a macro that generates it doesn't solve a real problem. Clearly it did for you, but I don't think it's a good fit here like I said above. I would of course encourage you to publish it as it's own crate with an explainer on the problems it can solve for people.

Syndelis commented 10 months ago

I appreciate your time for making this example and acknowledge that performance will be better when not using dyn. I want to point out, however, that your example features a trait with only a single method, which of course isn't a use case for this macro; it's plain easier to just implement the trait directly into the enum than add a whole dependency just for that.

I'd ask you, instead, to try an example with a reasonable amount of methods for a trait, say, 5. Hopefully you may agree with me that the readability and maintainability trade-offs of such structure aren't worth the performance improvements. After all, we can always make compilers optimize better (such as in your example with do_work_dyn) but code can only be as readable as we write it.

For example, this:

#[derive(EnumDeref)]
#[strum_deref_target(MyTrait)]
enum MyEnum {
  Var1(Var1),
  Var2(Var2),
  Var3(Var3),
}

Should be better for maintainability over this:

enum MyEnum {
  Var1(Var1),
  Var2(Var2),
  Var3(Var3),
}

impl MyTrait for MyEnum {
  fn method_1(&self) {
    match self { // Code omitted
    }
  }
  fn method_2(&self) {
    match self { // Code omitted
    }
  }
  ..
}

Not to mention that new methods added in to MyTrait require modifications to the enum's implementation.

I'm not dismissing the performance case here, but I think it doesn't apply to the same class of problems. If you have a trait with one or two methods it's definitely the better approach, but I think it becomes unsustainable rather quickly past this point.

Peternator7 commented 10 months ago

I'm going to close this PR because I don't think strum is the right home for it. Clearly, this solution has worked very well for you and that's awesome, but it's not something I would use, and I ultimately have to maintain the code so I can't accept it.

Having said that, if you do decide to publish a crate with this feature, I'd be happy to link it in strum's README to improve discoverability. I'm sure there are others in the community who would benefit from this package based on how much it's benefitted your dev productivity.

Syndelis commented 10 months ago

I appreciate your time regardless