Consider reimplement it with a tagged-union version

js2xxx commented 8 months ago

The version of dedicated tagged unions can benefit more than the current version, which uses Box<dyn Any>. Here's the comparison:

Pros:

Doesn't need dynamic allocation & dispatching;
Requires looser lifetime bounds to the generics;
Allows deriving common traits (Clone, (Partial)Eq, etc.) easily;
Allows multiple instances with the same type and different tags (e.g. Sum<(u32, String, u32)>).

Con(s):

Requires unsafe code around unions & may cause soundness problems.

I wrote a tagged-union version of this sum type. Check it out here if you're interested. I would also be happy to create a PR and reimplement this crate if needed, although it results in massive code diffs.

spacejam commented 8 months ago

Hey :) I started implementing a tagged-union yesterday, but then realized it might be possible to actually do this with actual enums and maaaaybe avoid some unsafe, but I'm less sure about that, where each TypeSet has acorresponding TypeSet::Enum that may be something like E2<A, B>{ A(A), B(B) } with the same arity as the TypeSet. I'm still working through some of the narrow/broaden transformation steps, and I'm not completely sure it's possible yet, and I think it would still require a 'static bound. But in both the enum and tagged union approaches, narrowing and broadening are no longer strictly type-level operations because the size on the stack may need to change, undermining some of the benefits of avoiding boxing. To be honest I'm skeptical that there are very many workloads that really justify the additional complexity on the cold path. But it's a fun problem to reason about!

spacejam commented 8 months ago

One of the reasons I'm really curious about the enum-based approach is because if it actually works, users could pull a subset of locally-addressable concerns out of a OneOf and then use normal rust pattern matching on the result. I still don't know if it will be possible but the idea of an API like this feels pretty compelling to me:

let one_of_3: OneOf<(Local1, Local2, Global)> = OneOf(Local2::default());

// assumes fn returns Result<_, OneOf<(Global,)>>
let local_concerns: OneOf<(Local1, Local2)> = one_of_3.narrow()?; 

match local_concerns.enum() {
    E2::A(Local1) => {},
    E2::B(Local2) => {},
}

js2xxx commented 8 months ago

I suspect the enum repr version would cause the narrowing and broadening operations to cost a lot.

Consider a specialized case of narrowing:

fn narrow<A, B, C, D, E>(repr: E5<A, B, C>) -> Result<C, E4<A, B, D, E>> {
    match repr {
        // 1. Matched.
        E3::C(c) => Ok(c),

        // 2. Unmatched, tag unchanged.
        E3::A(a) => Err(E2::A(a)),
        E3::B(b) => Err(E2::B(b)),

        // 3. Unmatched, tag changed.
        E3::D(d) => Err(E2::C(d)),
        E3::E(e) => Err(E2::D(e)),
    }
}

In this case, the pattern matching is forced to be expanded to 5 arms, while the conditions can be classified into 3 groups (commented above), which in the tagged union case can be reduced to 3 comparisons.

I don't know whether the compiler is smart enough to eliminate those redundant branches, but I think we cannot rely on that behavior since it lies in the scope of optimization.

By the way, the conditions can be further reduced to 2 groups with the Box<dyn Any> version, since the tag equals its type ID in that case, unnecessary to be changed, which seems to have the best performance.

EDIT: Here is my implementation of the enum version. It has fewer code lines but involves more macros and complex generic manipulations compared to the tagged union version.

komora-io / terrors

Consider reimplement it with a tagged-union version #3