ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
Disallow omitting unused capture value in `switch` if not `void` #14137

Open wooster0 opened 1 year ago

wooster0 commented 1 year ago

I'm sorry if this has already been discussed somewhere but I couldn't find any discussion on this.

Let's say I have this code:

const Value = union(enum) {
    tree: u8,
    diamond,
    house: u32,
};

pub fn main() void {
    const value = Value{ .tree = 5 };
    switch (value) {
        .tree => |height| {
            _ = height;
        },
        .diamond => {},
        .house => |size| {
            _ = size;
        },
    }
}

But now I want to add a payload to `diamond` by changing its payload type to a non-`void` one. The compiler will not complain at all. I think at least `.diamond => |_| {},` should be mandatory if the payload type is not `void`. Silently accepting the old prong seems very unlike the way Zig normally works. I recently had a case where I added a payload to a tag, worked on something else, and only later realized that my code was wrong because it never handled the new payload.
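A minimal sketch of the situation described above, assuming a hypothetical `u16` payload is later added to `diamond`. The function name `describe` is illustrative only. The `.diamond` prong is left exactly as it was written for the `void` case, and the switch is still accepted:

```zig
const std = @import("std");

const Value = union(enum) {
    tree: u8,
    diamond: u16, // hypothetical payload, added after the switch below was written
    house: u32,
};

// Illustrative helper: returns the name of the active tag.
fn describe(value: Value) []const u8 {
    return switch (value) {
        .tree => |height| blk: {
            _ = height;
            break :blk "tree";
        },
        // No capture here: the new payload is ignored without any diagnostic.
        .diamond => "diamond",
        .house => |size| blk: {
            _ = size;
            break :blk "house";
        },
    };
}

pub fn main() void {
    std.debug.print("{s}\n", .{describe(.{ .diamond = 3 })});
}
```

Nothing in the `.diamond` prong signals that there is now a value to handle, which is the refactoring hazard this proposal is about.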

Advantages:

  1. It allows for more compiler-driven development.
  2. It catches mistakes (the primary use case of errors for unused things).
  3. It gives you a chance to double-check other related spots, too.
  4. Explicitness.
  5. Might be easier to read.
  6. In a switch, makes it easier to tell the union apart from other tagged unions with the same tag names.
  7. Makes refactoring easier and less error-prone. Normally Zig is good at fearless refactoring.

Also, as always, --autofix could fix this automatically and would do away with any annoyance of this.

lamersclegacy commented 1 year ago

Honestly, although it's a little extra typing, I do like the explicitness; I'm not a fan of `.diamond => |_| {}` though.

mnemnion commented 1 week ago

One potential issue I see with this idea is the case of several "do nothing" cases living in one prong, one of which carries a payload, or is modified to carry a payload.

Requiring a discard gives the prong a capture, and that capture needs a single type; void and whatever the payload type is aren't compatible. That would force all the ignored payloads into their own prongs, or at least one prong per payload type, and it seems to me this would obscure the logic of the switch.

Example:

const std = @import("std");

const HeteroUnion = union(enum(u8)) {
    fee,
    fie,
    foe: u32,
    fum: f32,
};

fn illegalHeteroSwitch(ht: HeteroUnion) void {
    switch (ht) {
        .fee, .fie => std.debug.print("void types\n", .{}),
        .foe, .fum => |_| std.debug.print("number types\n", .{}),
    }
}

fn legalHeteroSwitch(ht: HeteroUnion) void {
    switch (ht) {
        .fee, .fie => std.debug.print("void types\n", .{}),
        .foe, .fum => std.debug.print("number types\n", .{}),
    }
}

If you call illegalHeteroSwitch it won't compile, because the discard can't produce a compatible type. So there's a good reason not to require discards of payloads.

It's common for a switch to be interested in only a limited subset of a union's tags, and when that's true it's still good practice to switch exhaustively, to force a decision about whether any newly added tag is of interest. Requiring type-specific discards would make those switches visually dominated by the discarded cases.

Another thing this points at is that sometimes a switch is only interested in the category of the tag; in the example above, we want to separate void types from number types. It's not a great example for this point, because in real code the categories are semantic more than they are type-based.

In that case the mandatory discard actively gets in the way, because it would be impossible to have a switch prong covering payloads of heterogeneous types. We could explicitly pull off the tag and switch on that, but then no prong can capture a payload; a prong that wants the payload would have to reference the original union value and get it that way.

All of that is feasible, but it would lead to a worse experience, and it subtly degrades the balance between a tagged union and its tag type: one can no longer switch on a tagged union as if it were the enum, and that's often a useful thing to be able to do.
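For reference, the "pull off the tag and switch on that" workaround mentioned above might look roughly like the sketch below. It assumes `std.meta.activeTag` is used to obtain the tag; the function names `category` and `foeOrZero` are hypothetical. The second function shows the cost: with no capture available, the payload has to be read back through the original union value.

```zig
const std = @import("std");

const HeteroUnion = union(enum(u8)) {
    fee,
    fie,
    foe: u32,
    fum: f32,
};

// Switching on the tag alone: prongs can freely mix payload types.
fn category(ht: HeteroUnion) []const u8 {
    return switch (std.meta.activeTag(ht)) {
        .fee, .fie => "void types",
        .foe, .fum => "number types",
    };
}

// A prong that does want a payload cannot capture it from the tag,
// so it reads the field from the original union value instead.
fn foeOrZero(ht: HeteroUnion) u32 {
    return switch (std.meta.activeTag(ht)) {
        .foe => ht.foe,
        .fee, .fie, .fum => 0,
    };
}

pub fn main() void {
    const ht = HeteroUnion{ .foe = 7 };
    std.debug.print("{s} {d}\n", .{ category(ht), foeOrZero(ht) });
}
```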