ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Proposal: allow coercions between optionals and error unions with compatible payloads #16765

Open mlugg opened 1 year ago

mlugg commented 1 year ago

Motivation

Consider the following code:

test {
    _ = f();
}
fn f() ?u32 {
    return g();
}
fn g() ?u8 {
    return 123;
}

This currently fails with a compile error:

[mlugg@polaris test]$ zig test foo.zig
foo.zig:5:13: error: expected type '?u32', found '?u8'
    return g();
           ~^~
foo.zig:5:13: note: optional type child 'u8' cannot cast into optional type child 'u32'
foo.zig:4:8: note: function return type declared here
fn f() ?u32 {
       ^~~~

The error here is a little misleading, since clearly u8 can coerce to u32. However, we currently require the child types of optionals to be in-memory coercible. That effectively means that the bytes of the old type can be reinterpreted as the new type, with both types having the same size: for instance, u32 is in-memory coercible to c_uint (on platforms where C unsigned int is 32 bits), and *[n]T is in-memory coercible to [*]T. This restriction slightly simplifies the compiler implementation, but it leads to this confusing error message.

The workaround for this is to use orelse to unwrap the optional and re-wrap it, making the body of f read return g() orelse null. However, this seems redundant, and may be confusing for users ("isn't x orelse null always the same as x?").
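For reference, the workaround described above can be sketched as a complete file. This reuses f and g from the motivating example; the test name is only illustrative.

```zig
const std = @import("std");

fn g() ?u8 {
    return 123;
}

// Workaround under status quo: `orelse` unwraps the ?u8, and the u8
// payload (or the null fallback) then coerces to the ?u32 return type.
fn f() ?u32 {
    return g() orelse null;
}

test "orelse null re-wraps the optional" {
    try std.testing.expectEqual(@as(?u32, 123), f());
}
```

Under the proposal, the body of f could go back to a plain return g().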

The same issue applies to error unions, where the workaround is eu catch |e| e, or - in a return statement - return try eu.
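A minimal sketch of both error-union workarounds follows. The error set E and the function names viaCatch and viaTry are made up for illustration and do not come from the compiler or standard library.

```zig
const std = @import("std");

// Hypothetical error set for the example.
const E = error{Oops};

fn g() E!u8 {
    return 123;
}

// Workaround 1: unwrap with catch and hand the error back unchanged.
fn viaCatch() E!u32 {
    return g() catch |e| e;
}

// Workaround 2: in a return statement, `try` does the same unwrapping.
fn viaTry() E!u32 {
    return try g();
}

test "error union payload is widened by unwrapping first" {
    try std.testing.expectEqual(@as(u32, 123), try viaCatch());
    try std.testing.expectEqual(@as(u32, 123), try viaTry());
}
```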

Proposal

Allow the following coercions:

The only real downside I see to this is that it introduces a kind of ambiguity around nested optionals / error unions: how does ?T coerce to ??T? In status quo, we effectively get .{ .payload = x }, but the new coercions would also make it theoretically valid for the language to instead unwrap the optional, coerce T to ?T, then re-wrap, so null gives @as(??T, null) rather than @as(??T, @as(?T, null)).

However, I don't think this is a major issue. The status quo behavior for this coercion seems fairly clearly better, and I believe is what most users would expect, so we can simply define the coercions to retain this behavior, by prioritising the "coerce value to child type" method. Nested optionals / error unions are fairly rare to come across (note that this does not impact types of the form E!?T), so I highly doubt this will cause any real confusion.

In fact, we already have a case like this - I used it two paragraphs up! @as(??T, null) always makes the outermost optional null, even though it could be argued that it'd be valid for it to make the outer optional a payload with the inner value being null. This doesn't cause a significant amount of confusion, so I don't see a reason this proposal would either.
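The two nested-optional behaviors discussed above can be checked directly. This is a small sketch of the status quo as I understand it, using u8 as a stand-in for T.

```zig
const std = @import("std");

test "status quo: a ?T value becomes the payload of ??T" {
    const inner: ?u8 = null;
    // The whole ?u8 value is wrapped, so the outer optional is non-null
    // and the inner optional is null.
    const outer: ??u8 = inner;
    try std.testing.expect(outer != null);
    try std.testing.expect(outer.? == null);
}

test "status quo: a null literal makes the outermost optional null" {
    const outer: ??u8 = null;
    try std.testing.expect(outer == null);
}
```

The proposal keeps the first behavior by preferring the "coerce value to child type" rule whenever it applies.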

bgourlie commented 9 months ago

Would this also apply to tagged union coercions? Consider the following:

const std = @import("std");

const FooTag = enum {
    bar,
    baz,
};

const Foo = union(FooTag) {
    bar,
    baz: usize,
};

test "Would this work?" {
    const foo: ?Foo = Foo{ .baz = 100 };
    try std.testing.expectEqual(@as(?FooTag, FooTag.baz), foo);
}

mlugg commented 9 months ago

Yes: since a Foo value can coerce to FooTag, this proposal would make it so that a ?Foo value could coerce to ?FooTag.
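In current Zig a tagged union value coerces to its tag enum, so the optional case can be handled today by unwrapping first. Here is a minimal sketch reusing Foo and FooTag from the question; the if/else re-wrap is the assumed status-quo workaround, not something the proposal itself specifies.

```zig
const std = @import("std");

const FooTag = enum { bar, baz };

const Foo = union(FooTag) {
    bar,
    baz: usize,
};

test "status quo: unwrap, coerce the union to its tag, re-wrap" {
    const foo: ?Foo = Foo{ .baz = 100 };
    // Foo coerces to FooTag once the optional has been unwrapped.
    const tag: ?FooTag = if (foo) |f| @as(FooTag, f) else null;
    try std.testing.expectEqual(@as(?FooTag, .baz), tag);
}
```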

Fri3dNstuff commented 2 months ago

To expand on this proposal, why not add the ability to coerce the child type for all of the (non-remote) container types in Zig? So, in addition to optionals and error unions, implement the ability on arrays as well.

This seems to already work with vector types, so it's not that far-fetched...

// this works, apparently
var a: @Vector(10, u16) = undefined;
var b: @Vector(10, i32) = a;
_ = .{ &a, &b };

My reasoning in favour of expanding the proposal for arrays is mainly for simplicity; if coercion also works inside of arrays the stated rule can be "for all types T and U where T coerces to U, a container C T coerces to a container C U".

Otherwise I feel the coercion rules would be needlessly complicated: "for all types T and U where T coerces to U, a container C T coerces to a container C U. unless that container is an array, then it doesn't work. unless unless the child type is in-memory-coercible, then it works again, because... reasons..."

The only downside I (currently) see to this is that this kind of coercion may be somewhat expensive if the array is long, but in that case I feel like vector coercion should be reconsidered too.
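To make the cost concrete, here is roughly what an array coercion would have to do, written out by hand. The helper name widen is hypothetical, and the multi-object for loop assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Hypothetical helper: converts each element, since [n]u16 does not
// currently coerce to [n]i32. The work is linear in the array length.
fn widen(comptime n: usize, src: [n]u16) [n]i32 {
    var dst: [n]i32 = undefined;
    for (&dst, src) |*d, s| {
        d.* = s; // u16 coerces to i32 without loss
    }
    return dst;
}

test "element-wise widening of an array" {
    const out = widen(3, .{ 1, 2, 65535 });
    try std.testing.expectEqualSlices(i32, &[_]i32{ 1, 2, 65535 }, &out);
}
```

An implicit coercion would hide this loop behind an assignment, which is the expense concern raised above.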