ziglang / zig


Proposal: Tuples as semantic sugar for normal structs #7485

Open bb010g opened 3 years ago

bb010g commented 3 years ago

Currently, tuples (a.k.a. anonymous list literals) act like a supertype of both structs & arrays. This subtyping relationship is unique in Zig, and I don't think it maintains the same standard of clarity as the rest of Zig. I propose that the unique tuple type currently in Zig be removed and replaced with three things: a concept of "tuple" structs (think of a trait-style type predicate on struct types), tuple semantic sugar for structs that mirrors the current tuple syntax, and formal extensions to the domains of the array operators so that they cover tuples (tuple structs) as well as arrays.

This proposal is formatted as an outline for implementation. Documentation would likely want to teach this differently. (Currently, tuples are severely under-documented.)

Proposal

Rationale

Zig already has structs. Structs provide constant-time access to an arbitrary number of fields, which tuples also should provide (but for numerically-indexed fields). Structs are already partially the base for tuples in Zig. Existing structs, with some semantic sugar, can meet all the requirements of tuples too.

If you were to make a std.meta.trait.isTuple function on structs, what should it match? I think a strict definition of tuple is appropriate here: at most one non-integer-named field, .len, typed usize, and at least the integer-named fields from .@"0" up to .len (exclusive). I don't know if fields from .len and up would be okay; they could appear in cases like resizing a tuple with fields to a 0-tuple by assigning zero to .len. The presence of extra fields would be subtle, but requiring their absence would mean that shrinking a tuple involves dropping all of its higher-indexed fields. (Growing tuples at runtime without changing their type would be impossible due to the immutability of types, but shrinking tuples at runtime without changing their type would be possible if .len could be assigned a lower value later during comptime.) Let's go with allowing the extra fields for now.

Tuple literal syntax should be semantic sugar for anonymous struct syntax: a tuple of .{"a", "b", "c"} should be equivalent to .{.len = 3, .@"0" = "a", .@"1" = "b", .@"2" = "c"}. This is easy to explain, works with existing mental models about structs, and makes the construction of tuples using @Type obvious to those who know how to construct structs.
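For comparison, here is a minimal sketch of how a tuple literal already behaves in existing Zig (this is not the desugaring proposed above): elements are reachable both by index and by integer-named field, and .len is available. The name letters is only illustrative.

```zig
const std = @import("std");

// Illustrative tuple of three comptime-known string literals.
const letters = .{ "a", "b", "c" };

test "tuple elements are reachable by index and by integer-named field" {
    // .len reports the number of elements.
    try std.testing.expect(letters.len == 3);
    // Both spellings resolve to the same element.
    try std.testing.expectEqualStrings("a", letters[0]);
    try std.testing.expectEqualStrings("a", letters.@"0");
    try std.testing.expectEqualStrings("c", letters.@"2");
}
```

Note that there is no user-visible .len field stored alongside the elements here; the explicit .len field is part of the proposal only.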

Tuples should be able to use the originally array-only square brace indexing syntax: .{true, false}[0] == true. What structs shall this syntax work for, though? Say you're constructing a tuple from scratch using @Type, and have something close to var tuple: .{len: usize, @"0": bool, @"1": bool} = undefined;. If only tuple structs are allowed to be square brace indexed, then tuple[0] would be invalid, as tuple.len is undefined. If instead all structs are allowed to be square brace indexed, then tuple[0] = false; works right off the bat, and tuple.len = 2; can be executed later. Additionally, if all structs are allowed in square brace indexing, an easy explanation of the syntax may be presented: "On arrays, foo[0] is the first element of the array foo. On structs, bar[0] is the field @"0" of the struct bar, equivalent to bar.@"0"." Missing indices on structs are treated like normal missing struct fields. Zig remains simple, and we don't have the current situation of tuple[0] and tuple.@"0" both magically resolving to the same (struct) field.

For loops should be able to loop over tuples as well as arrays. An easy way to achieve this is to directly tie the semantics of for loops to the semantics of square brace indexing and .len "field" access syntax. The for loop for (foo) |*pt, ix| { block; } is equivalent to { var ix: usize = 0; while (ix < foo.len) : (ix += 1) { const pt = &foo[ix]; block; } }. I don't know if we want to standardize on that exact semantic sugar, but it certainly makes for a helpful teaching tool. Thanks to our previous extension of square brace indexing to handle tuples, for loops just work with tuples, and we now have a pattern for any future for-loop–able objects.
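The expansion described above can be sketched on a plain array, where it already works with a runtime index. The helper name incrementAll is hypothetical and exists only for this illustration; for a tuple, the index would additionally have to be comptime-known.

```zig
const std = @import("std");

// Hypothetical helper: visits every element through the expanded
// while-loop form (index, .len check, pointer to element) instead of
// using a for loop.
fn incrementAll(foo: *[3]u8) void {
    var ix: usize = 0;
    while (ix < foo.len) : (ix += 1) {
        const pt = &foo[ix]; // pointer to the current element
        pt.* += 1;
    }
}

test "while-loop expansion visits every element" {
    var values = [_]u8{ 1, 2, 3 };
    incrementAll(&values);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 2, 3, 4 }, &values);
}
```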

The first example for tuples in current Zig's documentation is var array: [4]u8 = .{11, 22, 33, 44};. Tuples need to work as constant arrays. If tuples aren't a supertype of arrays, this syntax needs to keep working. The coercion of a non-empty tuple to an array of length .len is attempted for both comptime tuples & non-comptime tuples, but a non-comptime tuple attempting to coerce will error, which helps keep this coercion from becoming a footgun: if a tuple stops being comptime, an error will occur instead of potentially unnoticed behavior. Formalizing this tuple-to-array coercion allows tuples to be transformed as structs without any negative impact on their functionality as array constants. Compile-time struct transformers won't need to worry about special cases for tuples (and as per the Zen of Zig, "edge cases matter").
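A small sketch of the documented array-initialization case that has to keep working, written as a test so it can be checked with zig test. The constant name array follows the documentation example; nothing here models the proposed .len-based rule.

```zig
const std = @import("std");

// The anonymous list literal on the right takes the array type
// given on the left.
const array: [4]u8 = .{ 11, 22, 33, 44 };

test "anonymous list literal initializes a fixed-size array" {
    try std.testing.expect(array.len == 4);
    try std.testing.expect(array[0] == 11);
    try std.testing.expect(array[2] == 33);
}
```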

Zig currently features ~undocumented operators~ for tuples, based on array operators. Array concatenation (++) works on tuples, as well as array multiplication (**). These operators are formally extended to tuples, and they return tuple-literal–style tuples, even when given inputs with non-tuple fields.
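As a rough illustration of the existing operator behavior on comptime-known tuples (not of the extended semantics proposed here), the following sketch concatenates and repeats tuple literals. The names joined and repeated are illustrative only.

```zig
const std = @import("std");

// Concatenation of two tuples with differently typed elements.
const joined = .{ 1, "two" } ++ .{true};

// Repetition of a tuple, analogous to array multiplication.
const repeated = .{ 1, 2 } ** 2;

test "++ and ** produce new tuples" {
    try std.testing.expect(joined.len == 3);
    try std.testing.expect(joined[0] == 1);
    try std.testing.expect(joined[2] == true);

    try std.testing.expect(repeated.len == 4);
    try std.testing.expect(repeated[3] == 2);
}
```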

Prospective changes

Besides preferences on how to (not) extend tuples, this proposal does break with current Zig in a couple of areas, so we can choose what current semantics to preserve.

Tuples containing integer fields above the value of .len

See the rationale section for discussion. I wouldn't be opposed to the extra flexibility, but being able to vary a tuple's .len between 0 and the number of available contiguous integer fields going up from zero is kinda a strange behavior.

The proposal currently allows these fields in tuples, due to generally allowing other non-tuple fields. To allow if non-tuple fields have been excluded:

Structs containing both tuple & non-tuple fields are (not) tuples

Zig originally only allowed for the creation of tuples through tuple syntax (anonymous list literal syntax), but with the addition of struct-friendly @Type, tuples can also be created through values typed with type_info.Struct.is_tuple == true. These current-Zig tuples can be equipped with bonus fields (Godbolt):

const builtin = @import("builtin");
const meta = @import("std").meta;
const Null = @TypeOf(null);

comptime {
    const BonusTupleType = blk: {
        var info: builtin.TypeInfo = @typeInfo(meta.Tuple(&[_]type{u8, u8}));
        const template = @typeInfo(struct {bonus: u8});
        info.Struct.fields = info.Struct.fields ++
            &[_]builtin.TypeInfo.StructField{template.Struct.fields[0]};
        break :blk @Type(info);
    };
    var bonus_tuple: BonusTupleType = undefined;
    bonus_tuple[0] = 0;
    bonus_tuple[1] = 1;
    bonus_tuple.bonus = 2;
    @compileLog(bonus_tuple);
    @compileLog(bonus_tuple[0], bonus_tuple[1]);
    @compileLog(bonus_tuple.bonus);
}

Compile log output:

| (struct struct:11:26 constant)
| 0, 1
| 2

Whether this ability is desirable, I don't know. You can't create a tuple like this through tuple literal syntax, and it's not obvious that your tuple is actually a full struct behind the scenes too, due in part to syntax like .{"foo", "bar", .baz = true} being invalid. Current Zig doesn't seem to want tuples to "hide" fields like that. For advanced types that want to hold a tuple, a field containing the tuple would probably work fine.

Also, we run into the behavior of tuples coercing implicitly into arrays. The proposal currently assumes that a struct with .len greater than zero and at least integer fields from @"0" up to .len is a tuple, and should be coerced. Care is taken when no integer fields are present, as we don't want structs that merely happen to have a field named .len, without trying to be tuples, to coerce like tuples. Only a struct of the form struct { len: usize } will try to coerce, and only comptime .{len: 0} will succeed in coercion (other structs, e.g. .{len: 42}, will fail coercion with a compile-time error). (This protection may be bypassed through explicit casting.) If structs not desiring tuple compatibility are likely to have at least fields struct {len: usize, @"0": …}, then this coercion functionality is a footgun, and non-tuple fields should be excluded from tuples.

In the proposal, these "bonus" non-tuple fields are allowed. To fully exclude non-tuple fields:

This isn't covered, but denying non-tuple fields should probably be paired with denying non-tuple declarations as well.

Extra notes

Currently, tuple literals can cause some unique internal compiler errors, different even from their @Type-constructed counterparts (Godbolt):

const builtin = @import("builtin");
const meta = @import("std").meta;
const Null = @TypeOf(null);
fn tryTypeInfo(comptime type_info: builtin.TypeInfo, comptime field_ix: usize) void {
    @compileLog(type_info.Struct.fields[field_ix].field_type);
    @compileLog(type_info.Struct.fields[field_ix].default_value);
}
comptime {
    tryTypeInfo(@typeInfo(struct {@"0": *const Null}), 0);
    tryTypeInfo(@typeInfo(meta.Tuple(&[_]type{*const Null})), 0);
    tryTypeInfo(@typeInfo(@TypeOf(.{&null})), 0);
}

Compiler output:

| *const (null)
| null
| *const (null)
| null
| *const (null)
| Assertion failed at /deps/zig/src/stage1/ir.cpp:10667 in const_ptr_pointee. This is a bug in the Zig compiler.
Unable to dump stack trace: debug info stripped
Compiler returned: 255

A simpler language means a simpler compiler.

When I first drafted this, both struct-friendly @Type and an equivalent to std.meta.Tuple were proposed, but those are now in the language! :smile:

ikskuh commented 3 years ago

Just one question: Isn't it the case that tuples are already just structs that have TypeInfo.Struct.is_tuple set to true?

rohlem commented 3 years ago

I'm all for unification. Note that tuples allow ++ and **, and are proposed to allow slicing in #4625. All of these are only specified for indexed fields, so to me it would seem least surprising to make structs with regular fields be non-tuples.

bb010g commented 3 years ago

Update after some discussion on the Zig Discord guild yesterday (thanks @MasterQ32!):

Given all those restrictions (implemented in this compiler yet or not), I don't think struct tuples are the right idea anymore. Instead, tuples should be comprised of a TypeInfo variant like the following:

pub const TupleField = struct {
    field_type: type,
    default_value: anytype,
    is_comptime: bool,
    alignment: comptime_int,
};

pub const Tuple = struct {
    len: comptime_int,
    fields: []const TupleField,
};

TypeInfo.Struct.is_tuple is removed. For all tuples tup, tup.len == @typeInfo(@TypeOf(tup)).Tuple.len. TypeInfo.TupleField mirrors TypeInfo.StructField as much as necessary, without struct-specific information such as field name included. Tuple fields are stored according to their index.

Named field access is dropped; tup[0] is the only way to access the first element of a tuple now, instead of both tup[0] & tup.@"0".

Proposed syntax struct { u32, []const u8 } was brought up as a prospective replacement for std.meta.Tuple(&[_]type{ u32, []const u8 }). I'm not sure if keeping the struct naming would be desirable, so I'll use tuple { u32, []const u8 } syntax for tuple types for now instead.
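A hedged sketch of the two spellings side by side, assuming a Zig release whose compiler accepts a struct declaration that lists only field types, and whose standard library still provides std.meta.Tuple. The tuple { ... } keyword form used in this comment is hypothetical and is not shown. The names Pair, MetaPair, pair and meta_pair are illustrative only.

```zig
const std = @import("std");

// Assumption: the compiler accepts a struct body made of bare field types.
const Pair = struct { u32, []const u8 };

// The library helper the syntax above is meant to replace.
const MetaPair = std.meta.Tuple(&[_]type{ u32, []const u8 });

const pair: Pair = .{ 7, "seven" };
const meta_pair: MetaPair = .{ 7, "seven" };

test "both tuple type spellings accept the same literal" {
    try std.testing.expect(pair[0] == 7);
    try std.testing.expectEqualStrings("seven", pair[1]);
    try std.testing.expect(meta_pair[0] == 7);
    try std.testing.expectEqualStrings("seven", meta_pair[1]);
}
```

Whether the two declarations denote the same type is deliberately not asserted here, since that may differ between compiler versions.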

The unit tuple (0-tuple), tuple{}, should have the same type between all instances and be written .{} as an anonymous tuple literal. tuple{} is specially coercible to empty structs in the same way the current anonymous struct literal .{} is. For all other tuples, no coercion to structs occurs and coercion to arrays continues working as it currently works.

In terms of literal syntax, the current syntax should continue to work fine. .{ false } can only be a tuple, as it has no =, and .{ .foo = false } can only be a struct, as it has =. The sole ambiguous case is .{ }, which is solved by always meaning the value inhabiting tuple{} and always being coercible to an empty struct. (If this coercion isn't enough, alternative tuple syntax could be adopted.)
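The three literal forms can be sketched in existing Zig as follows. The struct types Point and Empty are hypothetical and exist only so that .{} has something to coerce to; the proposed tuple{} type itself is not modeled.

```zig
const std = @import("std");

// Hypothetical struct types used only for this illustration.
const Point = struct { x: i32 = 0, y: i32 = 0 };
const Empty = struct {};

// Has `=`, so it can only be a struct literal.
const s: Point = .{ .x = 1 };

// Has no `=`, so it can only be a tuple literal.
const t = .{ 1, 2, 3 };

// The ambiguous empty literal coerces to struct types whose fields
// all have defaults, including the empty struct.
const o_point: Point = .{};
const o_empty: Empty = .{};

test "literal forms are told apart by the presence of =" {
    try std.testing.expect(s.x == 1 and s.y == 0);
    try std.testing.expect(t.len == 3 and t[2] == 3);
    try std.testing.expect(o_point.x == 0 and o_point.y == 0);
    _ = o_empty;
}
```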

I'm not tied to tuples being structs, but Zig should either embrace tuples being structs or let them be free as their own type. A lot of unnatural rules have to be enforced for .is_tuple = true structs compared to the rest of Zig's types, and I think that's a sign that our current design is wrong. Tuples being top-level types similar (but not equivalent) to arrays solves this code smell.

ikskuh commented 3 years ago

I like this (second) proposal a lot.

Important note: We still need anonymous struct literals to allow initialization of structs like this:

var x: X = .{ .x = 0 };

We also need a way to allow fully anonymous literals to allow an improved formatting API:

std.debug.print("{a} {b}", .{
  .a = 10,
  .b = 20,
});

bb010g commented 3 years ago

@MasterQ32 What I'm proposing is you keep anonymous struct literals and anonymous tuple literals the same.

Currently, anonymous tuple literals (anonymous list literals) have syntax without equal signs, and anonymous struct literals have syntax with equal signs:

const s = .{.a = 1, .b = 2, .c = 3};
const t = .{1, 2, 3};
const o = .{};

Does the anonymous struct syntax need to change to make both s & t work? Also, for o (which is ambiguous), if you let that be the single value that inhabits the type tuple{} then you can make only that value coerce to any struct for which the current anonymous struct literal .{} also works. (This is covered in the second proposal.)

htqx commented 1 year ago

var tuple = .{ 0,1, .[2]=2, .[3] = 3, 4}; print("{}\n", .{tuple}); // print .{1,2,3,4}

.[2] => tuple[2] like.

.@"2" != 2 ==> char "2" .@"\x02" != 2 ==> *[1]u8 (2)

So the new .[n] syntax is required

bb010g commented 1 year ago

@htqx Could you elaborate more on what you're trying to say? I don't follow your examples.