ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Remove anonymous struct types from the language #16865

Open mlugg opened 10 months ago

mlugg commented 10 months ago

This proposal is extracted from #16512, with more details and justification.

Background

Zig currently has the concept of an "anonymous struct type". This is a type which comes from an anonymous struct literal (.{ ... }) with no known result type. These types are special: they allow coercions based on structural equivalence which normal structs do not allow. For instance:

const anon = .{ .x = 123 };
const S = struct { x: u32 };
const s: S = anon;
_ = s;

This works because anon has an anonymous struct type, and all of its field types (in this case comptime x: comptime_int = 123) are coercible to those of S, so @TypeOf(anon) coerces to S field-wise. Anonymous struct types also allow even stranger coercions, such as allowing these coercions through pointers by creating new constants (e.g. *const @TypeOf(anon) coerces to *const S).

Justification

I'm not entirely sure why anonymous struct types exist. My guess is that they originated before RLS (Result Location Semantics), as the method for anonymous initializers to initialize concrete types. In that world, the concept makes sense, but today - with RLS - untyped anonymous initializers are virtually never used. Retaining anonymous struct types significantly complicates the language:

To pick up on the last point in particular: the only case where anonymous struct types are really used today is when writing code such as the above example. This kind of code would really benefit from a type annotation: it's unclear what anon is meant to be! Beginners sometimes write this kind of code expecting Zig to use the information from the later lines in type inference (inferring that anon should have type S): but anonymous struct types actually mask the issue here, potentially making code "work" whilst being harder to read, slower, and potentially buggier.

Lastly, time to quantify a statement I made a moment ago:

...untyped anonymous initializers are virtually never used.

I looked at the ZIR for a few random files of real Zig code, and noted the following things:

| Source | Total Struct Inits | Anonymous Struct Inits | Would Remain With Better RLS |
|---|---|---|---|
| Sema.zig | 1199 | 4 | 0 |
| std/array_list.zig | 29 | 1 | 1 (but removing improves code!) |
| std/mem.zig | 47 | 1 | 0 |
| std/Build.zig | 38 | 0 | 0 |
| Bun: js_parser.zig | 814 | 0 | 0 |
| Bun: js_ast.zig | 278 | 0 | 0 |

You can see from these numbers that untyped inits rarely happen, and when they do, the proposed RLS improvements would eliminate them. Note that if you try, you can find some files which do genuinely use a lot of anonymous inits right now - for instance arch/x86_64/Lower.zig in the compiler has 107 at the time of writing - but as far as I can tell from a quick glance every single one of those would be eliminated by #16512. That proposal can essentially be considered a prerequisite of this one.

Proposal

Eliminate anonymous struct types from the language. Untyped struct initializations are still permitted - they are useful for metaprogramming (e.g. std.Build.dependency's args parameter) - but they return a "standard" struct type, with no extra allowed coercions etc.

There's not much else to say. This is a proposal to remove an unnecessary concept from Zig: simplifying the language, encouraging code readability, and making us less prone to issues such as #16862.
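As a rough sketch of what code could look like under this proposal (the names `S`, `readX`, `typed`, and `fromUntyped` are made up for illustration and are not from the compiler or std): a literal with a known result type initializes that type directly, an untyped literal can still be passed to an `anytype` parameter, and where a struct-to-struct coercion was previously relied upon, the fields can be copied explicitly.

```zig
const std = @import("std");

const S = struct { x: u32 };

// Hypothetical helper standing in for any API that accepts `anytype`.
fn readX(args: anytype) u32 {
    return args.x;
}

fn typed() S {
    // The result type is known, so the literal initializes an S directly.
    const s: S = .{ .x = 123 };
    return s;
}

fn fromUntyped() S {
    // No result type: `anon` gets its own ad-hoc struct type.
    const anon = .{ .x = 123 };
    // Copy field by field instead of relying on a struct-to-struct coercion.
    return .{ .x = anon.x };
}

pub fn main() void {
    std.debug.print("{d} {d} {d}\n", .{ typed().x, fromUntyped().x, readX(.{ .x = 456 }) });
}
```

Neither function depends on the special coercion, so this should behave the same with or without anonymous struct types.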

rohlem commented 10 months ago

It introduces flawed coercions

The coercion is only flawed for pointers to those types, for which I agree they should be disallowed.

I also agree that simplifying the language by making them the same as regular struct-s without extra coercions is desirable, since I think the primary use cases should be solvable by a manual copy implemented using @typeInfo-reflection. If that ends up being too hairy / suboptimal, perhaps a builtin @construct taking an anytype initialization expression (similar to @apply taking the argument tuple of a function) would be useful. (I imagine that could also be used to implement @unionInit in userspace - or we keep them as separate @unionInit and @structInit for clarity.)
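The manual copy mentioned above might be sketched roughly as follows. `construct` here is a hypothetical userspace function, not an existing builtin, and this version assumes every field of `T` is present in `init` (it ignores default field values).

```zig
const std = @import("std");

const S = struct { x: u32, y: u64 };

// Hypothetical userspace helper: builds a T by copying each field from `init`.
// Assumes `init` has a field for every field of T; defaults are not handled.
fn construct(comptime T: type, init: anytype) T {
    var result: T = undefined;
    inline for (std.meta.fields(T)) |field| {
        // Each assignment coerces the source field to the destination field type.
        @field(result, field.name) = @field(init, field.name);
    }
    return result;
}

pub fn main() void {
    const anon = .{ .x = 1, .y = 2 };
    const s = construct(S, anon);
    std.debug.print("{d} {d}\n", .{ s.x, s.y });
}
```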

Jarred-Sumner commented 10 months ago

Would this mean logging with named parameters now needs an explicit type?

Current:

std.debug.print("Hello {[name]s}. Welcome to {[project]s}.", .{ .name = "John", .project = "Boop" });

After:

const DoesThisMeanIHaveToWriteThisOutWhenLoggingNow = struct { name: []const u8, project: []const u8 };

std.debug.print("Hello {[name]s}. Welcome to {[project]s}.", DoesThisMeanIHaveToWriteThisOutWhenLoggingNow{ .name = "John", .project = "Boop" });

What are the implications for zon, which seems to rely a lot on anonymous types?

mlugg commented 10 months ago

No, it would not, as mentioned in the proposal:

Untyped struct initializations are still permitted - they are useful for metaprogramming (e.g. std.Build.dependency's args parameter)

The exact same logic applies to the args parameter of std.fmt.format. Pretty much the only thing this proposal changes about the language is disallowing certain coercions.
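To make that concrete, here is a small sketch of named-placeholder formatting with an untyped literal. The `greet` helper is made up for illustration; it uses `std.fmt.bufPrint` so the result can be inspected, but the same `args` shape applies to `std.debug.print`.

```zig
const std = @import("std");

// Hypothetical helper: formats a greeting into a caller-provided buffer.
fn greet(buf: []u8) ![]u8 {
    // The args literal has no result type; it is passed through `anytype`
    // and only its field names and values are inspected.
    return std.fmt.bufPrint(buf, "Hello {[name]s}. Welcome to {[project]s}.", .{
        .name = "John",
        .project = "Boop",
    });
}

pub fn main() !void {
    var buf: [64]u8 = undefined;
    const msg = try greet(&buf);
    std.debug.print("{s}\n", .{msg});
}
```

No struct-to-struct coercion is involved here, which is why this kind of call is unaffected.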

mpfaff commented 10 months ago

Pretty much the only thing this proposal changes about the language is disallowing certain coercions.

The title suggests you are proposing to "remove anonymous struct types" from the language. If that is not an accurate summary of the proposal, should it be changed to reflect that?

mlugg commented 10 months ago

The title suggests you are proposing to "remove anonymous struct types" from the language. If that is not an accurate summary of the proposal, should it be changed to reflect that?

That is an accurate summary. The only user-facing change to the language caused by removing these types is disallowing certain coercions that are currently allowed, because said coercions are the only difference between "standard" and anonymous struct types.

rohlem commented 10 months ago

I think the source of confusion (and what tripped me up initially as well) is that "anonymous struct literal syntax" .{.a = 3} stays, but instead of being of an ad-hoc created unnamed "anonymous struct type" it is now of an ad-hoc created unnamed "regular struct type".

As "anonymous" has (afaik) never been well-defined, it can be misinterpreted to mean ad-hoc created, unnamed, or both - in this case it instead means "special coercion rules" that have no real connection to the abstract concept of anonymity afaict.

An alternative formulation would be "remove special coercions from the type of untyped struct literals" because that's the minimal user-facing impact of the change. IMO we should settle on a name like untyped, deduced-type, ad-hoc, or something not related to identification/naming (as in Zig types are distinct => identifiable, and values including types are unnamed until they're assigned to a named location), and consistently use that word for all similar features (in langref, code, discussion, etc.) to avoid confusion going forward.

VortexCoyote commented 9 months ago

pardon if i have misunderstood, but does this mean that the syntax for assigning variables with a struct literal will disappear as well? will this affect anonymous tuples as well?

so for instance:

const extra_args = .{ 32, "some text" };
std.log.info("{any}, {s}, {any}, {s}", .{ 64, "some other text" } ++ extra_args);

being able to assign variables with anonymous structs/tuples for later parsing/deduction is a very nice feature, as it provides convenient declarative syntax for initializing recursive data structures (like a UI tree) or constructing types. and being able to declare anonymous tuples provides good reusability, as you can concatenate them to other anonymous tuples later on, as shown above.

mlugg commented 9 months ago

As mentioned in both this comment and this statement in the original issue:

Untyped struct initializations are still permitted - they are useful for metaprogramming...

...no, that syntax remains, and the example you give will continue to work.

This change does not affect tuples at all, because tuple types already work on structural equivalence, which is one of their defining features. In essence, all tuple types are already anonymous, and that won't change.

As I said above: the only user-facing language change you will see from anonymous struct types being removed is certain coercions no longer working. These coercions are the defining property of anonymous struct types, and the only difference between them and concrete struct types.

ikskuh commented 9 months ago

Had a bit of a panic, but after the clarification, i'm all in for the change. Luuk explained that feature to me once and it's really horrible and i think i only ever used it once with full intent.

So: Yeet that shit, make anonymous struct literals without result type just regular, implicitly declared struct values

andrewrk commented 9 months ago

I have the same question as @Jarred-Sumner - what about this example?

const foo: struct { ... } = @import("foo.zon");

mlugg commented 9 months ago

That's one example that would indeed cease to function without further work - the result of the @import would have a distinct struct type, and hence could not coerce to that struct type.

If we want that to work (which I do think makes sense), we could fix this in the compiler simply by making @import consider a result type. For .zig imports, we don't actually use this result type. However, the ZIR generated for a ZON import could take a result type from external code (which can just be generic poison if no result type is actually given), and construct a value of the relevant struct type. So, like with other anon structs, rather than relying on a deep coercion, we are instead relying on RLS to construct the value correctly in the first place.

mlugg commented 9 months ago

Just to note, I have thought of another case this will break:

const x = if (runtime_condition) Foo{ .x = 123 } else .{ .x = 456 };

Peer type resolution (PTR) will fail on these types, as the anonymous literal cannot coerce to Foo.

I don't think this is really a loss - this code is improved by annotating the type of x instead. I just thought I'd mention it.
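A minimal sketch of the annotated form (the `pick` wrapper and its parameter are only there to make the condition runtime-known): with a result type on `x`, both branches are initialized as `Foo` directly, so no peer type resolution between `Foo` and an ad-hoc struct type is needed.

```zig
const std = @import("std");

const Foo = struct { x: u32 };

// Hypothetical wrapper so that the condition is a runtime value.
fn pick(runtime_condition: bool) Foo {
    // The annotation gives both branches the result type Foo.
    const x: Foo = if (runtime_condition) Foo{ .x = 123 } else .{ .x = 456 };
    return x;
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ pick(true).x, pick(false).x });
}
```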

notcancername commented 4 months ago

By using @TypeOf and @compileLog, a programmer who is unaware of the distinction between anonymous struct types and regular struct types can be misled into concluding that structs with only comptime fields can coerce to other structs (see "Struct with comptime fields refuses to coerce to struct with runtime fields" on ziggit):

It seems like structs with only comptime struct fields can coerce to compatible structs:

@compileLog(@as(Rational(i32), .{.num = 6, .den = 9}));
@compileLog(@TypeOf(.{.num = 6, .den = 9}));
Compile Log Output:
@as(types.Rational(i32), .{.num = 6, .den = 9})
@as(type, struct{comptime num: comptime_int = 6, comptime den: comptime_int = 9})

Would these semantics be desirable? They would make anonymous struct initializations a special case of type coercion again, replacing anonymous structs with a less confusing, metaprogrammable alternative, and making the below examples equivalent again, and resolving some of the concerns expressed above:

const anon_struct_init: struct {foo: u8, bar: u8}  = .{.foo = 1, .bar = 2};
// Equivalent in status quo, wouldn't be in this issue, would again be with these semantics
const anon_struct = .{.foo = 1, .bar = 2};
const anon_struct_init: struct {foo: u8, bar: u8}  = anon_struct;

rohlem commented 4 months ago

@notcancername Are you suggesting special-casing structs which have only comptime fields, or structs which also have comptime fields? Note that status-quo anonymous struct types can have a mixture of comptime and run-time fields: initializers which are run-time-known, such as values from var or values depending on run-time branching, result in run-time fields. An anonymous struct type in status-quo doesn't have to hold any comptime fields at all, but it still coerces to other struct types.

notcancername commented 4 months ago

Wait, yeah, you're right, nevermind. I made the logical leap from "has comptime fields" to "coerces to structs" without considering init with runtime values. My bad.

billzez commented 3 months ago

So anonymous literals can be abused to create struct types that are structurally typed instead of nominally typed as follows:

const T = struct { x: u32, y: u64 };
var t: T = undefined;
t = t;
const T2 = @TypeOf(.{ .x = t.x, .y = t.y });

No comptime fields are needed. Now I can use non-anonymous initialization and I still get structural typing:

const t2 = T2{ .x = 1, .y = 2 };
t = t2;

Unfortunately @Type(@typeInfo(T2)) converts it back into a nominal type.

As I have code that uses this to create structural types, I'm hoping this doesn't get removed. In fact I would like it to be easier to create structural types:

const T2 = @Type(.{ .Struct = .{
  .fields = std.meta.fields(T),
  .is_structural = true,
}});