ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.74k stars, 2.48k forks

get rid of the `.` in tuples & anonymous struct literal syntax #5039

Open andrewrk opened 4 years ago

andrewrk commented 4 years ago

I'm not sure if it's possible to do this, but here's an issue at least to document that we tried.

I believe the current problem is that when the parser sees

{ expression

It still doesn't know whether this is a block or a tuple. But the next token would be either `,` (making it a tuple), `;` (making it a block), or `}` (making it a tuple). Maybe that's OK.

Then tuples & anonymous struct literals will be less noisy syntactically and be easier to type.
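To make the lookahead rule above concrete, here is a minimal Python sketch. It is a toy model, not the Zig parser: `classify_brace` is a hypothetical name, and the sketch assumes the parser has already consumed `{ expression` and only needs to inspect the next token. It ignores struct literals such as `{ .a = x }` and the empty `{}` case discussed later in the thread.

```python
def classify_brace(next_token: str) -> str:
    """Classify `{ expression <next_token>` under the proposed dot-less syntax.

    Hypothetical helper for illustration only.
    """
    if next_token == ";":
        # `{ expr;` can only start a block.
        return "block"
    if next_token in (",", "}"):
        # `{ expr,` and `{ expr }` would both be tuples.
        return "tuple"
    raise SyntaxError(f"unexpected token {next_token!r} after expression")


for tok in (",", ";", "}"):
    print(tok, "->", classify_brace(tok))
```

Under this rule a single token decides the interpretation, which is also why a forgotten semicolon silently turns a one-statement block into a one-element tuple.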

SpexGuy commented 4 years ago

This would be nicer to type, but the ambiguity with blocks could lead to some unintuitive compile errors. Consider this example:

fn voidFn() void {}

export fn usefulError(foo: bool) void {
    if (foo) {
        voidFn() // forgot semicolon here
    } else {
//  ^ current compile error: expected token ';', found '}' on first token after forgotten semicolon
        voidFn();
    }
}

Here I've forgotten the semicolon on the expression in the if statement, and the compiler gives an error at the first token after the missing semicolon. But with this proposal, the missing semicolon would cause the block to be interpreted as a tuple, equivalent to this code:

fn voidFn() void {}

export fn weirdError(foo: bool) void {
    if (foo) .{
//  ^ error: incompatible types: 'struct:42:15' and 'void'
//  ^ note: type 'struct:42:15' here
        voidFn() // forgot semicolon here
    } else {
//         ^ note: type 'void' here
        voidFn();
    }
}

For a forgotten semicolon, this is a very confusing error to get, especially for someone new to the language who hasn't learned anonymous struct syntax yet. This might be a solvable problem by identifying cases where this parsing is close to ambiguous and improving the error messages that result, but the two interpretations are different enough that it might be difficult to give an error message that is meaningful for both.

SpexGuy commented 4 years ago

Oh, I think there's also an ambiguous case. These two statements are both valid but mean very different things:

fn voidErrorFn() error{MaybeError}!void {}
comptime {
    var x = voidErrorFn() catch .{};
    var y = voidErrorFn() catch {};
}

andrewrk commented 4 years ago

Good point on the ambiguous case, and good explanation in the other comment.

For the ambiguous case I'd be willing to change `{}` to mean "tuple of 0 values / struct with 0 fields" and find some other way to specify the void value. `@as(void, undefined)` already works, but is a bit scary. Maybe `void{}` if #5038 is rejected.

jakwings commented 4 years ago

Mixing blocks and tuples & structs is the very reason I don't like this proposal. I often forget the brackets `.{ }` for printf functions, but I have never forgotten the little dot. What new features are blocked by this proposal?

andrewrk commented 4 years ago

> What new features are blocked by this proposal?

nothing - this is pure syntax bikeshedding :bicyclist:

SpexGuy commented 4 years ago

> I'd be willing to change `{}` to mean "tuple of 0 values / struct with 0 fields" and find some other way to specify the void value.

Hmm, that would definitely work, but it also might have the side effect of unexpectedly converting

fn voidErrorFn() error{MaybeError}!void {}
export fn foo() void {
    voidErrorFn() catch {
        //commentedOutCall();
    };
}

to

fn voidErrorFn() error{MaybeError}!void {}
export fn foo() void {
    voidErrorFn() catch .{};
}

when the programmer tries to test commenting out that line.

This could potentially be solved, but we would have to go down kind of a strange road. Zig currently makes a distinction for many types of blocks between expressions and statements. The `if` in `x = if (foo) { } else { };` is an expression. As a result, it requires a semicolon. But `if (foo) { } else { }` is not an expression but a full statement, so it does not require a semicolon. If we changed `<expression_evaluating to !void> catch { code block }` to a full statement, so it wouldn't require a semicolon, we could then define `{}` to mean empty block in a statement context and empty struct in an expression context. This would be kind of a weird facet of the language to play into, but it would fix all of the problems I've brought up and replace them with a single weird but consistent thing. In most of the cases I've thought through, this definition can produce a useful error message if you accidentally get the wrong interpretation of `{}`. But this could still end up being a footgun in comptime code.

mogud commented 4 years ago

Maybe we can change block syntax, #4412

andrewrk commented 4 years ago

OK, I'm convinced to close this in favor of status quo.

ghost commented 4 years ago

Another option - maybe already discussed, I couldn't find any discussion about it yet - is to use [] for the anonymous list literals (and tuples).

jakwings commented 4 years ago

Inspired by ({}) in JavaScript and (0,) in Rust:

Idea No.1:

{}              // empty block -> void
({})            // still empty block -> void,
                // because a block is also an expression instead of a statement
{,}             // empty tuple/struct
{a} {a,b} {...} // tuple with N elements
{.a = x, ...}   // struct
{...;}          // block -> void
label: {...}    // block

const a = [_]u8{1, 2, 3};
const a: [3]u8 = {1, 2, 3};
const a = @as([3]u8, {1, 2, 3});
const a = [_]u8{};  // error?
const a = [_]u8{,};
const a = @as([0]u8, {});  // error!
const a = @as([0]u8, {,});

fn multiReturn() anyerror!(A, B) { ... return {a, b}; ... }

Idea No.2:

()              // empty tuple
(,)             // empty tuple
(,a)            // tuple with 1 element
(a,)            // tuple with 1 element
(a, b, ...)     // tuple with N elements

{}              // empty block (asymmetric to tuple syntax)
{,}             // empty struct
{, ...}         // struct
{...}           // block/struct
label: {...}    // block
({})            // still empty block -> void

const a: [_]u8 = (1, 2, 3);
const a = @as([3]u8, (1, 2, 3));

const a: T = {};  // error if T is not void?
const a: T = {,};
const a = @as(T, {});  // error if T is not void?
const a = @as(T, {,});

const a = [_]u8.(1, 2, 3);
const a = T.{};
↑ Rejected :P https://github.com/ziglang/zig/issues/760#issuecomment-430009129

const a = [_]u8{1, 2, 3};
const a = T{};
↑ Don't worry? https://github.com/ziglang/zig/issues/5038

fn multiReturn() anyerror!(A, B) { ... return (a, b); ... }

SpexGuy commented 4 years ago

A really cool but probably bad option is to define the single value of void as the empty struct. This would remove the ambiguity since it doesn't matter if {} is a block or an empty struct, they both evaluate to the value of void! But this could cause a lot of other weirdness, like var x: Struct = voidFn(); causing x to be default-initialized, so it's probably not something we should do.

mogud commented 4 years ago

I suggest reconsidering this proposal.

As @andrewrk noted in the 0.6.0 release notes, 0.7.x may be the last chance to make bigger changes to Zig to make the language more elegant, intuitive, and simple. There are a lot of proposals related to this:

#4294 always requires brackets, so it reduces the use cases for blocks and resolves issue #1347. It is also related to #5042 and #1659, which could remove the () used in if/while/for/switch.

#4412 adds a new keyword seqblk for labeled blocks; maybe blocks should use this kind of syntax.

#4170 shows that, in order to keep consistency, the anonymous function literal has a weird syntax.

#5038 removes T{}; it is also related to #4847, array default initialization syntax.

#4661 removes bare catch; it is related to requiring brackets and to the examples below.

IMO we should take all these things into account. Here's the syntax I would ideally prefer:

// named block
const x = block blk {
    var a = 0;
    break :blk a;
};

block outer while {
    while {
        continue :outer;
    }
}

// unnamed block
block {
    var a = 0;
    assert(a+1 == 1);
}

// switch
const b = switch {
    a > 0 => 1,
    else => 0,
}

const b = switch var a = calc() {
    a == 0 || a == 1 => 0,
    else => 1,
};

switch var a = calc(); a {
    .success => block {
        warn("{}", {a});
    },
    else => @panic(""),
}

// while
var i: usize = 0;
while {
    i += 1;
    if i < 10 { continue; }
    break;
}
assert(i == 10);

var i: usize = 1;
var j: usize = 1;
while i * j < 2000; i *= 2, j *= 3 {
    const my_ij = i * j;
    assert(my_ij < 2000);
}

const x = while var i = 0; i < end ; i += 1 {
    if i == number { break true; }
} else { false }

while var i = 0; getOptions(i) => v; i += 1 {
    sum += v;
} else {
    warn("", {});
}

// for
for seq => v {
    warn("{}", {v});
}

// if
if var a = 0; a != b {
    assert(true);
} else if a == 9 {
    unreachable;
} else {
    unreachable;
}

const a = if b { c } else { d };

const x: [_]u8 = if a => value {
    {value, 1, 2, 3, 4}
} else block blk {
    warn("default init", {});
    break :blk {0, 1, 2, 3, 4};
}

// error
const number = parseU64(str, 10) catch { unreachable };

const number = parseU64(str, 10) catch { 13 };

const number = parseU64(str, 10) catch block blk {
    warn("", {});
    break :blk 13;
};

const number = parseU64(str, 10) catch err switch err {
    else => 13,
};

fn foo(str: []u8) !void {
    const number = parseU64(str, 10) catch err { return err };
}

if parseU64(str, 10) => number {
    doSomethingWithNumber(number);
} else err switch err {
    error.Overflow => block {
        // handle overflow...
    },
    error.InvalidChar => unreachable,
}

errdefer warn("got err", {});

errdefer err if errFormater => f {
    warn("got err: {}", {f(err)});
}

// tuple & anonymous literals
var array: [_:0]u8 = {11, 22, 33, 44};

const mat4x4: [_][_]f32 = {
    { 1.0, 0.0, 0.0, 0.0 },
    { 0.0, 1.0, 0.0, 1.0 },
    { 0.0, 0.0, 1.0, 0.0 },
    { 0.0, 0.0, 0.0, 1.0 },
};

var obj: Object = {
    .x = 13,
    .y = 67,
    .item = { 1001, 1002, 1003 },
    .baseProp = { .hp = 100, .mp = 0 },
};

jakwings commented 4 years ago

@SpexGuy Similar option: syntactically {} --> block/struct/tuple but void value -/-> block/struct/tuple? something else to consider...

fn baz() void {}  // ok

// expect return_type but found anonymous struct/list literal {}
fn baz() {} {}

// foo({}) --> bar is always a struct and never a void value?
// foo(@as(void, undefined)) to the rescue
fn foo(bar: var) var { ... }

// any other place to disallow empty blocks?
printf("{}", {})  // error: Too few arguments

@mogud I haven't read all those issues but the design looks quite messy...

while{}  // instead of while true{} or loop{} (yet another keyword!)

// how about "switch true {...}"?
// it duplicates the function of if/else
const b = switch {
    a > 0 => 1,
    else => 0,
}

// mind-blown by the use of ";"
switch var a = calc(); a {...}
if var a = 0; a != b {...}
while i * j < 2000; i *= 2, j *= 3 {...}
while var i = 0; i < end ; i += 1 {...}

// different use of "=>" from "switch"
if parseU64(str, 10) => number {...}
for seq => v {...}
while var i = 0; getOptions(i) => v; i += 1 {
    sum += v
} else {
    // why no semicolon?
    warn("", {})
}

const x: [_]u8 = if a => value {
    // disallow multiple statements?
    {value, 1, 2, 3, 4}
} else block blk {
    warn("default init", {});
    // can it be just {0,1,2,3,4} (without semicolon)?
    break :blk {0, 1, 2, 3, 4};
}

const x = while var i = 0; i < end ; i += 1 {
    if i == number { break true; }
} else false;  // why no brackets?

mogud commented 4 years ago

@iology sorry for those typos, I've edited the post.

const x: [_]u8 = if a => value {
    // disallow multiple statements?
    //      -> no, this is, and can only be, a single expression
    {value, 1, 2, 3, 4}
} else block blk {
    warn("default init", {});
    // can it be just {0,1,2,3,4} (without semicolon)?
    //      -> no, only catch/if/else used as expression can have a single expression within `{}`.
    break :blk {0, 1, 2, 3, 4};
}

andrewrk commented 4 years ago

@mogud I truly do appreciate what you're doing here, but I don't think the syntax is going to go that direction.

mogud commented 4 years ago

That's ok, zig itself is more important for me. :)

mlawren commented 3 years ago

As a newcomer to Zig I would also encourage not dropping this issue (syntax of tuples - or anonymous structs?). I have spent many hours being confused by the current syntax (it seems kind of unique to zig, and not in a good way) and trying to find out how to loop over const x = .{1, "string", void} or access specific elements (keep wanting to write x[1]).

andrewrk commented 1 year ago

Let us re-evaluate this in light of #14523.

mlugg commented 1 year ago

My opinion on this is that the status quo syntax should remain. While I'm very glad that ZON as a concept has made it in, I strongly believe the Zig syntax should be tailored first and foremost to the source language, provided it doesn't cause significant usability issues in ZON. In this case, I believe that holds; the leading . doesn't actually make ZON any harder to write or read (the single . keystroke is very insignificant, and in terms of reading at least I visually parse .{ as one thing, so it doesn't hinder me at all). In terms of the language, I find the arguments given previously apply.

Blocks and struct literals are fundamentally quite different, and it doesn't make sense to me to unify them. For instance, in self-hosted, we specifically removed support for .{} to initialize a void value (which was a bug in stage1), which makes sense because they represent quite different things! .{} isn't a "catch all initializer", it's specific to aggregates, so it doesn't really make sense for it to also initialize void values. Going the other way, if we tried to simply get rid of the . from the current syntax, empty blocks and empty initializers both become {}, which in my eyes is even worse because this looks (at least to me) more like a block than an initializer (despite the latter likely being a far more common use).

Another problematic case is singleton tuples. These are used quite frequently, for instance in many uses of std.fmt, but just removing the leading . would make them look very similar to single-statement blocks ({func()} vs {func();}). The difference between these expressions is quite subtle, which in my eyes is a fairly confusing property; you don't know if what you're reading is a block or a tuple until you reach the end of the first expression/statement. It's true that the usage is normally obvious from context, but that only makes it more confusing in any rare cases where it's not.

The only way I would be in support of a syntax change here would be if instead of simply dropping the leading ., we replaced the initializer syntax with something else entirely, such as [ .x = 3, .y = 4 ] (therefore making [] an empty initializer). I'm not sure if there are any problematic cases in the grammar, and I wouldn't explicitly support such a change (I happen to quite like the current .{ } syntax, since it makes sense for me to replace the type name with a single token to represent "infer this"), but that solution feels to me much more favorable than outright removing information that helps both the parser and humans reading code to quickly understand what they're looking at.

kuon commented 1 year ago

I fully agree with @mlugg. In the past few weeks, I have been "teaching" Zig to co-workers, who all found the `.{` syntax very weird at first, but once they understood that it means `current context.{` it "clicked" and it seems a well-thought-out concept. It is in line with `switch(foo) { .bar` kind of statements, and the idea that "the dot with nothing before it" is the "current thing of whatever makes sense" is quite natural.

I mention that because we often have the comment that this syntax is weird and could make the language adoption harder, but it is not weird, it is uncommon, but I really think it is the right syntax.

Also as it was pointed out, there are edge cases like tuples which could turn this into something complicated to implement.

Finally, I'll add that I write quite a lot of Elixir, where the syntax for maps (which are used for structs) is `%{...}`, and this allows for syntax highlighting and ligatures, which can both help differentiate them from tuples (`{}` is a tuple in Elixir). `.{` could someday make it into some ligature nerd font, and it can be highlighted in another color today! (for those who like colors)

deflock commented 1 year ago

It seems that if you have experience in zig already then everything looks fine for you, but as a newcomer I need to say these random thoughts:

  1. Syntax .{} is ugly aesthetically, not that much as <> for generics tho :)
  2. Dots for struct-fields annoy, too much noise .{ .name = "Sam", .age = 11 }
  3. In JS there is a problem returning an object from arrow functions and it's solved by wrapping in ( ... ):

    somefn(() => { name, age }); // doesn't work
    somefn(() => ({ name, age })); // works

  4. In JS there is a shorthand for objects. I remember how weird it looked to me when it was introduced, but after all it's really handy.

    const name = "Sam";
    const age = 11;
    const obj = { name: name, age: age };
    const obj2 = { name, age };

  5. [ ... ] literals for array-like structures are also cuter than { ... }

Phew, I'm sure this was already discussed a million times but I just needed to say all this somewhere 🤣

shanoaice commented 1 year ago

My humble opinion is that the outermost dot and brackets .{} are fine, but I agree with @deflock's second point. The dot before the outermost bracket should be enough to distinguish an anonymous struct from a block, and the dots before struct fields are redundant.

So1aric commented 1 year ago

agree with @shanoaice.

although it might be a little off topic, i wonder if we could get rid of the . in dereferencing and unwrapping optionals as well. for dereferencing (.*), it may lead to misunderstanding (with ** maybe? but it could be made clear with spaces).

mlugg commented 1 year ago

I heavily dislike that syntax idea; I happen to find the analogy with field access quite nice, especially for optionals where the payload more-or-less is a field. But more to the point, even without the issue of `**`, that's ambiguous; is `x*-3` multiplying `x` by the integer literal `-3` (`x * -3`), or is it dereferencing `x` and subtracting 3 (`x* - 3`)? Spacing can't clear this one up - both operators are single characters.
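The spacing argument can be illustrated with a small Python sketch. It assumes a whitespace-insensitive tokenizer and a hypothetical postfix `*` dereference operator; the regular expression and the `tokenize` name are made up for the example and are not taken from the Zig tokenizer.

```python
import re

# Identifiers, integer literals, and the two single-character operators.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[*\-]")


def tokenize(src: str) -> list[str]:
    """Split source text into tokens, discarding whitespace."""
    return TOKEN.findall(src)


for src in ("x*-3", "x * -3", "x* - 3"):
    print(repr(src), "->", tokenize(src))
```

All three spellings produce the same token stream, so the tokens alone cannot tell the parser whether `*` is a binary multiply followed by a negated literal or a postfix dereference followed by a subtraction.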

RaphiSpoerri commented 1 year ago

> A really cool but probably bad option is to define the single value of void as the empty struct. This would remove the ambiguity since it doesn't matter if {} is a block or an empty struct, they both evaluate to the value of void! But this could cause a lot of other weirdness, like var x: Struct = voidFn(); causing x to be default-initialized, so it's probably not something we should do.

I actually like that idea. What if we restrict it to coercion of {} literals to the type void? In other words, {} can coerce to

[_]T
// or
void

However, without any type coercion, the type of {} will default to void. Thus,

stdout.writer().print(
    "No ambiguity?\n",
    {} // coerced to array
) catch {}; // void

fn func(array: [0]u8, x: void) void {
    // …
}
const a: [_]u8 = {}; // coerced to array
const b: void = {}; // void
const c = {}; // void

// fn(array, void)
func(a, b);
func({}, b);
func(a, {});
func({}, {});

switch (a.len) {
    0 => {}, // void
    else => unreachable,
}

RaphiSpoerri commented 1 year ago

Alternatively, the big cases where {} should definitely be parsed as a block are:

I think that covers about 99.9% of the cases. We could make {} a block in these cases, and an empty structure/union/array otherwise.

iacore commented 1 year ago

The current syntax is a lot clearer to read.

OSuwaidi commented 5 months ago

If we represent tuples as having at least one comma, `{,}` (empty tuple, without the `.` prefix), and `{}` as an empty block -> void, then that solves both of @SpexGuy's initial edge cases.

That representation also aligns with how Python defines its tuples: `(1)` is an `int` while `(1,)` is a `tuple` (for a tuple to be recognized as such, it must contain at least one comma; a collection of comma-separated values).
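The Python behaviour referred to here can be checked directly. Note one difference from the proposed `{,}`: Python spells the empty tuple `()`, with no comma, and a bare `(,)` is a syntax error. The variable names below are only for illustration.

```python
parenthesized = (1)   # just the integer 1 in parentheses
singleton = (1,)      # the trailing comma makes this a one-element tuple
empty = ()            # the empty tuple needs no comma

print(type(parenthesized).__name__, type(singleton).__name__, len(empty))

# A comma-only form is rejected by the Python parser.
try:
    compile("(,)", "<example>", "eval")
    comma_only_ok = True
except SyntaxError:
    comma_only_ok = False
print("(,) accepted:", comma_only_ok)
```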

Currently, `.{1,}` and `.{1}` are equivalent. The above representation is also consistent and aligns well when letting Zig infer the size of the tuples: `[_]u8{1, 2, ..., n}`, where it's not prefixed with a `.`.

And @deflock's 2nd point is quite valid.

RaphiSpoerri commented 5 months ago

@OSuwaidi
{,} is ugly.

OSuwaidi commented 5 months ago

@RaphiSpoerri Well, at least it clears up both ambiguities, is consistent, and people find .{} confusing/unconventional as well.

I'm sure Andrew initiated this issue in the first place for a reason.

burdiyan commented 4 months ago

As someone who's trying to learn Zig coming from more than a decade of writing Go, I really like Zig as a language for its pragmatism, clarity, and relative simplicity, especially compared to Rust. But as with Rust, I find the syntax and the "ergonomics" to be somewhat weird.

The leading dot thing just hurts my eyes every time I look at it :) I understand the problem, and the reasons for having it, and it's probably something you get used to over time, but it just goes against my personal sense of beauty. I could probably live with .{}, but the leading dot in field names seems redundant, and quite annoying. The most annoying part about the leading dot for a struct literal is when you pass it to a function invocation, like greet(.{.name = "Alice"}), the (.{ part looks especially uncomfortable.

I also dislike the mandatory semicolons, and the mandatory parentheses for if and other flow control constructs, but I could live with that :) Although it seems like making braces mandatory for flow control blocks could make parens optional?

This is of course very biased, but I wish new languages took more inspiration from Go (more than just gofmt 😁), because over time you get to appreciate nuances that you wouldn't notice by just looking at the language. Of course it has its own quirks and flaws, and lots of them, but in terms of syntax I find it very satisfying to write, and probably more important, to read.

16hournaps commented 4 months ago

> As someone who's trying to learn Zig coming from more than a decade of writing Go, I really like Zig as a language for its pragmatism, clarity, and relative simplicity, especially compared to Rust. But as with Rust, I find the syntax and the "ergonomics" to be somewhat weird.
>
> The leading dot thing just hurts my eyes every time I look at it :) I understand the problem, and the reasons for having it, and it's probably something you get used to over time, but it just goes against my personal sense of beauty. I could probably live with .{}, but the leading dot in field names seems redundant, and quite annoying.
>
> I also dislike the mandatory semicolons, and the mandatory parentheses for if and other flow control constructs, but I could live with that :) Although it seems like making braces mandatory after flow control constructs could let you make parens optional?
>
> This is of course very biased, but I wish new languages took more inspiration from Go (more than just gofmt 😁), because over time you get to appreciate nuances that you wouldn't notice by just looking at the language. Of course it has its own quirks and flaws, and lots of them, but in terms of syntax I find it very satisfying to read and write.

It is very understandable, and I held the same opinion as you. It goes away after 10kloc. After 10kloc of Zig you will go to GitHub issues and write the same reply that I am writing right now, defending the syntax. Though maybe the same will happen after they remove it :D