ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

syntax flaw: return type #760

Closed andrewrk closed 6 years ago

andrewrk commented 6 years ago

Fundamentally we have this ambiguity:

fn foo() A { }

Is this a function that returns A with an empty body? Or is this a function that returns the struct literal A{} and we're about to parse the body?

Another example

fn foo() error {}

Function that returns an error, or empty error set, and we're about to find the function body?

fn foo() error{}!void {
}

The above is not allowed because zig thinks the return value is error and the function has an empty body, and then is surprised by the !. So we have to do this:

fn foo() (error{}!void) {
}

Unfortunate. And the compile errors for forgetting to put a return type on a function point at the wrong place.

Let's come up with a better syntax for return types. All of these issues are potentially on the table:

The syntax change should be satisfying; let's make this the last time everybody has to update all their return type code.

Hejsil commented 6 years ago

I don't think we can have that example fail as simple cases like fn a() A.B {} and fn a() A().B {} should work.

My plan was to work on #208, but it seems we still haven't reached a conclusion on this (and this issue seemed to be related, so I switched my focus to this).

I'll probably work towards having the grammar up to date, as that will help with the stage2 parser rewrite when that is gonna happen. Then add some kind of test that ensures that it stays up to date (have the grammar as a bison parser, and run this parser on all Zig src code in this repo on every commit). This will also help when prototyping new syntax.

binary132 commented 6 years ago

@Hejsil, I really like this proposal (https://github.com/ziglang/zig/issues/760#issuecomment-430938743) a lot better than the one using dot to disambiguate. 😁

I don't love the anyerror name but I'm not quite clear on the semantics so I hesitate to comment. It seems kind of abstract and poorly-defined though. This is where Go uses a builtin interface type error that can easily be implemented by any type. Nested / chained errors are also a really common need and all of Go, Rust, C, and C++ have awkward tooling and conventions around that problem. Using a special name for a special error type to imply that it can contain other error values (?) just feels icky and non-obvious.

Is there a place for discussing the error naming or semantics? Please feel free to disregard, just my two cents as an outside observer. But maybe there is an even better solution here.

Hejsil commented 6 years ago

@binary132 docs and error set issue :)

allochi commented 6 years ago

@Hejsil thanks a lot for being so communicative and agile toward this issue.

I'm glad that we are getting to agree on a solution for this. I don't mind parentheses in if, for, ...; I admit I got used to omitting them in Go, for example, but I don't mind them in other languages, and since these are expressions in Zig it works well. Take for example this from the standard library:

const n = if (self.pos + bytes.len <= self.slice.len)
    bytes.len
  else
    self.slice.len - self.pos;

// -- vs --

const n = if self.pos + bytes.len <= self.slice.len
    bytes.len
  else
    self.slice.len - self.pos;

I don't know about the others, but with parentheses it seems to be easier to read.

Now, regarding the global error set: why name it anyerror? Why not errorset or errors? Wouldn't that be more descriptive of what it is?

Hejsil commented 6 years ago

@allochi There are a few reasons for anyerror:

// a could return errors or u8. (I could read fn a() !u8 the same way)
fn a() errors!u8 { ... }

// a could return all errors or u8. (It could return all errors at the same time?)
fn a() allerrors!u8 { ... }

allochi commented 6 years ago

@Hejsil Thanks, no strong feeling against anyerror, it seems to be fine.

ghost commented 6 years ago

error{A,B} is an error set, so I don't think we should rename the error to errorset as that is just confusing. I mean I don't know about that.

const A = struct {}; // a struct
const A = errorset {}; // an error set

Hejsil commented 6 years ago

@UniqueID1 Sorry, that was a confusing statement. I'm talking about this rename:

fn a() error{}!u8 {} -> fn a() error{}!u8 {}  
fn a() error!u8 {}   -> fn a() anyerror!u8 {}

We could even rename both examples:

fn a() error{}!u8 {} -> fn a() errorset{}!u8 {}  
fn a() error!u8 {}   -> fn a() anyerror!u8 {}

But that's not the goal of this issue.

Hejsil commented 6 years ago

Or maybe best:


// Leave `error` as is. This allows error.A to work without parser hacks
fn a() error { return error.A; }

// Rename `error` for error sets
const E = errorset {A, B};

allochi commented 6 years ago

@Hejsil Yeah, this is what I was thinking of too, it makes sense this way.

ghost commented 6 years ago

That was my proposal.

For shorter typing you could have "fn foo() error {}" and "const FileOpenError = errorset {};"

Edit: ..maybe, I don't know. I'm confused haha.

Manuzor commented 6 years ago

Let me get this straight. The new keyword errorset would be used to declare a new error set, just like struct and friends. The existing keyword error would be used to return a specific error, like return error.OutOfMemory.

If that is correct, then I think there is no need for anyerror anymore.

// no errors from this function.
fn foo() void {
    // ...
}

// inferred errorset from this function.
fn bar() !void {
    // ...
}

// explicit errorset that is declared inline.
fn baz() errorset{ OutOfMemory, FileNotFound }!void {
    // ...
}

// new explicit errorset
const TheErrors = errorset{ OutOfMemory, FileNotFound };
fn qux() TheErrors!void {
    // ...
}

// status quo to return "any error" still works.
fn corge() error!void {
    // ...
}

// Not sure about this one, but it's certainly not ambiguous.
fn waldo() error.OutOfMemory!void {
    // ...
}

ghost commented 6 years ago

I believe that there are 3 options

// keywords are error and errorset
fn foo() error {}
const E = errorset {};

// keywords are anyerror and error
fn foo() anyerror {}
const E = error {};

// just the error keyword
fn foo() error {}
fn foo() (error {A,B}) { ... }
const E = error {};

I prefer error + errorset over anyerror + error. However, with functions accepting TypeExpr instead of Expr, I believe we could also just have the error keyword and use parentheses in some cases as shown above.

Edit: I see that @andrewrk finds the parentheses unfortunate (first post, 4th example). My bad Hejsil, forgot about that. In that case having two keywords might be better, and it might be clearer overall anyways. Personally I love error + errorset. Whatever the case, Hejsil's change with functions accepting TypeExpr is absolutely excellent. Super cool change!

binary132 commented 6 years ago

I agree with @UniqueID1. If there's an ambiguity in parsing inline return types in fn signatures, the right way to resolve the ambiguity is by explicit grouping. ISTM this would only be necessary when declaring a type inline in a fn signature's return type, and it makes it more readable, so the increased write-time effort is minimal compared to the benefits.

a) It does not reduce readability overall, and in fact enhances it. Otherwise, fn signatures can read like a run-on sentence. Easy for humans to parse from left to right.
b) Does not require an awkward new special-purpose builtin keyword.
c) Familiar and known to be usable in Go.

Hejsil commented 5 years ago

The . syntax has been reverted as of #1685, which implements option 2 in this comment
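
For readers arriving later, here is a minimal sketch of how the anyerror + error split reads in practice, assuming a reasonably recent Zig release. All names (ParseError, firstDigit, mustBeEmpty, firstDigitAny) are hypothetical and chosen only for illustration; none of them come from this thread or from the standard library.

```zig
const std = @import("std");

// Hypothetical error set for illustration; `error { ... }` declares an error set.
const ParseError = error{ Empty, NotADigit };

// Named error set in the return type.
fn firstDigit(text: []const u8) ParseError!u8 {
    if (text.len == 0) return error.Empty;
    if (!std.ascii.isDigit(text[0])) return error.NotADigit;
    return text[0] - '0';
}

// Inline error set in the return type, written without wrapping parentheses.
fn mustBeEmpty(text: []const u8) error{NotEmpty}!void {
    if (text.len != 0) return error.NotEmpty;
}

// `anyerror` names the global error set; a ParseError!u8 result coerces to anyerror!u8.
fn firstDigitAny(text: []const u8) anyerror!u8 {
    return firstDigit(text);
}
```

The point of the sketch is only that the inline `error{NotEmpty}!void` form no longer needs the `(error{}!void)` grouping from the first post, and that `error.Name` keeps working as an error value while `anyerror` is the type.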