ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Proposal: Anonymous function literals with function signature inference #4170

Open Rocknest opened 4 years ago

Rocknest commented 4 years ago

Rationale

In the discussion of '#1717 function expressions', many comments touch on function signature inference: https://github.com/ziglang/zig/issues/1717#issuecomment-444200663 https://github.com/ziglang/zig/issues/1717#issuecomment-514536941 https://github.com/ziglang/zig/issues/1717#issuecomment-565284947 https://github.com/ziglang/zig/issues/1717#issuecomment-573472295. This is a natural use case for anonymous function expressions (the classic example is an argument to 'sort'). However, as stated by @hryx https://github.com/ziglang/zig/issues/1717#issuecomment-573535638, this is not a goal of that proposal, so the use case will have to be solved separately, which is why I'm creating this proposal.

This proposal is compatible with #1717 (which is 'accepted' at the time of writing) but in substance makes it somewhat redundant, leaving the controversial [1] [2] 'syntactic consistency among all statements which bind something to an identifier' as the main difference.

Closures are a non-goal.

The proposal

Add the ability to declare a function inside an expression. The types of the function's arguments should be inferred from the context, and the same applies to the return type.

Possible syntax:

.|a, b| {
    // function body
}

The opening dot follows the convention established by anonymous enum literals, struct literals, etc. Enclosing parameters in | | instead of parentheses also has precedent in the language (e.g. loop captures).

Such expressions should be coercible to the function type that is expected in the declaring scope.

const f: fn (i32) bool = .|a| {
    return (a < 4);
};

var f2: fn (i32) bool = if (condition) .|x| {
    return (x < 4);
} else .|x| {
    return (x == 54);
};
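
For comparison, here is a rough sketch of how the two declarations above can be approximated without the proposed syntax, using the common "anonymous struct with a single function" idiom. It assumes Zig 0.11 or later (where function pointers are spelled `*const fn`); the names `lessThanFour`, `isFiftyFour`, `call`, and `pick` are made up for illustration.

```zig
const std = @import("std");

// Stand-in for `.|a| { return (a < 4); }`.
const lessThanFour = struct {
    fn call(a: i32) bool {
        return a < 4;
    }
}.call;

// Stand-in for `.|x| { return (x == 54); }`.
const isFiftyFour = struct {
    fn call(x: i32) bool {
        return x == 54;
    }
}.call;

// Both functions share the signature `fn (i32) bool`, so either one can be
// selected at runtime through a function pointer.
fn pick(condition: bool) *const fn (i32) bool {
    return if (condition) &lessThanFour else &isFiftyFour;
}

pub fn main() void {
    const f2 = pick(false);
    std.debug.print("{}\n", .{f2(54)});
}
```

Note that the parameter types have to be written out in full here; inferring them from the expected type is exactly what the proposal would add.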

Ambiguous expressions should probably be compile errors:

const a = .|| { return error.Unlucky; };
// @TypeOf(a) == fn () !void

const lessThan = .|a, b| {
    return a < b;
};
// error: ambiguous
// OR
// @TypeOf(lessThan) == fn (var, var) bool

fn foo() void {
    const incr = .|x| {
        return x + 1;
    };

    warn("woah {}\n", .{ incr(4) }); // Ok?
}

Some examples:

pub fn sort(comptime T: type, arr: []T, f: fn (T, T) bool) void {
    // ...
}

pub fn main() void {
    var letters = [_]u8{ 'g', 'e', 'r', 'm', 'a', 'n', 'i', 'u', 'm' };

    // `&letters` coerces the array to the slice that `sort` expects
    sort(u8, &letters, .|a, b| {
        return a < b;
    });
}

(.|| {
    std.debug.warn("almost js", .{});
    // what will @This() return?
})();
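
For reference, the sort example can be written today roughly as follows. This is only a sketch: the insertion sort body is a placeholder standing in for the elided `// ...`, the comparator is taken as a `comptime` parameter (an assumption made so the snippet compiles on current compilers), and the struct member name `lessThan` is arbitrary.

```zig
const std = @import("std");

// Placeholder insertion sort with the same parameter shape as the `sort`
// in the proposal; the real body is elided in the original.
pub fn sort(comptime T: type, arr: []T, comptime lessThan: fn (T, T) bool) void {
    var i: usize = 1;
    while (i < arr.len) : (i += 1) {
        const key = arr[i];
        var j: usize = i;
        // Shift larger elements right until the slot for `key` is found.
        while (j > 0 and lessThan(key, arr[j - 1])) : (j -= 1) {
            arr[j] = arr[j - 1];
        }
        arr[j] = key;
    }
}

pub fn main() void {
    var letters = [_]u8{ 'g', 'e', 'r', 'm', 'a', 'n', 'i', 'u', 'm' };

    // Status quo: the comparator must spell out `u8` twice and `bool` once.
    sort(u8, &letters, struct {
        fn lessThan(a: u8, b: u8) bool {
            return a < b;
        }
    }.lessThan);

    std.debug.print("{s}\n", .{letters[0..]});
}
```

The proposed `.|a, b| { return a < b; }` would replace the whole `struct { ... }.lessThan` expression.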

Expression expressions

Naming these gets complicated:

sort(u8, letters, .|a, b| => (a < b));

Basically a shortcut for one-line functions that return a single expression. These are not part of this proposal.

iacore commented 2 years ago

We should make the syntax easy to use for refactoring code. Inspiration: https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md#code-refactoring

Step 1

/// inline refactoring

const std = @import("std");

const V = struct {
    i: i32,
    u: i32,
};

pub fn main() !void {
    const v = V{.i=42, .u=35};
    std.log.debug("{}", .{v.i});
}

Step 2

pub fn main() !void {
    const v = V{.i=42, .u=35};
    {
        std.log.debug("{}", .{v.i});
    }
}

Step 3 (proposed syntax for block with restricted access to outer scope)

pub fn main() !void {
    const v = V{.i=42, .u=35};
    bind (v) {
        std.log.debug("{}", .{v.i});
    } ();
}

I'm not sure what this syntax should look like. The one orthogonal syntax I can think of (see #10458) is too complicated.

Step 4 (call inline anonymous function)

pub fn main() !void {
    const v = V{.i=42, .u=35};
    fn (v: V) void {
        std.log.debug("{}", .{v.i});
    } (v);
}

Step 5 (extract function)

const foo = fn (v: V) void {
    std.log.debug("{}", .{v.i});
};

pub fn main() !void {
    const v = V{.i=42, .u=35};
    foo(v);
}
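
As a point of comparison, Step 4 can be approximated in today's Zig with an anonymous struct that is called immediately. This is a sketch only: `firstField`, `call`, and `arg` are illustrative names, and the body returns the field instead of logging it so the result is easy to check. The parameter is named `arg` rather than `v` because, as far as I know, current compilers reject a parameter that shadows a local from the enclosing function.

```zig
const std = @import("std");

const V = struct {
    i: i32,
    u: i32,
};

fn firstField() i32 {
    const v = V{ .i = 42, .u = 35 };
    // Declare a function inside an anonymous struct and call it right away.
    // The inner function cannot see `v`; it only receives what is passed in.
    return struct {
        fn call(arg: V) i32 {
            return arg.i;
        }
    }.call(v);
}

pub fn main() void {
    std.debug.print("{d}\n", .{firstField()});
}
```

Moving the inner `fn call` to file scope then gives Step 5 without touching its body.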

iacore commented 2 years ago

I dislike the weird syntax of .|x|. Edit: I changed my mind; see below. A better solution is to make function definitions look like this:

const foo = fn () void {};

This aligns with const foo = struct {};, and makes a function definition a "function literal". Just as Struct is a type and Struct{} constructs a value of that type, fn () void is a type and fn () void {} constructs a value of that type. With this syntax, a nested function is as simple as

const foo = fn () void {
  const bar = fn () void {};
};

To be honest, I want to deprecate fn functionName() {}, because it makes anonymous functions less discoverable. Users then have to read the manual carefully to discover that function types are a feature. Someone else has made the same criticism: https://www.duskborn.com/posts/2021-aoc-zig/#the-bad

rohlem commented 2 years ago

@locriacyber see #1717

iacore commented 1 year ago

I've tried to look into zig/parser.zig to add this syntax myself, but to no avail. Is there any other effort trying to implement this?

Also, I'm not sure what the syntax should be.

Closest to current syntax:

pub const foo = fn (a: i32) void {};

Separate type and implementation (like struct literal):

pub const foo = fn (i32) void |a| {};

pub const foo1: fn (i32) void = .|a| {};

// this syntax is useful for naming callback types
const A: type =  fn (i32) void;
pub const foo2 = A |a| {};
pub const foo3: A = .|a| {};

// maybe it's more like this?
pub const foo3_stage2: *A = .|a| {};

Vexu commented 1 year ago

Is there any other effort trying to implement this?

No effort has been made since this proposal has not been accepted and any work on it would likely end up being rejected.

Pyrolistical commented 2 months ago

I think this proposal is a bit broad, but a less powerful version would greatly ease a common refactoring problem.


pub fn main() void {
  const a = A.init();
  defer a.deinit();
  const b = B.init();
  defer b.deinit();
  ...
  const z = Z.init();
  defer z.deinit();

  const e1 = a.foo() + b.foo() + ... + z.foo() + 10;
  const e2 = a.foo() + b.foo() + ... + z.foo() + 100;
}

Status quo, extract a struct

const ExtraStruct = struct {
  a: A,
  b: B,
  ...
  z: Z,

  fn extracted(self: ExtraStruct, param: usize) usize {
     return self.a.foo() + self.b.foo() + ... + self.z.foo() + param;
  }
};

pub fn main() void {
  const a = A.init();
  defer a.deinit();
  const b = B.init();
  defer b.deinit();
  ...
  const z = Z.init();
  defer z.deinit();

  const extra_struct = ExtraStruct{
    .a = a,
    .b = b,
    ...
    .z = z,
  };

  const e1 = extra_struct.extracted(10);
  const e2 = extra_struct.extracted(100);
}

Extracting a struct is very annoying.

This proposal can help but we don't need its full power. Instead of an anonymous function, all that is needed here is an inlined parameterized block.

Proposed, inline parameterized block

pub fn main() void {
  const a = A.init();
  defer a.deinit();
  const b = B.init();
  defer b.deinit();
  ...
  const z = Z.init();
  defer z.deinit();

  const extracted = inline blk: |param| {
    break :blk a.foo() + b.foo() + ... + z.foo() + param;
  };

  const e1 = extracted(10);
  const e2 = extracted(100);
}

The idea is that, since extracted is inlined at compile time, it is allowed to access lexically scoped variables.

Note that since extracted is a block and not a function, return would return to the outer function, just like normal blocks.
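
The "normal blocks" behavior referred to here can be seen with an ordinary labeled block in current Zig. The function name `firstNegativeOrSum` is made up for this sketch.

```zig
const std = @import("std");

fn firstNegativeOrSum(values: []const i32) i32 {
    const sum = blk: {
        var total: i32 = 0;
        for (values) |v| {
            // `return` leaves the enclosing function, not just the block.
            if (v < 0) return v;
            total += v;
        }
        // `break :blk` is what yields the block's value.
        break :blk total;
    };
    return sum;
}

pub fn main() void {
    std.debug.print("{d}\n", .{firstNegativeOrSum(&[_]i32{ 1, 2, 3 })});
}
```

An inline parameterized block would presumably keep these rules, with `break :blk` producing the value at each call site.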

However, since it is a lexically scoped block, I would make it a compile error if it is passed like a closure.

fn foreach(self: @This(), closure: anytype) void {
  for (self.buckets) |bucket| {
    var current = bucket.first;
    while (current) |node| : (current = node.next) {
       closure(&node.key, &node.value);
    }
  }
}

fn countValues(self: @This(), value: Value) usize {
    var count: usize = 0;
    const closure = inline |_, v| {
       if (v.* == value) count += 1;
    };
    self.foreach(closure);  // compile error: `closure` not allowed to escape
    return count;
}

This means it wouldn't work for https://github.com/ziglang/zig/issues/6965, where this example is adapted from. If it were allowed to escape, it would violate Zig's principle of "no hidden control flow" if the block returned.
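
For the escaping case, one way to do this today is to pass explicit state through an `anytype` parameter. The following is a simplified sketch, not the code from #6965: it iterates a plain slice instead of hash buckets, and `Counter`, `forEach`, and `call` are hypothetical names.

```zig
const std = @import("std");

// The captured state lives in an ordinary struct instead of a closure.
const Counter = struct {
    target: u32,
    count: usize = 0,

    fn call(self: *Counter, value: u32) void {
        if (value == self.target) self.count += 1;
    }
};

// Accepts anything with a `call(u32)` method, checked at compile time.
fn forEach(items: []const u32, visitor: anytype) void {
    for (items) |item| visitor.call(item);
}

fn countValues(items: []const u32, target: u32) usize {
    var counter = Counter{ .target = target };
    forEach(items, &counter);
    return counter.count;
}

pub fn main() void {
    std.debug.print("{d}\n", .{countValues(&[_]u32{ 1, 2, 2, 3 }, 2)});
}
```

All control flow stays visible here: `call` can only return to `forEach`, never out of `countValues`.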