FR: Improve functional programming for Motoko

mikhail-turilin-dfinity commented 2 years ago

Problem

Working with collections is verbose and difficult. For example, I want to filter blog posts and return the ids:

  // BEFORE
  public shared ({caller}) func allPosts() : async [Nat] {
    return Iter.toArray(Iter.map<Post, Nat>(Iter.filter(posts.vals(), func(p1 : Post) : Bool { p1.principal == caller }), func(p : Post) : Nat { p.id}));
  };

The example is difficult to understand, write, and debug.

Solution 1: Operator piping

We can use something like a pipe operator from F# to improve readability with Swift-style anonymous functions:

  // AFTER
  public shared ({caller}) func allPosts() : async [Nat] {
    return posts.vals()
      |> filter { $0.principal == caller }
      |> map { $0.id }
      |> Iter.toArray;
  };

I think this example is much easier to read and understand.

Solution 2: Class-level higher-order functions

If a pipe operator is too much to ask, we can at least make higher-order functions a part of the class for chaining

  // AFTER
  public shared ({caller}) func allPosts() : async [Nat] {
    return posts
      .vals()
      .filter( func(p : Post) : Bool { p.principal == caller } )
      .map( func(p : Post) { p.id } )
      .toArray();
  };

crusso commented 2 years ago

Yes, something like this would be nice but note that:

1) F# pipelining operators works best with curried functions, and most of the functions in the current base lib are not curried. Moreover, declaring curried functions in Motoko is currently awkward and would require more sugar.

2) Literally adding methods to, e.g. the Iter interface bloats the representation of these objects, which, in the current design of Motoko, are literally records of fields (some of which are functions/methods). That's a very different representation from objects in C# and Java, which just carry the fields of the object, but share all method implementations via a method table pointer (so that adding a method leaves the object representation unchanged).

This FR is related to https://github.com/dfinity/motoko/issues/2537, which proposes adding something akin to C# extension methods (a.k.a F# augmentations) to let you invoke a static function using the dot notation, passing the receiver as the first argument.

crusso commented 2 years ago

It's a little easier to understand your original code if you break it up using a let.

  // Refactor using `let` and explicit type arguments and untyped lambdas
  public shared ({caller}) func allPostsLet1() : async [Nat] {
    let vals = posts.vals();
    let filtered = Iter.filter<Post>(vals, func p { p.principal == caller});
    let ids = Iter.map<Post, Nat>(filtered, func p { p.id });
    Iter.toArray(ids)
  };

  // Refactor using `let` and implicit type arguments, but typed lambdas
  public shared ({caller}) func allPostsLet2() : async [Nat] {
    let vals = posts.vals();
    let filtered = Iter.filter(vals, func (p : Post)  : Bool { p.principal == caller});
    let ids = Iter.map(filtered, func (p : Post)  : Nat { p.id });
    Iter.toArray(ids)
  };

This is actually more efficient than using any higher-order combinator (in the current implementation) since we don't inline those yet and the all the calls will wind up being direct calls.

But yes, having to name the intermediate results is less concise.

https://m7sm4-2iaaa-aaaab-qabra-cai.raw.ic0.app/?tag=3719590015

ggreif commented 2 years ago

Many func p { p.principal == caller}-style lambdas can be simplified by func p = p.principal == caller-style.

aterga commented 2 years ago

To throw in my two rappens:

could we add a syntax for anonymous function arguments, like in Scala? E.g., I would love to be able to write

_.principal == caller

instead of

func p = p.principal == caller

when the type of the lambda-expression can be inferred unambiguously.

ggreif commented 4 months ago

In many cases function types containing type parameters behave differently that without:

> ((func (c : Char, t : Text) : (Int, Int) = (1,2)) : ((Char, Text)) -> ((Int, Int)));
<func> : ((Char, Text)) -> ((Int, Int))

By annotating a monomorphic multi-arg/return function to be single (compound) arg/return one can change the calling modalities. But as soon as a type parameter comes into play, this is not possible, as type parameters are unary, consider:

> (func <T>(v : T) : (Int, Int) = (1,2))<(Char, Text)>('H',"ello");
(1, 2) : (Int, Int)

Here an instantiation can be added for a call, but we cannot annotate the function to be multi-arg:

> ((func <T>(v : T) : (Int, Int) = (1,2)) : (Char, Text) -> (Int, Int))('H',"ello");
stdin:5.3-5.39: type error [M0096], expression of type
  <T>T -> (Int, Int)
cannot produce expected type
  (Char, Text) -> (Int, Int)

even if we annotate it as unary:

> ((func <T>(v : T) : (Int, Int) = (1,2)) : ((Char, Text)) -> (Int, Int))('H',"ello");
stdin:6.3-6.39: type error [M0096], expression of type
  <T>T -> (Int, Int)
cannot produce expected type
  ((Char, Text)) -> (Int, Int)

Even arity-changing on the monomorphic return-side is disallowed:

> ((func <T>(v : T) : (Int, Int) = (1,2)) : <T>T -> ((Int, Int)))('H',"ello");
stdin:7.3-7.39: type error [M0096], expression of type
  <T>T -> (Int, Int)
cannot produce expected type
  <T>T -> ((Int, Int))

But when building IR functions "by hand" (e.g. coerce in async.ml), the arity changing works out just by doing the call. So the calling convention already gets figured out.

For the record, here is the simplest annotation that does go wrong (but shouldn't)

> (func <T>(v : T) : () = ()) : <U>U -> (());
stdin:18.2-18.27: type error [M0096], expression of type
  <T>T -> ()
cannot produce expected type
  <U>U -> (())

the relevant AST diff being

-      (FuncT Local (U (PrimT Any)) (PathT (IdH U)) (TupT))
+      (FuncT Local (U (PrimT Any)) (PathT (IdH U)) (ParT (TupT)))

as the second argument of AnnotE.

The big difference between the two modi comes from check_exp' in typing.ml:

  | FuncE (_, shared_pat,  [], pat, typ_opt, _sugar, exp), T.Func (s, c, [], ts1, ts2) ->

as we can see only parameterless functions are checked, otherwise the FuncE is inferred and then a subtype check is done. This way the softening of the arg/return sequences is not happening!

I am trying to flesh this out in #4620.

dfinity / motoko