mulesoft-labs / data-weave-rfc

RFC for the data weave language

Piping function calls #15

Open jerneyio opened 4 years ago

jerneyio commented 4 years ago

Problem

Given:

third(second(first(1, 2), 3), 4)

I know two available ways to make this more readable:

var firstR = first(1,2)
var secondR = second(firstR, 3)
---
third(secondR, 4)

or

1 first 2 second 3 third 4

This convenience runs out when you have more than two arguments, or a function that only takes one. You must regress to the nested syntax,

third(second(first(1,2,3),4,5),6,7)

use intermediate vars that require you to name results,

var firstR = first(1,2,3)
var secondR = second(firstR, 4,5)
...

or infix & prefix notation in the same expression,

third(
  (first(1,2,3) second 4,5),
  6,
  7
)

Proposed Solution

Something like this, that takes the result of the previous expression and implicitly passes it as the first argument of the next function:

[1,2,3,4,5]
  | map($ + 1)
  | filter(isOdd($))
  | join([2,3,4,5], (v) -> v, (v) -> v)

It's easy to read regardless of the arity of the functions, and you don't need to waste time naming variables that don't need meaningful names.
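Since the proposed operator is not valid DataWeave today, here is a rough sketch of the intended semantics in Python rather than in DW. The helper `pipe_first` and the bodies of `first`, `second`, and `third` are hypothetical stand-ins invented for illustration; only the "previous result becomes the first argument" rule comes from the proposal above.

```python
from functools import reduce


def pipe_first(value, *steps):
    """Thread `value` through `steps`, passing it as the FIRST argument.

    Each step is a tuple of (function, extra_args...), so
    pipe_first(x, (f, a), (g, b, c)) computes g(f(x, a), b, c).
    """
    return reduce(lambda acc, step: step[0](acc, *step[1:]), steps, value)


# Hypothetical three-argument functions standing in for first/second/third.
def first(a, b, c):
    return a + b + c


def second(a, b, c):
    return a * b - c


def third(a, b, c):
    return (a, b, c)


# The nested call from the problem statement...
nested = third(second(first(1, 2, 3), 4, 5), 6, 7)

# ...and the same computation written as a left-to-right pipeline.
piped = pipe_first(1, (first, 2, 3), (second, 4, 5), (third, 6, 7))
```

Both expressions evaluate the same chain of calls; the piped form just reads in execution order and works for any arity.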

Examples of this feature in other languages

machaval commented 4 years ago

Hi @jerneyio, thanks, this looks interesting. @menduz and I discussed this in the past but we never settled on anything concrete. There are a few things we need to discuss.

  1. Are we going to support implicit lambdas?
    [1,2,3,4,5]
    | map($ + 1)
    | filter(isOdd($)) //Implicit lambda
    | join([2,3,4,5], (v) -> v, (v) -> v)

    The problem with implicit lambdas is that it is either that or allowing functions by reference:

[1,2,3,4,5]
  | map($ + 1)
  | filter(isOdd) //Function by reference
  | join([2,3,4,5], (v) -> v, (v) -> v)

My second question is what to do with the current infix syntax. Does it make sense to have both? When are we going to advise one over the other?

jerneyio commented 4 years ago

Hmm I didn't know there would be a problem with implicit lambdas and function by reference being mutually exclusive... For the sake of consistency, I'd say it should support the kind of syntax user-made functions support (i.e. only explicit lambdas and function references, no $, $$ or $$$) -

[1,2,3,4,5] 
  | map((v) -> v + 1)
  | filter(isOdd)

For the current infix syntax, I think it should stick around. A lot of code is written with infix notation. It's still useful and easy to understand when you need to chain a few functions that take two args. However, if you need to chain functions with different arity, use the pipe operator to maintain readability.

A question I have is could / should it support single argument functions -

[1,2,3,4,5]
  | map((v) -> v + 1)
  | filter(isOdd)
  | sum

machaval commented 4 years ago

For me, for consistency, I would require parentheses on functions called with empty args

[1,2,3,4,5]
  | map((v) -> v + 1)
  | filter(isOdd)
  | sum()

It is more similar to . in OOP.

menduz commented 4 years ago

It would generate an ambiguity in the grammar because | would no longer be a bitwise operator. I would use something different like |>.

When piping there are two main things you must support: being able to specify the first parameter, and being able to specify the last.

In this example I will use Clojure's identifiers for these operations (-> and ->>):

X -> map(1) becomes map(X, 1)

and

X ->> map(1) becomes map(1, X)

In Clojure we use -> and ->> a lot. The main difference is that the lhs part becomes either the first or the last element of the parameter list.
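To make the difference concrete, here is a minimal sketch of the two rewrites in Python (not DW, and not Clojure's actual macros). The names `thread_first`, `thread_last`, and `subtract` are hypothetical and exist only for illustration; a non-commutative function is used so that the argument position visibly matters.

```python
def thread_first(value, fn, *args):
    # X -> fn(args...) becomes fn(X, args...)
    return fn(value, *args)


def thread_last(value, fn, *args):
    # X ->> fn(args...) becomes fn(args..., X)
    return fn(*args, value)


def subtract(a, b):
    # Non-commutative, so swapping argument order changes the result.
    return a - b


as_first = thread_first(10, subtract, 2)  # same as subtract(10, 2)
as_last = thread_last(10, subtract, 2)    # same as subtract(2, 10)
```

The same piped value ends up on opposite sides of the call, which is why a single operator cannot cover both cases without some extra marker.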

In DW it would be |> and |>>, but the syntax would become messy and cluttered with symbols. I remember we always tried to avoid that.