jmespath / jmespath.jep

Proposals for extending the JMESPath Language.

[Initial feedback] Potential proposal for user-defined function #25

Open springcomp opened 1 year ago

springcomp commented 1 year ago

I’m investigating user-defined lambda-like functions as discussed here.

Is this something of interest to discuss here?

User-defined functions

JMESPath already supports expression-type values (exprefs), which are, conceptually, anonymous functions without arguments.

I have toyed with the idea of extending exprefs to support named arguments. An updated grammar that works looks like this:

expression-type = "&" expression / "<" arguments ">" "=>" expression
arguments = variable-ref *( "," variable-ref )

It requires a new token, =>, which is not even strictly necessary. Parsing involves a new entry in the nud() function that matches on the < (less-than sign) token.
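
To make the grammar above a bit more concrete, here is a minimal Python sketch of how the head of such a lambda (everything up to and including =>) might be recognized. It is not taken from any JMESPath implementation: the token set, tokenize() and parse_lambda_head() are hypothetical names introduced only for illustration, and the body expression is left to the regular expression parser.

```python
import re

# Hypothetical token set, just enough for the lambda head; not the real JMESPath lexer.
TOKEN_RE = re.compile(r"\s*(=>|\$[A-Za-z_][A-Za-z0-9_]*|[<>,])")


def tokenize(text):
    """Split a lambda head into tokens; raise on anything unexpected."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_lambda_head(tokens):
    """Parse "<" variable-ref *( "," variable-ref ) ">" "=>".

    Returns (parameter names, index of the first token of the body).
    """
    if not tokens or tokens[0] != "<":
        raise SyntaxError("expected '<'")
    params, i = [], 1
    while True:
        if i >= len(tokens) or not tokens[i].startswith("$"):
            raise SyntaxError("expected a variable-ref")
        params.append(tokens[i])
        i += 1
        if i < len(tokens) and tokens[i] == ",":
            i += 1  # consume the comma and expect another variable-ref
            continue
        break
    if tokens[i:i + 2] != [">", "=>"]:
        raise SyntaxError("expected '>' followed by '=>'")
    return params, i + 2


print(parse_lambda_head(tokenize("<$acc, $cur> =>")))  # (['$acc', '$cur'], 6)
```

In a Pratt-style parser, a routine like this would presumably be what the nud() entry for < dispatches to before parsing the body.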

Reduce function

As expression-types are only supported as function arguments, a reduce() function could work like this:

reduce(array $array, any $seed, expr [any, any]->any)

Example:

[1, 3, 5, 7]
reduce(@, `1`, <$acc, $cur> => $acc × $cur)

The reduce() function knows how to iterate over its first array argument and repeatedly create bindings for both function arguments $acc and $cur.
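
The binding behaviour described above can be sketched in Python as follows. This is only an illustration under assumptions: reduce_expr() is a hypothetical helper, the lambda body is modelled as a Python callable that receives a dictionary of bindings, and the product stands in for $acc × $cur.

```python
def reduce_expr(array, seed, params, body):
    """Fold `body` over `array`, rebinding the two lambda parameters each step.

    `params` holds the two parameter names (e.g. "$acc" and "$cur");
    `body` is a callable that evaluates the lambda body against the bindings.
    """
    if not isinstance(array, list):
        raise TypeError("reduce() expects an array as its first argument")
    if len(params) != 2:
        raise TypeError("reduce() expects a two-argument function")
    acc = seed
    for cur in array:
        # Fresh bindings for every element: accumulator first, current item second.
        bindings = {params[0]: acc, params[1]: cur}
        acc = body(bindings)
    return acc


# reduce(@, `1`, <$acc, $cur> => $acc × $cur) applied to [1, 3, 5, 7]
product = reduce_expr(
    [1, 3, 5, 7],
    1,
    ("$acc", "$cur"),
    lambda env: env["$acc"] * env["$cur"],
)
print(product)  # 105
```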

Funnily enough, the lambda function does not natively support the @ current node. One could argue that @ should behave like the $cur argument. In that case, a choice must be made as to which "context" is supplied when evaluating the expref.

Reusing user-defined functions

I then investigated what reusing user-defined functions could look like. This requires a mechanism to store or hold functions, possibly by name, as well as a way to call those functions with appropriate parameters.

By taking advantage of the hopefully upcoming let-expression design, one could bind the anonymous function to a variable reference. The grammar must then be extended to support calling the function through this variable.

In the following expression:

let $concat = <$lhs, $rhs> => join('-', [$lhs, $rhs])
in  $concat('4', '2')

$concat is bound to the <$lhs, $rhs> => join('-', [$lhs, $rhs]) lambda expression-type. Then, $concat('4', '2') invokes the function indirectly using the $concat variable reference.
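
As a rough illustration of the indirect call, the Python sketch below models a scope as a dictionary and a lambda as a small runtime value. The Lambda class and call_variable() are hypothetical names, not part of any JMESPath implementation. The sketch also assumes the function body sees the call-site scope plus its own parameters, which is only one possible scoping choice.

```python
class Lambda:
    """Hypothetical runtime value for a lambda expression-type."""

    def __init__(self, params, body):
        self.params = params  # parameter names, e.g. ["$lhs", "$rhs"]
        self.body = body      # Python callable taking a bindings dict

    def call(self, scope, args):
        if len(args) != len(self.params):
            raise TypeError("wrong number of arguments")
        # Assumption: the body sees the caller's scope plus one binding per parameter.
        inner = {**scope, **dict(zip(self.params, args))}
        return self.body(inner)


def call_variable(scope, name, args):
    """Evaluate `$name(args...)`: look the variable up, then invoke it."""
    value = scope[name]
    if not isinstance(value, Lambda):
        raise TypeError(f"{name} is not bound to a function")
    return value.call(scope, args)


# let $concat = <$lhs, $rhs> => join('-', [$lhs, $rhs]) in $concat('4', '2')
scope = {
    "$concat": Lambda(
        ["$lhs", "$rhs"],
        lambda env: "-".join([env["$lhs"], env["$rhs"]]),
    )
}
print(call_variable(scope, "$concat", ["4", "2"]))  # 4-2
```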

The following grammar change is required:

function-expression = ( unquoted-string / variable-ref ) ( no-args / one-or-more-args )

Again, implementation is quite easy.

Funnily enough, having also implemented arithmetic-expression grammar rules, I was then able to implement the recursive Fibonacci sequence using the following expression:

let $fib = <$n> => (
      ($n == `0` && `0`) ||
      (($n == `1` && `1`) ||
      ( $fib($n - `1`) + $fib($n - `2`) )
      )
    ) in
      $fib(@)
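
One subtlety worth spelling out: the expression relies on JMESPath truthiness, where the number 0 is truthy (only false, null, and empty strings, arrays and objects are false-like), and on && and || being evaluated lazily. The Python sketch below mirrors those rules to show why the recursion terminates. The helpers is_truthy(), jp_and() and jp_or() are hypothetical names written for this illustration, and short-circuit evaluation is an assumption about the implementation.

```python
def is_truthy(value):
    """JMESPath truthiness: false, null and empty string/array/object are false-like."""
    return not (
        value is None
        or value is False
        or value == ""
        or value == []
        or value == {}
    )


def jp_and(left, right_thunk):
    """left && right: the right operand if left is truthy, otherwise left."""
    return right_thunk() if is_truthy(left) else left


def jp_or(left, right_thunk):
    """left || right: left if it is truthy, otherwise the right operand."""
    return left if is_truthy(left) else right_thunk()


def fib(n):
    # Mirrors: ($n == `0` && `0`) || (($n == `1` && `1`) || ($fib($n - `1`) + $fib($n - `2`)))
    return jp_or(
        jp_and(n == 0, lambda: 0),
        lambda: jp_or(
            jp_and(n == 1, lambda: 1),
            lambda: fib(n - 1) + fib(n - 2),
        ),
    )


print([fib(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Note that a direct translation using Python's own and/or would not terminate for n = 0, because Python treats 0 as falsy.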

jamesls commented 1 year ago

> Is this something of interest to discuss here?

Yes, definitely. Overall I think the high-level idea could work (interestingly, I think with let expressions and this feature JMESPath would be Turing complete).

Syntactically I'd probably suggest using the TypeScript syntax for arrow functions, ($a) => ..., so it feels familiar to anyone who has used JavaScript/TypeScript.

The main high-level thing that I think needs consideration is the tradeoffs in adding this to the language. It certainly enables types of queries that aren't possible today, but it also potentially makes the language more complex and can make it harder for people to learn and understand expressions. That's not to say we shouldn't do it, but I think we need to make a really compelling case as to why the tradeoffs favor adding it to the language.

Curious to hear what others think.

springcomp commented 1 year ago

> Syntactically I'd probably suggest using the TypeScript syntax for arrow functions, ($a) => ..., so it feels familiar to anyone who has used JavaScript/TypeScript.

I did try this more intuitive syntax, but it is slightly harder to parse because it looks like a paren-expression, so the parser needs to look ahead two tokens. This is not a problem for top-down parser implementations, but it could be for implementations using a more traditional lex/yacc-based parser.

But indeed, I support the idea.
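
For what it's worth, here is a small Python sketch of one way a hand-written parser could tell an arrow function from a paren-expression. It uses a simpler but heavier technique than the fixed lookahead mentioned above: it scans to the matching closing parenthesis and checks whether => follows. The function is_arrow_function() and the list-of-strings token representation are assumptions made for this illustration only.

```python
def is_arrow_function(tokens, start):
    """Return True if the "(" at tokens[start] opens an arrow-function parameter list.

    That is the case when the matching ")" is immediately followed by "=>".
    """
    if start >= len(tokens) or tokens[start] != "(":
        return False
    depth = 0
    for i in range(start, len(tokens)):
        if tokens[i] == "(":
            depth += 1
        elif tokens[i] == ")":
            depth -= 1
            if depth == 0:
                # Found the matching parenthesis; peek at the next token.
                return i + 1 < len(tokens) and tokens[i + 1] == "=>"
    return False  # unbalanced parentheses


print(is_arrow_function(["(", "$a", ")", "=>", "$a"], 0))  # True
print(is_arrow_function(["(", "$a", ")"], 0))              # False
```

A scan like this is easy in a hand-written top-down parser, which matches the point above that the difficulty mostly affects generated lex/yacc-style parsers.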