Open michaelhkay opened 11 months ago
I'm not familiar with chain...for functional composition its more normal to use an operator.
items-where($seq, not#1 >> contains(?,"e"))
it also mirrors the use of '=>'
Indeed, there are a few varieties of arrow that we could assign (>>
is not available), but the first thing is to establish the principle.
One of the problems in usability terms is the directionality. For functional programming aficionados, writing f(g(x)) as (f ~> g)(x)
seems natural. But object-oriented languages write the pipeline from left to right x.g().f()
, and the =>
arrow syntax reflects this, x => contains() => not()
.
tbh, OO languages rarely have function composition, your example x.g().f()
isnt composition its application, in fact the notion of function composition is a bit odd in an OO world because composition is about composing behaviour without having the data, and obviously in OO you can't have behaviours without data to bind to.
If I had to do it in an OO language I would probably write some sort of extension method on Func<A,B>
called Then
and go
x.f.Then(y.g)
and you'd need x and y in advance, else something like
(x => x.f()).Then(y => y.g())
if you didnt.
as xslt and xpath arent really setup in an OO, this would seem dissonant to me.
We already have 'pipe' i.e. =>
it would seem odd to not have a compose operator =>
that doesn't mirror its usage.
So I like the suggestion, I'd just be careful of the syntax
I think the arrow operator has proved so popular for two reasons: firstly, it cuts out the nested parentheses, and secondly, you read it as a pipeline of operations to be performed from left to right, x => h => g => f
, in contrast to f(g(h(x)))
where you evaluate x
first , then apply h
, then g
, then f
.
The challenge with function composition is that one ordering seems natural and intuitive to one group of people, and the opposite ordering seems natural and intuitive to a different group.
I agree, so the arrow operator is popular and a success (even with OOers?) but you think composition as an operator is problematic
even though
x => h => g => f
is equivalent to
x => (h >> g >> f)
?
This ordering is also quite prevalent in the OO community in the form of 'fluent' APIS, where it would probably be written.
x.Apply(h.Then(g).Then(f))
.
now we're just discussing whether OOers balk at an operator rather than a method name and a '.'.
I think you underestimate OOers (I say so as a former OO zealot)
I thought I had a great opportunity to use
fn:chain
the other day, and then found it didn't do what I wanted.I wanted to negate a predicate: in pseudo-code
items-where($seq, not contains(?, "e"))
and I thought I could do this by chaining
contains
andnot
. But it doesn't work that way:fn:chain
applies a sequence of functions to an argument, it doesn't compose a sequence of functions to yield a new function.
Actually, fn:chain
can be used as function composition:
items-where($seq, chain(?, (contains(?, "e"), not#1 )) )
Actually, fn:chain can be used as function composition:
Thanks, good to know.
I thought I had a great opportunity to use
fn:chain
the other day, and then found it didn't do what I wanted. I wanted to negate a predicate: in pseudo-codeitems-where($seq, not contains(?, "e"))
and I thought I could do this by chainingcontains
andnot
. But it doesn't work that way:fn:chain
applies a sequence of functions to an argument, it doesn't compose a sequence of functions to yield a new function.Actually,
fn:chain
can be used as function composition:
items-where($seq, chain(?, (contains(?, "e"), not#1 )) )
its pretty ugly though
much better would be (using ".>" as composition as ">>" is taken and the mathematicians "." is quite problematic)
items-where($seq, contains(?, "e") .> not#1 )
I thought I had a great opportunity to use
fn:chain
the other day, and then found it didn't do what I wanted. I wanted to negate a predicate: in pseudo-codeitems-where($seq, not contains(?, "e"))
and I thought I could do this by chainingcontains
andnot
. But it doesn't work that way:fn:chain
applies a sequence of functions to an argument, it doesn't compose a sequence of functions to yield a new function.Actually,
fn:chain
can be used as function composition:items-where($seq, chain(?, (contains(?, "e"), not#1 )) )
its pretty ugly though
much better would be (using ".>" as composition as ">>" is taken and the mathematicians "." is quite problematic)
items-where($seq, contains(?, "e") .> not#1 )
I disagree. We have used all possible keyboard symbols for operators and now need more ...
Who is to remember all "magic strings" ? Almost no one, even their creators.
As for "beauty", beauty is in the eyes of the beholder. Enough with all ugly, long and beyond understanding lambda expressions!
What is not so very subjective as "beauty" is understandability.
And it has been more than 10 years since we started using "?" in the place of an argument to express partial application. No one has claimed this is "ugly" or not useful.
Trying to invent "magic strings" for well-understood concepts that we have had for years is contributing to the pollution of the language. Not to mention that such amount of effort could be directed towards something that could be really useful.
I don't understand,
I LIKE "?" as shorthand for a lambda expression, i wasnt objecting to it.
It seems like a very useful and memorable "magic string". I like "=>" it seems also useful and memorable, similarly, "+", "-", "="....all very sensible "magic strings".
what seems less understandable is expression composition in terms of a function that does composition AND application by effectively turning it into a lambda.
contains(?, "e") .> not#1
seems more understandable (if ".>" were part of language and as familiar as "=>"), than
chain(?,(contains(?, "e"), not#1))
`
contains(?, "e") .> not#1
seems more understandable (if ".>" were part of language and as familiar as "=>"), than
chain(?,(contains(?, "e"), not#1))
For me it is the opposite:
chain(?,(contains(?, "e"), not#1))
is a function that still needs its first argument, and when given this, it will perform on it the chain of all other specified functions.
Absolutely clear.
where with:
contains(?, "e") .> not#1
I need to think hard what .>
was, whether it applies from right-to-left or from left=to-right, etc.
And why in one case (to the left) ?
is used, but then inconsistently, .>
is used?
And why should I need to learn and memorize yet even new operators, when we clearly can express this in the current language with consistent use of partial application:
chain(?,(contains(?, "e"), not#1))
Its not obvious to me, and it wasn't obvious to the OP, thus the question, the whole topic is based on it not being obvious.
the right to left questions seems odd, the direction is indicated by ">" in exactly the same way as it does for "=>", they are very closely linked, and thus very similarly named.
I don't understand the question about "?" being inconsistent to ".>", the analogous expression using "=>" would be
x => contains(?, "e") => not#1
if there is no inconsistency in this expression, then there would seem to be no inconsistency in the one I presented.
The question is "fn:chain or fn:compose"
the main question to me isn't the syntax (operator vs function) rather than the definition.
if we choose chain, then we don't need to memorise the function compose. if we choose compose then we don't need to memorise the function chain AND memorise the trick to convert it into the more primitive compose.
If you set up both alternatives and a compose expression and a chain expression and expressed it in both options, you find that compose is syntactically simpler (in terms of syntax tree size), but not the whole story. As a non XSLT/XPath developer (or at least a relatively ignorant one) I would expect compose to exist, because compose exists in lots of languages, because its a primitive operation on functions, and people are taught in in college on SE/CS courses. I would not expect (or look for) chain to exist as a base function, I'm not aware of existing elsewhere (it may do).
I prefer the operator form because
a) its syntactically even simpler than "fn:compose()". b) it directly mirrors the current "=>" operator.
x => f => g => h
is equivalent to
x => (f .> g .> h)
I do note though that scala and kotlin do use an operator but both languages allow "names" for operators
so in scala it would read
f andThen g andThen h
kotlin would read
f then g then h
I'm sympathetic to the "no more magic strings" message, but that boat seems to have already sailed with "/", "!", "=>", so it would seem operators have to be "magic strings"
"then"
to me is a perfectly acceptable operator, if that disolves part of the objection.
@MarkNicholls The history of fn:chain
is that I initially proposed a function multiCompose
as a generalization of function composition for more than 2 functions, that would, exactly as the mathematically-defined functional composition, apply the provided sequence of functions starting from the right-most - that is from right to left.
Many people wished the direction to be left to right, and this is how function chain was born.
As for whether or not it is more convenient to return a function or produce a result (item()*
) there are plusses and minuses for both of these alternatives and neither one is clearly superior compared to the other.
People, who are not comfortable with functional programming, will want the result to be directly produced. Otherwise, they must append (some-initial-argument)
to the right of the function call, thus they have to add one more pair of parentheses, and even worse, the initial argument will be farthest away from the first function (the leftmost one) that is to be applied on it. Clearly inconvenient, less-readable, and maybe even confusing.
We don't want to have too-many functions and specifically two functions that are almost the same. This is why I started with multiCompose (from right to left) but taking into account the preferences of the majority, this evolved into fn:chain that is left-to-right and in a most convenient and readable way has the initial (non-function) argument as its left-most one, where it belongs.
I think this is a complete explanation why the things are as they are.
I will not provide more answers, because all is said in this reply. If you try to convince the majority of people here that right to left evaluation is better and if you succeed, it is simply a mechanical task to turn fn:chain to whatever you are suggesting.
As I said, both alternatives are OK with me, but I think people are right to request what they consider the most convenient syntax and evaluation order, and at present fn:chain gives them what they wanted -- without needing a single new operator.
b) it directly mirrors the current "=>" operator.
x => f => g => h
is equivalent to `x => (f .> g .> h)
x => f => g => h
is invalid syntax in XPath. Try for example:
(1, 2, 3) => sum#1
We get an error: "Expecting '(' "
The syntactically valid expression requires using not just a function name, but a function call:
(1, 2, 3) => sum()
Thus you will need:
x => f(all_args_here_except_the_first) => g(all_args_here_except_the_first) => h(all_args_here_except_the_first)
Unfortunately, this is not too-elegant, is less readable, and could even be considered rather ugly.
f then g then h
I'm sympathetic to the "no more magic strings" message, but that boat seems to have already sailed with "/", "!", "=>", so it would seem operators have to be "magic strings"
"then" to me is a perfectly acceptable operator, if that disolves part of the objection.
then
is already used in XPath as one of the alternatives of an if
expression.
Also, in JavaScript/Typescript then
is used to resolve a promise , that is in concurrent processing (similar to await a Task in C#). We may also decide to use then
in a similar role if in the future we introduce concurrent processing in XPath.
Thus it could be confusing to use then
in yet another, 3rd role.
so the real alternative for multicompose was
x => multicompose((f,g))()
or
multicompose((f,g))(x)
vs
chain(x,(f,g))
To me all both options are not ideal. I think I prefer multicompose, because it captures the primitive notion directly. The order I'm on the fence about, given a function syntax, where right to left application takes precedence, actually right to left seems more consistent, despite the fact I personally find left to right more intuitive. (i.e. (a->b) -> (b ->c) -> (a -> c))
I throw broad generalisations, psuedo code and suggestions around, ignorant of the nuances of the actual syntax.
You shouldnt take my psuedo code as too literally, its there to express a feeling rather than a concrete proposal. I accept at some stage the nuances/reality of the syntax can be the difference between something simple and obvious rather than irritating and awkward.
But these nuances seem to hamper multiCompose, but not an operator?
if 'then' is taken, 'andThen' seems a reasonable alternative.
f andThen g andThen h
(I'd personally be happy with a magic string with a '>' embedded in it somewhere too to mirror "=>").
In #517 (Dimitre’s original proposal for fn:multi-compose
), @line-o, who wrote the xBow library, referenced the fn:compose
function from John Snelson’s functional.xq library, which doesn’t process input, but merely combines functions.
I think that both fn:chain
and fn:compose
can be beneficial: fn:chain
works fine when you have some input at hand. fn:compose
looks more intuitive to me when you want to create a self-contained function item.
As the number of (even experienced) users who’ll be capable of deciphering filter($seq, compose((contains(?, "e"), not#1)))
, or even filter($seq, chain(?, (contains(?, "e"), not#1)))
, will be very limited, I would advise against introducing new operators at this stage. It could still be an option for the future.
Here’s a version that may be slightly better readable (just in case, for those who’ve never heard of function composition):
(: returns ('ab', 'cd') :)
let $input := ('ab', 'cd', 'ef')
return filter($input, compose( (contains(?, 'e'), not(?)) )
(: …or even… :)
let $contains-function := contains(?, 'e')
let $not-function := not(?)
return filter($seq, compose(($contains-function, $not-function)))
Similar to fn:concat
, to reduce parentheses we could make it a variadic function:
compose($contains-function, $not-function)
For what it’s worth, an equivalent implementation in XQuery would be:
declare function compose(
$functions as function(*)*
) as function(*) {
fold-left($functions, identity#1, fn($composed, $function) {
fn($arg) { $function($composed($arg)) }
})
};
if compose/chain is a function, then I'm personally open minded about left to right or right to left, I can see the arguments for both,
I think this is pretty self explanatory though, and makes the right/left thing pretty obvious.
filter($seq, contains(?, "e") andThen not#1)
and I'll leave it at that
I think this is pretty self explanatory though, and makes the right/left thing pretty obvious.
filter($seq, contains(?, "e") andThen not#1)
and I'll leave it at that
I agree, a keyword would make the question of order irrelevant. In XPath lingo it could be A and then B
(analogous to cast as
, contains text
, etc.).
if compose/chain is a function, then I'm personally open minded about left to right or right to left, I can see the arguments for both,
I think this is pretty self explanatory though, and makes the right/left thing pretty obvious.
filter($seq, contains(?, "e") andThen not#1)
and I'll leave it at that
This is simply not good.
nobody says contains(?, "e") andThen not#1
People form their expressions as not(someCondition)
Notice that the not is leftmost. This is how we, human beings think. Putting the not
at the end is so confusing, almost German (apologies to @ChristianGruen 😄 )
So, dear friends, I don't believe we could get some breakthrough in this issue...
We could perhaps devise something that is the 180 degrees reflection of fn:chain, but this will also have its own shortcomings, as it is merely a reflection.
Let us use our energy for something more useful, like producing an abstract base for any Kollection 😄
In XPath lingo it could be A and then B
No, I think that gives an ambiguous parse. A and then (X)
could parse as A and (then (X))
where then(X) is a function call. The difference from other two-word operators is that the first word is already in use as a binary operator.
No, I think that gives an ambiguous parse.
A and then (X)
could parse asA and (then (X))
where then(X) is a function call. The difference from other two-word operators is that the first word is already in use as a binary operator.
True, thanks.
My comment generated a lot of discussion about whether it might be better to use an operator rather than a function.
I would like to return to the original issue that I raised, which is whether we should have chain($input, (F, G))
that returns F(G($input)), or chain((F, G))
that returns a function that can be applied to $input
.
Dimitre has shown that you can derive the second from the first by using ?
in the first argument position, but that feels artificial to me. In the use cases I have personally encountered, the second form is the one I have wanted: for example
index-where($input, chain((not#1, contains(?, 'e'))))
My comment generated a lot of discussion about whether it might be better to use an operator rather than a function.
I would like to return to the original issue that I raised, which is whether we should have
chain($input, (F, G))
that returns F(G($input)), orchain((F, G))
that returns a function that can be applied to$input
.Dimitre has shown that you can derive the second from the first by using
?
in the first argument position, but that feels artificial to me. In the use cases I have personally encountered, the second form is the one I have wanted: for example
index-where($input, chain((not#1, contains(?, 'e'))))
Nothing prevents us from having a second overload for fn:chain that accepts only a single argument - a sequence of functions:
*`fn:chain( $functions as function() ) as function()`**
Nothing prevents us...
Nothing apart from precedent, convention, and consistency. We do not currently define any function families containing multiple functions with the same name and different arity having different result types.
I would like to return to the original issue that I raised, which is whether we should have
chain($input, (F, G))
that returns F(G($input)), orchain((F, G))
that returns a function that can be applied to$input
.
I can see that there are use cases for fn:chain
as it is currently defined.
My favorite would be to additionally have fn:compose
(returning a function).
Nothing prevents us...
Nothing apart from precedent, convention, and consistency. We do not currently define any function families containing multiple functions with the same name and different arity having different result types.
Different result types? Really?
isn't function(*)
an item()*
?
as for the different arity, we actually do have many functions that have the same name and different arity (not to mention fn:concat ... 😄 ) - this is the main reason why in a named function reference we must specify the arity - like fn:substring#2 and fn:substring#3 ... and virtually any function that has default value(s) for some of its argument(s)
I would like to return to the original issue that I raised, which is whether we should have
chain($input, (F, G))
that returns F(G($input)), orchain((F, G))
that returns a function that can be applied to$input
.I can see that there are use cases for
fn:chain
as it is currently defined. My favorite would be to additionally havefn:compose
(returning a function).
@ChristianGruen Yes, and if we remember well, this started as a proposal for a multi-compose function. But then people asked for left-to-right order of evaluation of the chain of functions and this is how we got fn:chain.
If we have functions with different order of (directions of) evaluation, my preference for their naming would be:
fn:chain
and fn:chain-from-right
and not compose
, because most people know that compose applies right-to left order of evaluation and having it evaluate from left-to-right would bring significant confusion.
Personally I am happy even with just having fn:chain
as currently defined, and using fn:chain(?, funcList)
whenever needed.
multicompose(funcList)(x)
is perfectly readable and understandable, because x
is at the right-most position and all the functions are applied on it starting from the right-most (closest to x
) in a right-to-left order.
Contrary to this, fn:chain(funcList)(x)
is deeply disturbing and unreadable, because in order for me to understand which function is applied on x
I have to traverse from x
to the leftmost function - and imagine if there are a dozen or more such functions in funcList ... . This is a heavy load for the brain and thus is easily confusing and error-prone.
So, let us not unnecessarily bloat the set of functions - at least we will not end up with a 5000 pages Spec.
I agree with @michaelhkay, chain feels at least non primitive to me compared to compose. As for the order of the composition, I think that depends on whether its an operator or not and how its named. Yes, I agree with @dnovatchev, calling it 'compose' (or using the mathematical notation ".") but then not following the mathematical definition of
(b->c) -> (a -> b) -> (a -> c)
(apologies for the functional signature but I think it makes it clearest).
is confusing to people familiar with the textbook definition.
BUT, most implementations I see don't call it compose or "." and use the (and I think) more intuitive
(a -> b) -> (b -> c) -> (a -> c)
(I actually had to write the 1st signature twice because I got it wrong)
So...
"(.>) Read as "compose forward" or "and then". Use this to create long chains of computation that suggest which direction things move in."
i.e. its seems intuitive that
(f .> g .> h .> i) x
is f and then g and then h and then i applied to x.
One argument for a pure composition function is that it would be more general than what fn:chain
currently does – and in principle, we should always try to provide general functionality in a spec.
One argument for a pure composition function is that it would be more general than what
fn:chain
currently does – and in principle, we should always try to provide general functionality in a spec.
As I said earlier, we can have both fn:chain
and fn:multi-compose
- and I initially proposed the latter, but the majority of people here wanted a chain of left-to-write evaluation, and this is what fn:chain
does.
Please, feel free to propose fn:multi-compose
as generalization of mathematical composition - I would be glad that people have seen what I initially wanted.
As for "more general" vs. "more convenient", there are both pros and cons for each - some people prefer not to have to write an additional pair of parentheses every time, other people want not to need to express a partial application.
As there is no obvious "best" alternative, any argumentation for one or the other would be inconclusive.
I am strongly in favour of adding a compose
- or combine
- function.
The real-world use-case I encountered in the past is: building up a sequence of filters, transforms and sorting based on user-input which is then applied as one function call to a data set (in a search function for example).
As a result of the above a function to combine functions should allow a sequence rather than being variadic. In a variadic call one needs to know all functions at compile time. Please correct me if I am wrong here. If it would be variadic then the first parameter should accept a sequence to allow the combination of a sequence of functions only known at runtime.
If the function is variadic then you can either do compose(f1, f2, f3)
or equivalently compose((f1, f2, f3))
- in the first case you need to know statically how many functions you are composing, but of course you don't need to know statically what functions they are. In the second case it's completely dynamic.
Thank you @michaelhkay for making that clear. Is a mixture also possible as in the example below?
compose($function-sequence, serialize(?, { "method": $method }))
Yes.
I thought I had a great opportunity to use
fn:chain
the other day, and then found it didn't do what I wanted.I wanted to negate a predicate: in pseudo-code
items-where($seq, not contains(?, "e"))
and I thought I could do this by chaining
contains
andnot
. But it doesn't work that way:fn:chain
applies a sequence of functions to an argument, it doesn't compose a sequence of functions to yield a new function.I wonder if a function that composes functions would be more useful, so I could write
items-where($seq, compose((contains(?, "e"), not#1)))