qt4cg / qtspecs

QT4 specifications
https://qt4cg.org/

Simple mapping operator for arrays #1158

Closed michaelhkay closed 4 months ago

michaelhkay commented 7 months ago

I propose to provide !! as a simple mapping operator for arrays.

For example [(1,2,3), (4,5,6)]!!count(.) returns [3, 3].

The expression on the LHS must be an array.

The expression on the RHS is evaluated once for every member of the array, with that member as the context value, with the context position set to the position of that member in the array, and with the context size set to the array size.

The result is returned as an array which will always be the same size as the input array.
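For comparison, the semantics described above can be approximated in XQuery 3.1 without a new operator. This is only a sketch: local:map-members is a hypothetical helper (it is not in any spec), and because an inline function has no context value, the member, its position and the array size are passed as explicit arguments instead.

```xquery
xquery version "3.1";

(: Hypothetical helper, not part of any specification.
   Calls $f once per member with (member, position, size) and
   returns an array of the same size as the input. :)
declare function local:map-members(
  $array as array(*),
  $f     as function(item()*, xs:integer, xs:integer) as item()*
) as array(*) {
  let $size := array:size($array)
  return array:join(
    for $pos in 1 to $size
    (: the square constructor keeps each result as ONE member :)
    return [ $f($array($pos), $pos, $size) ]
  )
};

local:map-members(
  [(1, 2, 3), (4, 5, 6)],
  function($member, $pos, $size) { count($member) }
)
(: expected result: [3, 3] :)
```

The square-bracket constructor is used rather than array { ... } so that a result that is itself a sequence stays a single member, which is what keeps the output the same size as the input.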

Note in passing that this provides a solution (though perhaps a clumsy solution) to issue #755, in that the example expression

(0 to 4) ~ count(.)

can now be written as [(0 to 4)]!!count(.)?*
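As a point of reference, roughly the same effect is already available in XQuery 3.1 with array:for-each and the ?* lookup. The sketch below is an assumed equivalent, not the proposed operator itself; the variable names are illustrative only.

```xquery
xquery version "3.1";

(: Wrap the sequence in an array so it is treated as one value. :)
let $wrapped := [ (0 to 4) ]
(: Apply count() to the single member; the result is the array [5]. :)
let $counts := array:for-each($wrapped, function($member) { count($member) })
(: ?* unwraps the array, returning its members as a sequence. :)
return $counts?*
(: expected result: 5 :)
```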

MarkNicholls commented 7 months ago

isnt this effectively my suggestion that was deemed irrelevant? i.e. if you embed the value in a non flattening container you can map.

[(0 to 4)]!!count(.)?* though i dont know what ?* means.

(I do find XPath is becoming more and more closely related to brainfuck, this is a genuine criticism, I can remember only a few weeks ago being accused of introducing suggestions for more and more esoteric operators, but I'm finding myself feeling the same way)

Its a shame that these array suggestions effectively mirror sequence constructs but have subtly different syntax.

I assume that the rule of thumb is.

if the sequence operator is X then the array operator is !X? (where X is a variable representing an operator).

michaelhkay commented 7 months ago

isnt this effectively my suggestion that was deemed irrelevant?

Sorry, I don't know which suggestion you are referring to. However, it's very much my personal style to reject new ideas when I first see them, and then come round to them later when I understand them better, so this is by no means impossible.

Its a shame that these array suggestions effectively mirror sequence constructs but have subtly different syntax.

Yes, indeed so. We pretty well forced ourselves down that path when we decided that an array was a sequence of length one. No turning back from that, unfortunately.

I assume that the rule of thumb is...

The rule of thumb is, try to choose an operator that (a) doesn't make the grammar ambiguous, and (b) has some kind of mnemonic value in terms of its relationship to other operators either in XPath or in other languages.

MarkNicholls commented 7 months ago

However, it's very much my personal style to reject new ideas when I first see them,

nice!

one of the problems I have with F# is that there are several standard data types in the language that all share the same patterns, e.g. map, bind etc, but in each data type's module these functions have different names. In a world of intellisense and a language where these functions are names (not operators) thats irritating, but not a massive problem. In a world of operators and no modules for intellisense to grab onto thats problematic, you just end up spending your day looking up operators in a spec.

It would be nice to at least try to adopt some sort of naming convention that saves 10 visits to the XPath functions and operators specs per day.

so if '!' is a sensible prefix to sequence operators to get the 'corresponding' array one, then the proposal makes sense

(in fact you get a whole raft of functions and operators 'for free'

[] => ![]
! => !!

you can extend the same idea to map using '?' as the prefix? (though you would have to have a think what the context item is I assume a map-entry)

you may want to also consider extending this suggestion to include a 'bind' operator for array (obviously a bind operator + unit can subsume map, and bind doesn't exist on sequence as it self/auto binds)

If you don't have a naming convention then I stand by my brainfuck comment.)

MarkNicholls commented 7 months ago

let me spell out the bind alternative/addition

lets call it....

!>>=

(horrid mashup of haskell and the above suggestion)

[(0 to 4)]!>>=[count(.)?*]

would be equivalent I think to

[(0 to 4)]!!count(.)?*

but bind is strictly more powerful than map.

(I still dont know what *? means)

ChristianGruen commented 7 months ago

This looks related to #700; maybe one of the issues is obsolete?

Note in passing that this provides a solution (though perhaps a clumsy solution) to issue #755, in that the example expression

(0 to 4) ~ count(.)

can now be written as [(0 to 4)]!!count(.)?*

To understand the analogy better, how can multiple operations be chained this way? For example, how would the following expression need to be written?

$data ~ subsequence(., 1, 5) ~ count(.)

I see similarities, but I wonder if it can serve as a full and intuitive replacement.

michaelhkay commented 7 months ago

Yes, sorry, it's a duplicate. I've now split filtering and mapping into separate issues though, which is probably clearer.

michaelhkay commented 7 months ago

let me spell out the bind alternative/addition

Please do spell it out. I have no idea from your post what !>>= is, other than perhaps a different symbol for my !! operator. (You mention Haskell, but for readers unfamiliar with Haskell, you need to spell out what part of Haskell you are deriving this from.)

MarkNicholls commented 7 months ago

theres always a danger of teaching granny to suck eggs, and as Wadler was part of XQuery (I think), I wonder how much of the stuff I drone on about is just tumbleweed (silence can be incomprehension or boredom of the obvious).

lets use XQuery as a lingua franca (my XQuery is embryonic)

but forget that sequences auto flatten (i.e. think of these as expressions over arrays), (its hard because the whole sequence thing in XPath etc is boiled in...but try to forget it, we have items and we have arrays)

lets introduce 'map' and 'bind'. map is the obvious one:

let $doc := doc(base-uri())
for $child in $doc/*
return count($child/*)

this in pseudo xpath is

map( $doc/*, function($child) {count($child/*)})

but map alone cannot represent this

let $doc := doc(base-uri())
for $child in $doc/*
for $grandchild in $child/*
return count($grandchild/*)

because there are two nested 'for' loops

this is

map( bind( $doc/*, function($child) {$child/*}), function($grandchild) {count($grandchild/*)})

(you can put the map in the bind, its equivalent)

i.e. you need both map and bind to do nested for loops

BUT you can express map in terms of bind and unit where 'unit' is the function that takes a single value and puts in an array....i.e.

(u don't need to read this deeply)

let $map :=
   function($array,$mapper) {
     bind(
        $array,
        function($element) {
           unit($mapper($element))
        })
   }

thats hand written so it will be wrong somewhere....

In pseudo XQuery its just the following trick.

map is only required in the above translation if the 'return' term has a function applied to it, so i simply map this

let $doc := doc(base-uri())
for $child in $doc/*
for $grandchild in $child/*
return count($grandchild/*)

to this

let $doc := doc(base-uri())
for $child in $doc/*
for $grandchild in $child/*
for $count in (count($grandchild/*))
return $count

i.e. i wrap the returned expression in a () (this is usually called 'unit') and we now have an atomic return expression (so we no longer need map)

we cant do the trick the other way around (we need a function to logically 'eliminate' the collection type).

ok, thats lots of (slightly hand waving flawed lingua franca) theory....

so...the summary is

P.S.

the operator for bind in haskell is >>= thus my ugly portmanteau of an '!' convention with the most well known bind operator I could think of.

ChristianGruen commented 7 months ago

theres always a danger of teaching granny to suck eggs, and as Wadler was part of XQuery (I think), I wonder how much of the stuff I drone on about is just tumbleweed (silence can be incomprehension or boredom of the obvious).

Definitely entertaining, either way…

this in pseudo xpath is map( $doc/*, function($child) {count($child/*)})

I guess you are looking for fn:for-each?

We also have array:for-each. I assume that the following expressions would be equivalent:

$array !! EXPR
array:for-each($array, fn { EXPR })

MarkNicholls commented 7 months ago

We also have array:for-each. I assume that the following expressions would be equivalent:

$array !! EXPR
array:for-each($array, fn { EXPR })

achchchch....how to say in 5 lines what took me 50

I assume you don't have bind? I suggest it isnt mad to have bind too.

(it's unfortunate you used 'for-each', as nesting for-each functions doesnt do the same as nesting for-each statements....I probably would have done the same, and regretted it)

ChristianGruen commented 7 months ago

(it's unfortunate you used 'for-each', as nesting for-each functions doesnt do the same as nesting for-each statements....I probably would have done the same, and regretted it)

I can’t tell who chose it, but it’s popular in various other languages, and I assume yet another map in the language would have been confusing, in particular if the map:for-each function would have been called map:map

I assume you don't have bind? I suggest it isnt mad to have bind too.

I’m not sure if you are missing an operator or a function:

  1. If it’s an operator, and if it’s supposed to be similar to the ! map operator, I wonder whether the pending value map operator would be a candidate (#755)?
  2. If it’s a function, how would the function signature need to look like? This is the signature for fn:for-each; I suppose it would look similar?…
fn:for-each(
  $input   as item()*,  
  $action  as function(item(), xs:integer) as item()*   
) as item()*

ChristianGruen commented 7 months ago

I'm concerned about introducing more and more operators in the absence of conventions that make them easy to remember

…a fair observation, which I share: There’s nothing about !! that reminds of arrays. When you know / and //, you’d rather expect that !! has recursive or nested semantics. I would prefer a different syntax… But I have nothing useful to offer in return.

MarkNicholls commented 7 months ago

I'm concerned about introducing more and more operators in the absence of conventions that make them easy to remember

…a fair observation, which I share: There’s nothing about !! that reminds of arrays. When you know / and //, you’d rather expect that !! has recursive or nested semantics. I would prefer a different syntax… But I have nothing useful to offer in return.

I assumed that Mr Kay was just prefixing all array operators with !

MarkNicholls commented 7 months ago

I’m not sure if you are missing an operator or a function:

  1. If it’s an operator, and if it’s supposed to be similar to the ! map operator, I wonder whether the pending value map operator would be a candidate (#755)?
  2. If it’s a function, how would the function signature need to look like? This is the signature for fn:for-each; I suppose it would look similar?…
fn:for-each(
  $input   as item()*,    
  $action  as function(item(), xs:integer) as item()* 
) as item()*

I assume there is a typo in action's type

like map, its either or both an operator or function.

well in sequences bind doesnt really make sense, because sequences cant be nested so map and bind are functionally equivalent,,,but if they could then,,,,

so map (for-each) does this

 fn:map(
    $input   as item()*,    
    $action  as function(item(), xs:item()) as item()*  
 ) as item

and bind does this

 fn:bind(
    $input   as item()*,    
    $action  as function(item(), xs:item()*) as item()* 
 ) as item

and

unit does this (takes a value and embeds it in a sequence of 1 element).

 fn:unit(
    $input   as item()
 ) as item()*

do we have a common language we all dabble in a bit? expressing something that isnt in XPath/XSLT/XQuery in XPath/XSLT/XQuery is difficult at times.

Haskell? no F#? no C#? Java? Python?

I know a few others but not well enough.

for-each IS common, but not as a 'map' function, its usually used for effectively side effects, i.e. actions with no return, in C# map is 'Select' and bind is 'SelectMany', in scala its 'map' and 'flatMap', in haskell its 'map' and 'bind' etc etc

I can write the definitions in XQuery probably

ChristianGruen commented 7 months ago

I assume there is a typo in action's type

I’m not sure which one you mean. If there is one, it would need to be fixed in the spec as well (https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-for-each).

for-each IS common, but not as a 'map' function, its usually used for effectively side effects, i.e. actions with no return, in C# map is 'Select' and bind is 'SelectMany', in scala its 'map' and 'flatMap', in haskell its 'map' and 'bind' etc etc

Thanks. So you’d like to have an equivalent solution to Scala’s flatMap function in XQuery, i.e. a function/operator that maps data and flattens the result?

do we have a common language we all dabble in a bit? expressing something that isnt in XPath/XSLT/XQuery in XPath/XSLT/XQuery is difficult at times.

Java may be an alternative, other languages as well, but the main challenge (at least for me) is to understand how you would expect XPath to change.

MarkNicholls commented 7 months ago

sorry my post is wrong let me fix it

MarkNicholls commented 7 months ago

sorry my post is wrong, it was the xs:integer in the foreach signature that threw me, consider this

let $doc := doc(base-uri())
for $child in $doc/*
for $grandchild in $child/*
return count($grandchild/*)

and pretend this query is over an array (i.e. no auto flattening)

then you'd define bind and map like this (this is the equivalent expression to above).

declare function local:map(
    $seq as item()*,
    $mapper as function(item()) as item()) as item()* {
    for $item in $seq
    return $mapper($item)
};

declare function local:bind(
    $seq as item()*,
    $mapper as function(item()) as item()*) as item()* {
    for $item in $seq
    for $childItem in $mapper($item)
    return $childItem
};

let $doc := doc(base-uri())
return 
    local:bind(
        $doc/*,
        function ($child) {
            local:map(
                $child/*,
                function ($grandchild) {
                    count($grandchild/*)
                })
        })

the map definition is what I think Michael Kay is talking about (note no xs:integer), i.e. the scala map. the bind definition is what I'm talking about, i.e. the scala flatMap.

I'd love to be good enough at XQuery to write this definition over array, but I'm not, and I think actually it would be quite ugly (because it doesnt exist).

note this is supposed to demonstrate the behaviour of functions that would be in XPath....I'm using XQuery simply as a language to communicate behaviour.
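One possible way to write the same two definitions over arrays in XQuery 3.1 is sketched below. The names local:array-map and local:array-bind are hypothetical (neither exists in the specs); the sketch assumes that "bind" means: the mapper returns an array for each member, and the resulting arrays are concatenated one level deep.

```xquery
xquery version "3.1";

(: Hypothetical names; neither function is part of any specification. :)

(: map over an array: exactly one result member per input member :)
declare function local:array-map(
  $array  as array(*),
  $mapper as function(item()*) as item()*
) as array(*) {
  array:for-each($array, $mapper)
};

(: bind over an array: $mapper returns an array for each member;
   ?* extracts those arrays and array:join concatenates them :)
declare function local:array-bind(
  $array  as array(*),
  $mapper as function(item()*) as array(*)
) as array(*) {
  array:join(array:for-each($array, $mapper)?*)
};

local:array-bind([1, 2, 3], function($n) { [$n, $n * 10] })
(: expected result: [1, 10, 2, 20, 3, 30] :)
```

With local:array-map the same mapper would instead give the nested array [[1, 10], [2, 20], [3, 30]], which is the difference between the two operations.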

michaelhkay commented 7 months ago

I can’t tell who chose it

As you can probably imagine, we had quite a debate before calling the function for-each() rather than map(). We didn't feel we could introduce the term map for two unrelated concepts at the same time, so we either had to find another name for maps as data objects (dictionaries? associative arrays? objects? hashes? - nothing appealed), or we had to find another name for the function, and since the semantics are very similar to the well-established xsl:for-each instruction, that's what won the day.

We reckoned that far more of our users would know XSLT than would know Haskell or other functional programming languages...

michaelhkay commented 7 months ago

do we have a common language we all dabble in a bit?

Simple answer, no. One of the joys of this business is that we are all steeped in different traditions. Getting people who live and breathe SQL to talk to people who live and breathe Haskell or Javascript is a major undertaking, but a very rewarding one for everyone.

MarkNicholls commented 7 months ago

Call me old fashioned, but I tend to believe speaking a common language is a prerequisite to effective communication, reality is a shared experience, language is just syntax.

MarkNicholls commented 7 months ago

We reckoned that far more of our users would know XSLT than would know Haskell or other functional programming languages...

yes but the issue cascades, a for-each statement isnt just a map, its map + bind, so if you call map, 'for-each' then what do you call 'bind'?

i think thats why these other languages have used for-each for the only case where map and bind coincide i.e. when they return unit.

michaelhkay commented 7 months ago

Since our sequences are a bit different from the lists used by most functional languages, our operators will inevitably be a little different. And one of the challenges in computing has always been that if your X is a little bit different from someone else's X, should you call it X or should you find a different name? There is no right answer.

MarkNicholls commented 7 months ago

Its a common trap, F# is largely borrowed from Haskell, but even there they have made an inconsistent mess of 'bind', sometimes its bind, sometimes its collect, sometimes it doesnt exist...I think that is a wrong answer (there are plenty of those too), if a language (haskell) has gone to the bother of being incredibly formal about its constructs then not following the same structure (if not maybe the same names), would seem foolhardy.

whatever, this is a tangent.

P.S.

what does *? mean?, genuinely, is there some nuance here I haven't comprehended (I'm not trying to make a point about incomprehensible operators, though it is a good example).

ChristianGruen commented 7 months ago

what does *? mean?, genuinely, is there some nuance here I haven't comprehended (I'm not trying to make a point about incomprehensible operators, though it is a good example).

You find some good examples in the spec: https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-unary-lookup

michaelhkay commented 7 months ago

Please explain what Haskell "bind" does. I've found plenty of web sites that claim to explain it in simple terms, but the simple terms require a prior understanding of a whole host of abstract concepts (starting with monads) and a whole raft of algebraic notation. You can't assume that people reading here are up to speed on your favourite language.

I think your question about *? is actually referring to ?*, and I think we should be able to assume that anyone here has a reasonable working knowledge of XPath 3.1.

I tend to believe speaking a common language is a prerequisite

Indeed so, and despite the fact that we come from many different backgrounds, the one thing we should be able to assume is that everyone here understands the current specs.

MarkNicholls commented 7 months ago

what does *? mean?, genuinely, is there some nuance here I haven't comprehended (I'm not trying to make a point about incomprehensible operators, though it is a good example).

You find some good examples in the spec: https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-unary-lookup

ah thankyou, so I assume

[(0 to 4)]!!count(.)?* gives [5]?* which gives 5 ?

MarkNicholls commented 7 months ago

Please explain what Haskell "bind" does. I've found plenty of web sites that claim to explain it in simple terms, but the simple terms require a prior understanding of a whole host of abstract concepts (starting with monads) and a whole raft of algebraic notation. You can't assume that people reading here are up to speed on your favourite language.

I think your question about *? is actually referring to ?*, and I think we should be able to assume that anyone here has a reasonable working knowledge of XPath 3.1.

I tend to believe speaking a common language is a prerequisite

Indeed so, and despite the fact that we come from many different backgrounds, the one thing we should be able to assume is that everyone here understands the current specs.

that's quite a narrow potential population of contributors. I can try to explain bind if you're interested, or go?

ChristianGruen commented 7 months ago

that's quite a narrow potential population of contributors. I can try to explain bind if you're interested, or go?

For general questions on XPath, https://xmlcom.slack.com or StackOverflow may be better platforms indeed; others can jump in there if needed, and we’ll be able to keep the issue threads focused. I hope that sounds reasonable.

With regard to bind, I would be happy to hear your summary (it’s been a while ago when I spent some time with Haskell).

MarkNicholls commented 7 months ago

It wasn't a general question, I was asking about ?* in the context of this issue, it seemed a reasonable question, I'm happy to try to explain bind (though it also seems to me to be a general question that could be asked elsewhere also).

Genuinely if my ramblings are not worth the reading, or I don't 'qualify' to contribute, let me know and I'll take the dog a walk instead, I DO use XPath and XSLT every day, I'm far from an expert in it, or 'familiar' with the specs, i actually think my main value in these discussions is i'm consciously not an expert, I know why some of this is difficult and unintuitive, where I think those that know the spec intimately have forgotten, you are all mostly unconsciously competent, whilst I am consciously incompetent. I am consciously competent in other languages, which I think helps, but if its just noise, then I'll shut up.

I'm happy you are 'happy', but you don't need to indulge me if it has no value.

I'll write an explanation but dont feel obliged to read it, its technically not directly relevant to the issue.

lets take an informal explanation using the word 'maybe/may/might'.

Am I going to post something on this site in the next week that has value? (possibly controversial)

There may be an issue that interests me. If there is, then there may be something I know that is of value and relevance. If there is, then I might decide I have time to post it. ...I might be able to communicate that thing effectively. ...It may be read. ...It may be understood. ...and it has value.

So this is an informal constructive proof that I may post something of value.

so notice each step is of the same form: I extract something from the previous expression, derive a 'maybe' relationship and continue until I get to the end, then I terminate the sentence by returning the noun 'value'.

This is effectively a monad. There are two rules here: a) unit...we'll skip...in this context its the pronoun 'some'...some post, some value, some understanding. b) bind....given a function from a value of type A to a monad over values of type B, then I can give you a function from a monad over A to a monad over B....ouch.

e.g. (and now I need a common language for types...I'll use haskell as you've explicitly mentioned it)

e.g. the functions

read :: Post -> Maybe ReadPost
understand :: ReadPost -> Maybe UnderstoodPost

and bind :: (a -> Monad b) -> Monad a -> Monad b ouch again...

but now given a Post I can create a function to tell me if it was understood

wasItUnderstood :: Post -> Maybe UnderstoodPost
wasItUnderstood post =
   bind understand (read post)

if we were to model my example we would get

example :: Issue -> Maybe UnderstoodPost
example issue =
   bind understand (bind read (bind communicateEffectively (bind haveTime (bind knowSomething (interestsMe issue)))))

so bind allows you to chain together these functions to create new functions that you can then bind

so in haskell Philip Wadler and Stephen Blott invented do notation (just syntax sugar) and the above can be written

example :: Issue -> Maybe UnderstoodPost
example issue =
   do
      ofInterest <- interestsMe issue
      know <- knowSomething ofInterest
      time <- haveTime know
      canCommunicate <- communicateEffectively time
      isRead <- read canCommunicate
      understood <- understand isRead
      return understood

(apologies to anyone more intimate with Haskell syntax for my handwritten code)

notice this looks a lot like

      for ofInterest in interestsMe(issue)
      for know in knowSomething(ofInterest)
      for time in haveTime(know)
      for canCommunicate in communicateEffectively(time)
      for isRead in read(canCommunicate)
      for understood in understand(isRead)
      return understood

obviously XQuery doesnt understand Maybe, but it could, and sequence and array and map are all 'monadic' and can support bind.

note you cannot do this with just map, if you used map to chain my example you would get a

Maybe Maybe Maybe Maybe Maybe Maybe Understood

but you can use map + concat (i.e. array join) but bind is more primitive.
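The nesting problem can be illustrated in XQuery 3.1 by modelling "Maybe" as an array with zero or one member ([] for nothing, [x] for some x). This encoding is an assumption made purely for illustration, and local:bind and local:half are hypothetical names.

```xquery
xquery version "3.1";

(: Assumed encoding of Maybe: [] means "nothing", [x] means "some x". :)

(: bind = map followed by a one-level join :)
declare function local:bind(
  $maybe as array(*),
  $f     as function(item()*) as array(*)
) as array(*) {
  array:join(array:for-each($maybe, $f)?*)
};

(: A step that can fail: halve an integer only if it is even. :)
declare function local:half($n as xs:integer) as array(*) {
  if ($n mod 2 = 0) then [ $n idiv 2 ] else []
};

(
  (: map alone nests the wrapper: [[6]] :)
  array:for-each([12], local:half#1),
  (: bind keeps a single wrapper while chaining: 12, 6, 3 gives [3] :)
  local:bind(local:bind([12], local:half#1), local:half#1),
  (: a failing step (3 is odd) makes the whole chain empty: [] :)
  local:bind(local:bind(local:bind([12], local:half#1), local:half#1), local:half#1)
)
```

The first result shows why map on its own yields "Maybe Maybe ...": each step adds another layer of wrapping, whereas bind removes one layer per step.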

I'll get my coat.

ChristianGruen commented 7 months ago

@MarkNicholls The discussion became quite emotional. I’m sorry if you felt offended or misunderstood. And…

I'm happy you are 'happy', but you don't need to indulge me if it has no value.

Same here. I’m not a native speaker; there must be better ways to express that I appreciate your time. And…

I'll write an explanation but dont feel obliged to read it, its technically not directly relevant to the issue.

Thanks for your summary, which I’ll certainly read. In the meantime, I was lazy and asked ChatGPT to give me a non-technical dummy summary, which I’ll add for the sake of completeness:


Imagine you have a box, and inside this box could be either a toy or it could be empty. Now, you want to do something with the toy, like paint it, but you can only proceed if there is actually a toy in the box.

In this scenario, bind (in Haskell) is like a special tool that helps you:

  1. Check the Box: It first opens the box to see if there is a toy inside.
  2. Use the Toy (if it exists): If there is a toy, it takes the toy out, paints it (or does whatever you want with it), and then puts it back into another box.
  3. Handle an Empty Box: If the box is empty, it does nothing and just tells you that there's nothing to do.

This tool is useful because it lets you chain these actions together. For example, if you need to first wash the toy and then paint it, bind helps ensure that you only proceed with washing and painting if you initially found a toy in the box.

This way, you don't have to keep opening the box yourself to check for the toy at every step; bind handles that for you, making the process smoother and error-free.

MarkNicholls commented 7 months ago

@ChristianGruen

My comment to you wasn't intended to sound churlish, apologies, as a native speaker it does.

Like a lot of what chatgpt says...I'm not sure its explanation helps me. There is a danger that you immediately think a monad is a collection, when IO, for example, is monadic, but I cant see how to apply its analogy to IO. It doesn't work for me, there is no empty box. But if I say a function that takes a value and returns a value AND does some IO has the signature

printMessageAndGetResult :: String -> IO String

then you can clearly see it matches the same syntactic pattern as things like

singleton :: a -> list a
tolazy :: a -> Lazy a
some :: a -> Maybe a
setState :: a -> State a ()
([]) :: a -> array a

bind is simply the mechanism used to chain constructs like these together for a fixed polymorphic type that obeys the rules (there are rules too...but we can ignore them).

P.S.

(this is meant to be light hearted)

chatgpt also says this

In XPath 3.1, the expression ?* does not have a specific meaning

but chatgpt is a pretty low bar.

ndw commented 5 months ago

Marked PRG-required not because it's strictly speaking required, but because its absence is so glaring.

michaelhkay commented 4 months ago

The CG decided not to proceed with this. Reasons include:

(a) concern about the proliferation of operators and the difficulty of remembering them and reading the resulting code

(b) the fact that there are other ways of doing this, e.g. array:for-each and for member $m in $array.

ndw commented 4 months ago

See also additional discussion at meeting 085, 9 July 2024.