qt4cg / qtspecs

QT4 specifications
https://qt4cg.org/
Other
28 stars 15 forks source link

The trouble with XPath‘s fn:fold-right. A fix and Proposal for fn:fold-lazy #670

Open dnovatchev opened 1 year ago

dnovatchev commented 1 year ago

The trouble with XPath‘s fn:fold-right.
Laziness in XPath.

This article discusses the standard XPath 3.1 function fn:fold-right, its definition in the official Spec, its lack of apparent use-cases and its utter failure to reproduce the (lazy) behavior of Haskell’s foldr , which is presumed to be the motivation behind fn:fold-right.
The 2nd part of the article introduces the implementation of short-circuiting and generators, which together unprecedentedly provide laziness in XPath. Based on these, a new XPath function: fn:fold-lazy is implemented, that utilizes laziness, similar to Haskell’s foldr. This behavior is demonstrated in specific examples

Introduction

Higher order functions were introduced into XPath starting with version 3.0 in 2014 and later in version 3.1 in 2017.
The definition of the standard function fn:fold-right closely mimics that of Haskell’s foldr, and anyone acquainted with foldr can be left with the impression that fn:fold-right would have identical behavior (and hence use-cases) as Haskell’s foldr.

Unfortunately, there is a critical difference between the definitions of these two functions. Whereas the definition of foldr explicitly defines its behavior when provided with a function, lazy in its 1st argument – from Haskell’s definition of foldr:

“… Note that since the head of the resulting expression is produced by an application of the operator to the first element of the list, given an operator lazy in its right argument, foldr can produce a terminating expression from an unbounded list.”

The XPath definition of fn:fold-right does not mention any laziness.

There is no official concept of “laziness” in XPath, thus fn:fold-right doesn’t cover some of the most important use-cases of Haskell’s foldr , which can successfully produce a result when passed an infinite (or having unlimited length) list.

This in fact makes fn:fold-right almost useless, and explains why even some of the members of the XPath 3.1 WG have stated on occasions that they do not see why the function was introduced.

fn:fold-right gone wrong – example

This Haskell code:

foldr (\x y -> (if x == 0 then 0 else x*y)) 1 (map (\x -> x - 15) [1 ..1000000])

foldr (\x y -> (if x == 0 then 0 else x*y)) 1 (map (\x -> x - 15) [1 ..10000000])

foldr (\x y -> (if x == 0 then 0 else x*y)) 1 (map (\x -> x - 15) [1 ..])

produces the product of all numbers in the following list, respectively:

[-14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, …, 999985]

[-14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, …, 9999985]

[-14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, …, ] -- up to infinity.

Because all these 3 lists contain a zero as their 15th item, the expected result is 0 when evaluating any of these 3 expressions – even in the last case where the provided as argument list is infinite. And this is indeed what happens:

image

Not only Haskell produces the correct result in all cases, but regardless of the list’s length, the result is produced instantaneously!

Now, let us evaluate this equivalent XPath expression with BaseX:

let $product := function($input as array(xs:integer)) as xs:integer
                         { 
                           array:fold-right($input, 1, function($x as xs:integer, $y as xs:integer) as  xs:integer 
                                                               {if($x eq 0) then 0 else $x * $y}) 
                         },
    $ar := array { (1 to 36) ! function($x as xs:integer) as xs:integer {$x -15}(.)}
  return
     $product($ar)

Here we are passing a list containing just 36 integers. The result is quite unexpected and spectacular:

image

Here is what happens:

  1. Even though when processing the 15th integer in the array the result is 0, the XPath processor continues to evaluate the RHS (right-hand side) until the last member of the array (36).

  2. On “its way back” the XPath processor multiplies: (36*35*34*33*32* …*6*5*4)*3, and the result of the right-most multiplication is bigger than the maximum integer (or decimal) that this XPath processor supports.

  3. C r r r a s s h … As seen in the screenshot above.

The root cause for this unfortunate behavior is that the XPath processor doesn’t support short-circuiting and laziness. And thus, fn:fold-right is useless even in the normal/trivial case of a collection (array) with only 36 members. Not to speak of collections containing millions of members, or even infinite ones…

Let us see what happens when evaluating similar expressions with another XPath processor: Saxon.

Saxon seems to produce the correct result, however it takes exponentially longer times when the length of the passed array is increased, leading to this one:

image

It took 261 seconds for the evaluation to be done, but accessing the 15th member of the array and short-circuiting to 0 should be almost instantaneous…
So what happens in this case? The difference between BaseX and Saxon is that Saxon implements a “Big Integer” and thus can multiply almost 1 000 000 integers without getting a value that cannot be handled… But doing almost 1M multiplications of big integers obviously takes time …

What is common in these two examples? Obviously, neither BaseX nor Saxon detects and performs short-circuiting. Why is this? What is the reason for this?

I asked a developer of BaseX if I could submit a bug about this behavior. His answer was shockingly unexpected: “This is not a bug, because no requirement in the Specification has been violated”.

Thus, the main cause of the common behavior of both XPath processors to handle the evaluation of these examples, is the specification of the function, which blatantly allows such crap to happen.

Now that we see this, let us try to provide the wanted, useful behavior writing our own function.

The fix: Step 1 – fn:fold-right in pure XPath

Before going in depth with our pure XPath solution, we need as a base a pure-XPath implementation of fn:fold-right .

 let $fold-right-inner := function ($seq as item()*,
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $self as function(*)
                                   ) as item()*
{
  if(empty($seq)) then $zero
    else
      $f(head($seq), $self(tail($seq), $zero, $f, $self))
},

    $fold-right := function ($seq as item()*,
                             $zero as item()*,
                             $f as function(item(), item()*) as item()* 
                            ) as item()*
{
  $fold-right-inner($seq, $zero, $f, $fold-right-inner)
},

   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {$x * $y}

   return
     $fold-right((1 to 6) ! function($x){$x - 3}(.), 1, $fMult)

When we evaluate the above with any of the two XPath processors, the correct result is produced:

0

And we certainly do have exactly the same problems as the provided built-in fn:fold-right with a similar example:

image

The fix: Step 2 – $fold-right-sc detecting and performing short-circuiting

Now that we have $fold-right as a base, let us add code to it so that it will detect and perform short-circuiting. We will implement a function similar to $fold-right but having this signature:

    $fold-right-sc := function ($seq as item()*,
                                $zero as item()*,
                                $f as function(item(), item()*) as item()*,
                                $fGetPartial as function(*)
                               ) as item()*

The last of the function’s parameters $fGetPartial returns a new function that is the partial application of $f, when its 1st argument is set to the current member of the input sequence $seq. The idea is that whenever short-circuiting is possible, $fGetPartial returns not a function having one argument (arity 1), but a constant – a function with 0 arguments (arity 0).

If the arity of the so produced partial application is 0, then our code will immediately return with the value $f($currentItem).

Here is the complete code of $fold-right-sc:

 let $fold-right-sc-inner := function ($seq as item()*,
                                       $zero as item()*,
                                       $f as function(item(), item()*) as item()*,
                                       $fGetPartial as function(*),
                                       $self as function(*)
                                      ) as item()*
{
  if(empty($seq)) then $zero
    else
      if(function-arity($fGetPartial(head($seq), $zero)) eq 0)
        then $fGetPartial(head($seq), $zero) ()
        else $f(head($seq), $self(tail($seq), $zero, $f, $fGetPartial, $self))
},

    $fold-right-sc := function ($seq as item()*,
                                $zero as item()*,
                                $f as function(item(), item()*) as item()*,
                                $fGetPartial as function(*)
                               ) as item()*
{
  $fold-right-sc-inner($seq, $zero, $f, $fGetPartial, $fold-right-sc-inner)
},

   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {if($x eq 0) then 0 else $x * $y},
   $fMultGetPartial := function($x, $y)
   {
     if($x eq 0)
       then function() {0}
       else function($z) {$x * $z}
   }

   return
     $fold-right-sc((1 to 1000000) ! function($x){$x - 3}(.), 1, $fMult, $fMultGetPartial)

Do note:

  1. If the current item (the head of the sequence) is 0, then $fMultGetPartial returns a function with 0 arguments (constant) that produces 0.

  2. $fold-right-sc (inner) treats differently a partial application of arity 0 from a partial application with arity 1. In the former case it simply produces the expected constant value without recursing further. Here is the relevant code fragment

  if(empty($seq)) then $zero
    else
      if(function-arity($fGetPartial(head($seq), $zero)) eq 0)
        then $fGetPartial(head($seq), $zero) ()
        else $f(head($seq), $self(tail($seq), $zero, $f, $fGetPartial, $self))

And now BaseX has no problems with the evaluation, even though the input sequence is of size 1M. The complete evaluation takes just a fraction of a millisecond (0.04 ms):

image

With Saxon things are not so good. Even though Saxon produces the correct result, evaluating the expression with an input sequence of size 1M takes 0.5 seconds (half a second), and evaluating the expression with an input sequence of 10M takes 5 seconds (10 times as long):

image

What is happening?

Even though Saxon performs much faster than the previous 261 seconds, due to detecting the short-circuiting possibility and performing the short-circuit, Saxon still processes all 10M items when evaluating this subexpression (which obviously the more optimized BaseX doesn’t do in advance):

(1 to 10000000) ! function($x){$x - 3}(.)

Therefore, we have one remaining problem: How to prevent long sequences (or arrays) from being fully materialized before starting the evaluation of $fold-right-sc ?

The fix: Step 3 – replacing collections with generators

Generators are well known and provided out of the box in many programming languages. Per Wikipedia:

“In computer science, a generator is a routine that can be used to control the iteration behaviour of a loop. All generators are also iterators.[1] A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.”

A full-fledged generator (such as implemented in C#) is an instance of a Finite State Machine(FSM), and implementing it in full generality goes beyond the topic and goals of this article. Expect another article soon that will provide this.

Here we will implement a simple kind of generator, that when passed an integer index $N, produces the $Nth item of a specific sequence. Although this is probably the simplest form of a generator, it can be useful in many cases and is a good illustrative solution to our current problem. The whole approach of replacing “something” with a function that must be called to produce this “something” is known as “lifting”

First, we will add to our $fold-right just the use of generators, without the detection and performing of short-circuiting:

let $fold-right-lifted-inner := function ($seqGen as function(xs:integer) as array(*),
                                    $index as xs:integer,
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $self as function(*)
                                   ) as item()*
                                {
                                  let $nextSeqResult := $seqGen($index),
                                      $isEndOfSeq :=  $nextSeqResult(1),
                                      $seqItem := $nextSeqResult(2)
                                    return
                                      if($isEndOfSeq) then $zero
                                        else
                                          $f($seqItem, $self($seqGen, $index+1, $zero, $f, $self))
                                },

    $fold-right-lifted := function ($seqGen as function(xs:integer) as array(*),
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* 
                                  ) as item()*
                                  {
                                    $fold-right-lifted-inner($seqGen, 1, $zero, $f, $fold-right-lifted-inner)
                                  },

   $NaN := xs:double('NaN'),

   $fSeq1ToN := function($ind as xs:integer, $indStart as xs:integer, $indEnd as xs:integer) as array(*)
                {
                  if($ind lt  $indStart or $ind gt $indEnd)
                    then  array{true(), $NaN}
                    else array{false(), $ind}
                },
   $fSeq-1-6 := $fSeq1ToN(?, 1, 6),

   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {$x * $y}

   return
     $fold-right-lifted($fSeq-1-6, 1, $fMult) 

Here we see an example of a simple generator – the function $fSeq1ToN.

This function returns an array with two members: a Boolean, which if true() indicates the end of the sequence, and the 2nd member is the current head of the simulated sequence.
The generator has two other parameters which are the values (inclusive) for the start-index and the end-index. Whenever the passed value of $ind is outside of this specified range, $fSeq1ToN returns a result array with its first member set to true() (the 2nd member of the result must be ignored in this case), which indicates end-of sequence.
Otherwise it returns array{false(), $ind} . It is the responsibility of the caller to stop calling the generator:

   $fSeq1ToN := function($ind as xs:integer, $indStart as xs:integer, $indEnd as xs:integer) as array(*)
                {
                  if($ind lt  $indStart or $ind gt $indEnd)
                    then  array{true(), $NaN}
                    else array{false(), $ind}
                }

Evaluating the complete XPath expression above produces the correct result both in BaseX and in Saxon: the product of the integers 1 to 6:

image

Now that we have successfully implemented the last missing piece of our complete solution, let us put everything together:

The fix: Step 4 – putting it all together

Finally we can replace the input sequence in \$fold-right-sc with a generator:

let $fold-right-sc-lifted-inner := function ($seqGen as function(xs:integer) as array(*),
                                    $index as xs:integer,
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $fGetPartial as function(*),
                                    $self as function(*)
                                   ) as item()*
                                {
                                  let $nextSeqResult := $seqGen($index),
                                      $isEndOfSeq :=  $nextSeqResult(1),
                                      $seqItem := $nextSeqResult(2)
                                    return
                                      if($isEndOfSeq) then $zero
                                        else
                                          if(function-arity($fGetPartial($seqItem, $zero)) eq 0)
                                            then $fGetPartial($seqItem, $zero) ()
                                            else $f($seqItem, $self($seqGen, $index+1, $zero, $f, $fGetPartial, $self))
                                },

    $fold-right-sc-lifted := function ($seqGen as function(xs:integer) as array(*),
                                       $zero as item()*,
                                       $f as function(item(), item()*) as item()*,
                                       $fGetPartial as function(*) 
                                      ) as item()*
                                      {
                                         $fold-right-sc-lifted-inner($seqGen, 1, $zero, $f, $fGetPartial, $fold-right-sc-lifted-inner)
                                      },

   $NaN := xs:double('NaN'),

   $fSeq1ToN := function($ind as xs:integer, $indStart as xs:integer, $indEnd as xs:integer) as array(*)
                {
                  if($ind lt  $indStart or $ind gt $indEnd)
                    then  array{true(), $NaN}
                    else array{false(), $ind}
                },
   $fSeq-1-6 := $fSeq1ToN(?, 1, 6),
   $fSeq-1-1M := $fSeq1ToN(?, 1, 1000000),
   $fSeq-1-1M-minus-3 := function($n as xs:integer)
   {
     array{$fSeq-1-1M($n)(1), $fSeq-1-1M($n)(2) -3}
   },

   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {$x * $y},
   $fMultGetPartial := function($x, $y)
   {
     if($x eq 0)
       then function() {0}
       else function($z) {$x * $z}
   }

   return
     $fold-right-sc-lifted($fSeq-1-1M-minus-3, 1, $fMult, $fMultGetPartial) 

Now this expression (and even one involving a sequence of 10M items take 0 seconds to be evaluated in both BaseX and Saxon, producing the correct result 0:

image


Summary

This article demonstrated the problems inherent to the standard XPath fn:fold-right and correctly determined the root causes for these problems: no short-circuiting and no collection generators.

Then a step-by-step solution was built that shows how to implement lazy evaluation in XPath based on short-circuiting and collection generators. This fixed the error raised by BaseX and dramatically reduced the evaluation time of Saxon from 261 seconds to 0 seconds.

The new function produced can be called $fold-lazy and is a good candidate for inclusion in the XPath 4.0 standard functions.

A complete design and implementation of a general collection-generator will be published in a separate article.

dnovatchev commented 1 year ago

It depends on the amount of memory one has and on the configuration of the Processor. We need to figure out how they avoid this in Haskell.

Haskell is infamous for having space leaks

Not for people who know what they are doing.

Let us not forget that such great data structures as the Finger Tree, which actually helps achieve universal O(log(N)) power on all common operations with (very long) lists were first defined and implemented in Haskell.

ChristianGruen commented 1 year ago

Actually, the stack overflow only happens in BaseX, and I just checked with really longer sequence that this still works in Saxon.

Good to know. It seems I need to switch to a newer version (Saxon-EE 10 and Saxon-HE 12.3 gives me StackOverflowError when running the query from https://github.com/qt4cg/qtspecs/issues/670#issuecomment-1704424043 unchanged, with the 749 limit, and with higher limits. @michaelhkay Has this been fixed in a more recent version?)

Edit: Maybe a greater stack size is used when running Saxon from within Oxygen. When I use -Xss100m, I can specify larger values than 749 (50000 and more) with both processors, but (@dnovatchev) the code gets pretty slow. I think we cannot expect a constant runtime, even though the short-circuiting takes place before “any work is done at all”?

dnovatchev commented 1 year ago

Actually, the stack overflow only happens in BaseX, and I just checked with really longer sequence that this still works in Saxon.

Good to know. It seems I need to switch to a newer version (Saxon-EE 10 and Saxon-HE 12.3 gives me StackOverflowError when running the query from #670 (comment) unchanged, with the 749 limit, and with higher limits. @michaelhkay Has this been fixed in a more recent version?)

Edit: Maybe a greater stack size is used when running Saxon from within Oxygen. When I use -Xss100m, I can specify larger values than 749 (50000 and more) with both processors, but (@dnovatchev) the code gets pretty slow. I think we cannot expect a constant runtime, even though the short-circuiting takes place before “any work is done at all”?

@ChristianGruen I would need to know how to configure BaseX, and Saxon in Oxygen, then I can play with different input sizes and see if there is any slow-down ( I doubt there is any reason within fold-lazy for that) and come up with further improvements.

I am using Oxygen 25.1, build 2023070306, and it says it is using Saxon 11.4.

image

dnovatchev commented 1 year ago

For all people who wanted to see an application of infinite collection generators, here is one of the functions that I will be using in producing the first N prime numbers:

take N xs in Haskell notation: given a number N and a list (that may be infinite) xs, this function produces the first N members of this list.

Here is the complete code (not using fold-lazy yet, as I have to find how exactly to define take using foldr):

let $INF := xs:double('INF'),
    $NaN := xs:double('NaN'),
    $fSeqM-to-N := function($ind as xs:integer, $indStart as xs:numeric, $indEnd as xs:numeric) as array(*)
                  {
                    if($ind lt  $indStart or $ind gt $indEnd)
                      then  array{true(), $NaN}
                      else array{false(), $ind}
                  },
     $fSeq-1-Infinity := $fSeqM-to-N(?, 1, xs:double('INF')),

     $fTakeInner := function($N as xs:integer, $index as xs:integer, $seqGen as function(xs:integer) as array(*), 
                             $self as function(*))
                    {
                        let $nextSeqResult := $seqGen($index),
                            $isEndOfSeq :=  $nextSeqResult(1),
                            $seqItem := $nextSeqResult(2)
                          return
                            if($isEndOfSeq or $N eq 0) then ()
                              else ($seqItem, $self($N -1, $index +1, $seqGen, $self))
                    },
     $fTake := function($N as xs:integer, $seqGen as function(xs:integer) as array(*))
               {
                 $fTakeInner($N, 1, $seqGen, $fTakeInner)
               }
  return
     $fTake(3, $fSeq-1-Infinity)

The function is passed N = 3 and an infinite collection generator that produces the infinite collection from 1 to ... (Infinity)

The wanted, correct result is produced:

1, 2, 3

image

ChristianGruen commented 1 year ago

For all people who wanted to see an application of infinite collection generators, here is one of the functions that I will be using in producing the first N prime numbers:

The generator might need to be fixed. For example, I havent’t got a result for:

$fSeqM-to-N(?, 2, xs:double('INF'))

I think the following version is correct. I’ve chosen XQuery as it’s more accessible to me. Similar to the original version, it should be tail-call optimized for larger sequences:

declare namespace hof = 'hof';

declare function hof:sequence($start, $end) {
  function($i) {
    let $value := $start + $i - 1
    return if($i > 0 and $value <= $end) then $value else ()
  }
};

declare function hof:take-inner($generate, $i, $c) {
  if ($c < 1) then () else (
    for $next in $generate($i)
    return ($next, hof:take-inner($generate, $i + 1, $c - 1))
  ) 
};

declare function hof:take($sequence, $count) {
  hof:take-inner($sequence, 1, $count)
};

hof:take(hof:sequence(2, xs:double('INF')), 1000)

Equivalent code (XPath 4), for conundrum aficionados:

let $seq:=fn($s,$e){fn{($s+.-1)[.>=$s][.<=$e]}}
let $take:=fn($s,$c){fn($g,$i,$c,$s){$g($i)[$c>0]!(.,$s($g,1+$i,-1+$c,$s))}!.($s,1,$c,.)}
return $take($seq(2,1 div 0e0),1000)

I’ll probably stick to subsequence($sequence, 1, 1000) or $sequence[position() <= 1000] ;·) The upcoming prime number example might convince me to use take, let’s see.

dnovatchev commented 1 year ago

Thanks @ChristianGruen for catching this!

Here is the fixed version:

let $INF := xs:double('INF'),
    $NaN := xs:double('NaN'),
     $fSeqM-to-N := function($ind as xs:integer, $indStart as xs:numeric, $indEnd as xs:numeric) as array(*)
                  {
                    if($ind lt  1 or $ind gt $indEnd)
                      then  array{true(), $NaN}
                      else array{false(), $indStart + $ind -1}
                  },
     $fSeq-2-Infinity := $fSeqM-to-N(?, 2, xs:double('INF')),

     $fTakeInner := function($N as xs:integer, $index as xs:integer, $seqGen as function(xs:integer) as array(*), 
                             $self as function(*))
                    {
                        let $nextSeqResult := $seqGen($index),
                            $isEndOfSeq :=  $nextSeqResult(1),
                            $seqItem := $nextSeqResult(2)
                          return
                            if($isEndOfSeq or $N eq 0) then ()
                              else ($seqItem, $self($N -1, $index +1, $seqGen, $self))
                    },
     $fTake := function($N as xs:integer, $seqGen as function(xs:integer) as array(*))
               {
                 $fTakeInner($N, 1, $seqGen, $fTakeInner)
               }
  return
     $fTake(3, $fSeq-2-Infinity)  

Now the correct result is produced for $fTake(3, $fSeq-2-Infinity) :

2, 3, 4

image

ChristianGruen commented 1 year ago

@dnovatchev and you may need to check the upper limit (try $fTake(3, $fSeqM-to-N(?, 10, 11)))

dnovatchev commented 1 year ago

Thanks again, @ChristianGruen

Now I think this is (finally) correct:

let $INF := xs:double('INF'),
    $NaN := xs:double('NaN'),
     $fSeqM-to-N := function($ind as xs:integer, $indStart as xs:numeric, $indEnd as xs:numeric) as array(*)
                  {
                    if($ind lt  1 or $ind + $indStart -1 gt $indEnd)
                      then  array{true(), $NaN}
                      else array{false(), $indStart + $ind -1}
                  },
     $fSeq-2-Infinity := $fSeqM-to-N(?, 2, xs:double('INF')),
     $fSeq-10-11 := $fSeqM-to-N(?, 10, 11),

     $fTakeInner := function($N as xs:integer, $index as xs:integer, $seqGen as function(xs:integer) as array(*), 
                             $self as function(*))
                    {
                        let $nextSeqResult := $seqGen($index),
                            $isEndOfSeq :=  $nextSeqResult(1),
                            $seqItem := $nextSeqResult(2)
                          return
                            if($isEndOfSeq or $N eq 0) then ()
                              else ($seqItem, $self($N -1, $index +1, $seqGen, $self))
                    },
     $fTake := function($N as xs:integer, $seqGen as function(xs:integer) as array(*))
               {
                 $fTakeInner($N, 1, $seqGen, $fTakeInner)
               }
  return
     ( $fTake(3, $fSeq-2-Infinity), '&#xA;====================', $fTake(3, $fSeq-10-11))

produces correctly:

2
3
4

====================
10
11
dnovatchev commented 1 year ago

I’ll probably stick to subsequence($sequence, 1, 1000) or $sequence[position() <= 1000] ;·)

These cannot be used with infinite (produced incrementally by a generator) collections, this is why we need a version of take that gets a generator to provide its input. It does what is specified above, plus it correctly works in the case of an infinite collection.

ChristianGruen commented 1 year ago

These cannot be used with infinite (produced incrementally by a generator) collections, this is why we need a version of take that gets a generator to provide its input. It does what is specified above, plus it correctly works in the case of an infinite collection.

Sure. I just referred to the existing example, which does nothing else than return a limited subsequence of integers.

rhdunn commented 1 year ago

There are several cases here:

  1. what (if anything) prevents an implementation making use of lazy evaluation, generators/streams, etc. for the existing functions -- this is a somewhat theoretical question not tied to any current implementation (or performance thereof), but what is actually possible in the language;
  2. the ability to produce infinite/very large sequences -- existing examples are fn:random-number-generator and the range operator to (e.g. 1 to 1000000) -- some of which (like random numbers) have/track state to generate the next value;
  3. the ability to defer/yield execution to the calling context -- e.g. with python's yield keyword.

I can see where making cases 2 and 3 easier and simpler for the person writing the query would be useful. It would also have the advantage of making fn:random-number-generator easier to use.

For case 1, I don't see the benefit of creating a new function that just allows optimizations that is fundamentally equivalent to a different function. I'd also like to have concrete implementation experience/guidance that providing the necessary optimizations and performance is/is not possible with the existing functions. I'd then like to understand what is preventing those optimizations and addressing that in a more applicable way than creating a design that exposes implementation details into the language.

dnovatchev commented 1 year ago

There are several cases here:

  1. what (if anything) prevents an implementation making use of lazy evaluation, generators/streams, etc. for the existing functions -- this is a somewhat theoretical question not tied to any current implementation (or performance thereof), but what is actually possible in the language;

Not sure I understand this question. Generators are a new feature and are by design just functions. This has no impact whatsoever on the language.

At present there is nothing in the XPath specification that even defines short-circuiting and lazy evaluation, and this seems to remain so in the future.

What is proposed here is:

  1. Define a function similar to fn:fold-right, that in addition takes a parameter that identifies any values from the input's value space, on which short-circuiting is possible. And,

  2. Have this function accept a generator instead of an actual collection. The function internally calls the generator on "as needed basis" and gets a single new input member on each call. This allows not to materialize (even not to have) any input collection (such as a sequence or an array) and this is similar to fusion in functional programming.

Currently-existing functions cannot use generators (and don't need to, as they by design are not intended to handle collections of unlimited (possibly infinite) size.

Newly written functions can have parameters that are generators.

  1. the ability to produce infinite/very large sequences -- existing examples are fn:random-number-generator and the range operator to (e.g. 1 to 1000000) -- some of which (like random numbers) have/track state to generate the next value;

Yes, this is one good use-case

  1. the ability to defer/yield execution to the calling context -- e.g. with python's yield keyword.

This would require some deep changes to the language and would cause a significant challenge to any implementation of the language. What is proposed here is a simpler and lighter alternative: just to define a generator as a function, that is the Façade of an FSM (Finite State Machine) that when called produces its next result, that includes its new (next) state.

This can be done immediately, even in plain XPath 3, as demonstrated and doesn't require any changes to the language.

I can see where making cases 2 and 3 easier and simpler for the person writing the query would be useful. It would also have the advantage of making fn:random-number-generator easier to use.

For case 1, I don't see the benefit of creating a new function that just allows optimizations that is fundamentally equivalent to a different function.

No, there is no way to guarantee laziness (detecting and utilizing possibilities for short-circuitness and manipulating a collection of unknown (possibly infinite) size ) with the current set of standard functions, or in the XPath language itself.

So, the benefit is obvious: have a function which when used produces the correct result without raising exceptions or waiting indefinitely as opposed to what we have at present, where using fn:fold-right results in exceptions or indefinite processing, and "this is not a bug, because this behavior doesn't violate the specification"

I'd also like to have concrete implementation experience/guidance that providing the necessary optimizations and performance is/is not possible with the existing functions.

See above.

I'd then like to understand what is preventing those optimizations

See above

and addressing that in a more applicable way than creating a design that exposes implementation details into the language.

Having the proposed function doesn't in any way "expose implementation details into the language". The language remains completely unaffected.

rhdunn commented 1 year ago

@dnovatchev I mean that given an XPath expression like you have in this example, are the performance/memory/etc. issues due to a) an implementation like BaseX not optimizing the query sufficient to avoid the error, or b) something in the specification that is preventing the query from being optimized.

For example, your original Haskell code would be expressed simply as something like:

fn:fold-right(
    1 to 1000000 ! (. - 15),
    1,
    function ($x, $y) { if ($x eq 0) then 0 else $x*$y })

My question is this: is there anything in XPath that causes this to not be optimized -- in this case as a result of range analysis, or similar techniques performed to simplify the expression before evaluation. That is, it will know that the input is -14 to 9999995 in this example. An optimizer could then deduce that fn:fold-right will be called with $x eq 0 and that the resulting expression is 0 without doing any computation on the values -- I suspect that this is what the Haskell compiler is doing, similar to the optimization for things like sum(1 to 1000000) that BaseX can do.

fold-left vs fold-right

If the optimizer does not do this, using fold-right instead of fold-left in this case is bad performance wise as the terminating condition (when x is 0) is toward the end of the sequence as fold-right traverses the sequence in reverse! Therefore, a generator or lazy termination will not help the performance in this specific case as they will still have to generate all 9999995 values before getting to 0. -- If you used fold-left instead, those could help w.r.t. performance, as it only needs to evaluate the first 15 values.

If an implementation is using a recursive fold-left/fold-right implementation, this can easily exceed the stack size as observed.

Likewise, because the fold-right is evaluating right to left (reversed), it is easy to run into a numeric overflow. Lazy evaluation or short circuiting will not solve this as the data is being evaluated (folded) right to left per the requirement of the specification for fold-right.

You can only avoid these if the implementation can statically deduce that $x will at some point become 0 and that $x*$y is the only operation being performed on the numbers in the fold operation.

sequences vs arrays

Using arrays instead of sequences here is also complicating things. This is because the default behaviour of XPath implementations using arrays is to construct them directly instead of lazily generating them. -- I suspect that it could be possible to create a lazy array implementation, but I don't know of a current XPath processor that does this.

dnovatchev commented 1 year ago

My question is this: is there anything in XPath that causes this to not be optimized -- in this case as a result of range analysis, or similar techniques performed to simplify the expression before evaluation. That is, it will know that the input is -14 to 9999995 in this example. An optimizer could then deduce that fn:fold-right will be called with $x eq 0 and that the resulting expression is 0 without doing any computation on the values -- I suspect that this is what the Haskell compiler is doing, similar to the optimization for things like sum(1 to 1000000) that BaseX can do.

@rhdunn This is just one example. In practice there may be many different operations, each with a set of "fixed points" about which no optimizer could be aware. Thus it is not possible to have the optimizer detect and utilize any possibility for short-circuiting - in the general case.

So the right question is different: Is there anything in XPath that allows short-circuiting possibilities to be detected and utilized?

And the answer is negative.

This is the same even in the case of Haskell -- see the existing implementations of Haskell's take n function that uses foldr and you'll see that a hint must be provided, because foldr knows nothing about the value of take's argument n and that exactly this value must cause shortcutting

fold-left vs fold-right

If the optimizer does not do this, using fold-right instead of fold-left in this case is bad performance wise as the terminating condition (when x is 0) is toward the end of the sequence as fold-right traverses the sequence in reverse! Therefore, a generator or lazy termination will not help the performance in this specific case as they will still have to generate all 9999995 values before getting to 0. -- If you used fold-left instead, those could help w.r.t. performance, as it only needs to evaluate the first 15 values.

This is generally not true. Just do your homework and read the available resources. fold-left cannot do short-cutting and needs special hints for when the "whole work is done" in order to halt further computation.

Please, ask @ChristianGruen about this 😄

If an implementation is using a recursive fold-left/fold-right implementation, this can easily exceed the stack size as observed.

This only depends on the configuration of the XPath processor, and is confirmed not to be generally true... If for example the heap size is not sufficiently big, this can also "easily" cause OutOfMemory exceptions

Likewise, because the fold-right is evaluating right to left (reversed), it is easy to run into a numeric overflow. Lazy evaluation or short circuiting will not solve this as the data is being evaluated (folded) right to left per the requirement of the specification for fold-right.

Completely false!

In fact fold-right has access to each input member and can perform shortcutting immediately upon encountering the value for this, without traversing the remainder of the list. This is how infinite inputs are handled. It is fold-left that cannot (by nature of its definition/semantics) recognize and utilize short-cutting.

Again, please, do your homework.

You can only avoid these if the implementation can statically deduce that $x will at some point become 0 and that $x*$y is the only operation being performed on the numbers in the fold operation.

Not true!

In many cases a short-cutting possibility is only created dynamically -- one example is the value n in take n, which can be known only after starting the evaluation of take. In this case, take dynamically creates a short-cutting hint that it passes in its call to foldr.

sequences vs arrays

Using arrays instead of sequences here is also complicating things. This is because the default behaviour of XPath implementations using arrays is to construct them directly instead of lazily generating them. -- I suspect that it could be possible to create a lazy array implementation, but I don't know of a current XPath processor that does this.

We will need generators for any kind of collection - both sequences and arrays, and more for arrays than for sequences, because sequences are flat and cannot contain a (identifiable) value that itself is a sequence.

michaelhkay commented 1 year ago

Again, please, do your homework.

Dimitre, please try to be more courteous.

Let me tell you a little story. I once made a proposal to a standards group that I had recently joined, and it was immediately accepted. Afterwards, the chair took me aside and told me that before I joined, the same proposal had been put forward by someone else, and had been rejected after a couple of hours' acrimonious debate. He told me (he was an industrial psychologist by training) that the only difference was that I had treated people with contrary opinions with respect and courtesy, and the previous proposer had not.

When you treat people disrespectfully, it is very hard to get them to accept your views.

rhdunn commented 1 year ago

@dnovatchev https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-fold-right states:

Processes the supplied sequence from right to left, applying the supplied function repeatedly to each item in turn, together with an accumulated result value.

The array version https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-array-fold-right states:

Evaluates the supplied function cumulatively on successive values of the supplied array.

Technically this is incorrect (it is a copy of the array:fold-right text) -- it should state that it evaluates in reverse order. This is in line with the rules which define it as:

array:members($array) => fn:fold-right($zero, function($a, $b) { $action($a, $b?value }) 

Nowhere does this specification state that it has access to each input member at once. I.e. that it can determine that there is at least one item with a value of 0 when folding over a multiplication operation.

The only way for it to know that is to keep the values as a range , perform range analysis to infer the behaviour, then use that knowledge to simplify the operation. This is the only way a compiler like the Haskell compiler can evaluate this in constant time.

BaseX will already short-circuit the evaluation of fn:sum(1 to 100) etc. to use simpler expressions, and in that case evaluate it at compile time using mathematical equivalences. Therefore, what makes fold-right different in that regard?

rhdunn commented 1 year ago

Note: I'm not saying that generators aren't useful, they are. I'm saying that they won't help in this case as the 0 occurs near the end of the generated sequence, not the start due to using fold-right.

And yes, an optimizer will not always detect and optimize code -- there are example cases in C/C++ and JavaScript in providing hints to the optimizers, and performance benchmarks constantly fluctuate. But generators won't help if e.g. you use them to construct an array.

Lazy evaluation should be something that the language enables without the user having to write more complicated code to say that the code can be lazily evaluated. For example, you could have a processor that makes use of GPUs and vectorized instruction pipelines to process data in parallel. You don't need a fn:parallel-fold-left to enable that (or similar for similar operation).

dnovatchev commented 1 year ago

@dnovatchev https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-fold-right states:

Processes the supplied sequence from right to left, applying the supplied function repeatedly to each item in turn, together with an accumulated result value.

The array version https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-array-fold-right states:

Evaluates the supplied function cumulatively on successive values of the supplied array.

Technically this is incorrect (it is a copy of the array:fold-right text) -- it should state that it evaluates in reverse order. This is in line with the rules which define it as:

Again, this is not true.

Here is the semantics of the function as defined in the XPath 3.1 Spec:

declare function fn:fold-right(
        $seq as item()*, 
        $zero as item()*, 
        $f as function(item(), item()*) as item()*) 
        as item()* {
  if (fn:empty($seq))
  then $zero
  else $f(fn:head($seq), fn:fold-right(fn:tail($seq), $zero, $f))
};

Note the last line:

else $f(fn:head($seq), fn:fold-right(fn:tail($seq), $zero, $f))

this tells us that at every step traversing towards the end of the sequence, the function $f has to be provided (by the implementation of fold-right) with the current item at the head of the sequence, thus fold-right must (not can) access the current item of the sequence at every traversal step, and can evaluate it and decide that it could be a good opportunity to stop traversing the sequence further.

Such decision is not possible with the semantics/definition of fold-left

Lazy evaluation should be something that the language enables without the user having to write more complicated code to say that the code can be lazily evaluated. For example, you could have a processor that makes use of GPUs and vectorized instruction pipelines to process data in parallel. You don't need a fn:parallel-fold-left to enable that (or similar for similar operation).

Agreed, not "should be" but "could be", but at what cost?

In Bulgaria in the 70-ties they taught us that "Communism will be built by 1980" and then valuable things "will start flowing like a full river to everyone"

Things don't happen in one Big Bang, usually they happen step by step, and this proposal is for one such meaningful and realistically achievable step.

Note: I'm not saying that generators aren't useful, they are. I'm saying that they won't help in this case as the 0 occurs near the end of the generated sequence, not the start due to using fold-right.

When the input is infinite there is no such thing as "near the end of the sequence" and using a generator combined with short-circuiting is in fact a powerful way to still do meaningful processing of such infinite input.

rhdunn commented 1 year ago

@dnovatchev The following produces -6 in each case:

fn:fold-left(function ($x, $y) { $x - $y }, 0, 1 to 3),
(((0 - 1) - 2) - 3),
let $x1 := 0 - 1
let $x2 := $x1 - 2
let $x3 := $x2 - 3
return $x3

i.e. the sequence is evaluated left to right.

The following produces 2 in each case:

fn:fold-right(function ($x, $y) { $x - $y }, 0, (1, 2, 3)),
1 - (2 - (3 - 0)),
let $x1 := 3 - 0
let $x2 := 2 - $x1
let $x3 := 1 - $x2
return $x3

i.e. the sequence is evaluated right to left.

In each of these, the third example is how you could implement these iteratively. The specification is just defining which side the values and the accumulated results are, and the direction the values are evaluated. That it is using a tail recursive implementation does not matter, as that is equivalent to an iterative approach which for fold-right (as in my examples) evaluates the results in reverse.

The specification is there to define the observed behaviour/results, not prescribe implementation details. I.e. the specification isn't saying here that an iterative fold-left/right is non-conforming as long as it produces the correct results/behaviour.

michaelhkay commented 1 year ago

Firstly, fold-left and fold-right are symmetric.

If there is an asymmetry, it is because many sequences are designed to be processed from left-to-right. But some sequences (for example. 1 to 100000, or child:: or // in a DOM) can be processed equally well in either direction, and a few sequences like preceding:: are much easier to evaluate from right-to-left, but there is nothing in the spec to force this, for example fold-right(//, ...) or fold-right(1 to 1000000, ...) can easily be implemented as a right-to-left sequence iteration. I guess most implementations will evaluate //* with a left-to-right tree walk, populating the result sequence from left-to-right as they go, but there is nothing intrinsically to stop a processor doing the reverse: and with Saxon, when you apply fold-right() to a sequence that supports right-to-left evaluation, such as (1 to 1000000) that's exactly what will happen.

I suspect this is an area where XPath differs from Haskell: Haskell as far as I can tell has a strong left-to-right bias.

I think one can classify early-exit conditions of three kinds:

(a) where an individual item in the input sequence tells you you have reached a fixpoint. For example, functions like avg, max, and min return NaN if there is a NaN in the input. Such a fixpoint can be recognized if the callback function has the special form "if (predicate($x)) then $x else ....", where $x is the "next-item" argument (the second argument in the case of fold-left, the first argument in the case of fold-right)

(b) where the accumulated result tells you you have reached a fixpoint. The analysis is similar except this time the condition is on the other argument.

(c) where the termination conditions are more subtle, for example (as in some numerical methods) where the difference between successive results is getting smaller and smaller.

Case (c) is best left to the application, and is probably best tackled using iterate-while().

The difference between cases (a) and (b) is that in case (a) you can potentially detect the opportunity for a quick exit without ever calling the callback function at all. If NaN is a fixpoint, start by scanning the sequence to look for instances of NaN, and return NaN immediately if you find one. This possibility is particularly attractive if you have to pre-scan the sequence anyway in order to get to the other end (that is, if you are doing fold-right on a left-to-right sequence, or fold-left on a right-to-left one). And of course there are even more opportunities with a monotonic integer sequence as in Dimitre's first example because you can test whether a particular value is present in the sequence just by looking at the start and end values.

For an implementor the question then becomes, which of these optimisations worth doing? Just because an optimization is possible doesn't mean it is worth doing. For example, it is easy to optimize sum(1 to N), but do such expressions arise often enough to make them worth detecting? In many cases I think the answer is no: remember that every time you ask whether a 1-in-a-million optimization pattern is present, you are imposing a small overhead on the 999,999 cases where it isn't, and you soon reach the position where the average time spent optimizing exceeds the average execution time.

Michael Kay

On 5 Sep 2023, at 22:19, Reece H. Dunn @.***> wrote:

@dnovatchev https://github.com/dnovatchev https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-fold-right https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-fold-right states:

Processes the supplied sequence from right to left, applying the supplied function repeatedly to each item in turn, together with an accumulated result value.

The array version https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-array-fold-right https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-array-fold-right states:

Evaluates the supplied function cumulatively on successive values of the supplied array.

Technically this is incorrect (it is a copy of the array:fold-right text) -- it should state that it evaluates in reverse order. This is in line with the rules which define it as:

array:members($array) => fn:fold-right($zero, function($a, $b) { $action($a, $b?value }) Nowhere does this specification state that it has access to each input member at once. I.e. that it can determine that there is at least one item with a value of 0 when folding over a multiplication operation.

The only way for it to know that is to keep the values as a range , perform range analysis to infer the behaviour, then use that knowledge to simplify the operation. This is the only way a compiler like the Haskell compiler can evaluate this in constant time.

BaseX will already short-circuit the evaluation of fn:sum(1 to 100) etc. to use simpler expressions, and in that case evaluate it at compile time using mathematical equivalences. Therefore, what makes fold-right different in that regard?

— Reply to this email directly, view it on GitHub https://github.com/qt4cg/qtspecs/issues/670#issuecomment-1707323415, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASIQIWCDXA6SAGYQQTNPDTXY6JMXANCNFSM6AAAAAA4IL3DME. You are receiving this because you were mentioned.

rhdunn commented 1 year ago

@michaelhkay That's why modern Java and JavaScript engines use multi-stage compilation. -- They will often generate code using the simplest possible representation, then run more complex optimizations in the background. That way, if the execution is fast you don't get that optimization penalty, but if the optimization pass completes then the faster version can take over.

dnovatchev commented 1 year ago

The specification is there to define the observed behaviour/results, not prescribe implementation details. I.e. the specification isn't saying here that an iterative fold-left/right is non-conforming as long as it produces the correct results/behaviour

No, the code in the Specification specifies the semantics of the function.

This alone makes it possible to perform short-cutting within fold-right before the input is traversed to the end, and makes such short-cutting impossible with fold-left.

If someone changes this code, they are violating the specification and the semantics that it specified.

And this is not arbitrary code -- it closely mirrors the definition of the corresponding two Haskell functions: foldl and foldr.

The following produces 2 in each case:

fn:fold-right(function ($x, $y) { $x - $y }, 0, (1, 2, 3)),
1 - (2 - (3 - 0)),
let $x1 := 3 - 0
let $x2 := 2 - $x1
let $x3 := 1 - $x2
return $x3

i.e. the sequence is evaluated right to left.

There is no ground to make any such conclusion. Nothing prevents the processor to evaluate the leftmost subexpression (1), and then the RHS (2 - (3 - 0)) , and finally perform the subtraction in this first step as quoted from above:

1 - (2 - (3 - 0)),

dnovatchev commented 1 year ago

For an implementor the question then becomes, which of these optimisations worth doing? Just because an optimization is possible doesn't mean it is worth doing. For example, it is easy to optimize sum(1 to N), but do such expressions arise often enough to make them worth detecting? In many cases I think the answer is no: remember that every time you ask whether a 1-in-a-million optimization pattern is present, you are imposing a small overhead on the 999,999 cases where it isn't, and you soon reach the position where the average time spent optimizing exceeds the average execution time.

Great analysis, @michaelhkay , thank you!

This is exactly why I am proposing to implement this improvement not as a language-wide feature, that may impact all code of everyone, but as one or few functions that will only be used if the programmer (user) consciously selects to use them.

liamquin commented 1 year ago

No, the code in the Specification specifies the semantics of the function.

The spec (using F&O 3.1) actually says, “The effect of the function is equivalent to the following implementation in XQuery:”

That is, with the exception of errors, the result should be the same as if the given XQuery implementation was used, but you can't go from that to placing requirements on an actual implementation beyond producing the same results, in the absence of errors.

I say “should” because in some cases optimization may produce different results, as the spec permits - most likely in cases such as rounding or in whether side-effecting functions are called or not.

michaelhkay commented 1 year ago

This alone makes it possible to perform short-cutting within fold-right before the input is traversed to the end, and makes such short-cutting impossible with fold-left.

You keep assuming that all sequences are traversed left-to-right.

liamquin commented 1 year ago

I guess most implementations will evaluate //* with a left-to-right tree walk, populating the result sequence from left-to-right as they go, but there is nothing intrinsically to stop a processor doing the reverse:

I know of at least one widely-used commercial processor that tends to evaluate XPath expressions from right to left, because this tends to work best with its index structure. When the person in the XQuery Working Group mentioned this, at least some others said, “We do that too”.

If i ever finished the stuff i was working on with adding XPath support to the text retrieval system i wrote, it would look at /a/b/c and start with whichever one occurred the fewest times, working outwards (which is what it does to match phrases).

dnovatchev commented 1 year ago

This alone makes it possible to perform short-cutting within fold-right before the input is traversed to the end, and makes such short-cutting impossible with fold-left.

You keep assuming that all sequences are traversed left-to-right.

There is no other way for infinite collections 😄

dnovatchev commented 1 year ago

No, the code in the Specification specifies the semantics of the function.

The spec (using F&O 3.1) actually says, “The effect of the function is equivalent to the following implementation in XQuery:”

That is, with the exception of errors, the result should be the same as if the given XQuery implementation was used, but you can't go from that to placing requirements on an actual implementation beyond producing the same results, in the absence of errors.

I say “should” because in some cases optimization may produce different results, as the spec permits - most likely in cases such as rounding or in whether side-effecting functions are called or not.

Well, they mirrored the actual code of Haskell's foldl and foldr, and there might be a reason for that? 😄

dnovatchev commented 1 year ago

I guess most implementations will evaluate //* with a left-to-right tree walk, populating the result sequence from left-to-right as they go, but there is nothing intrinsically to stop a processor doing the reverse:

I know of at least one widely-used commercial processor that tends to evaluate XPath expressions from right to left, because this tends to work best with its index structure. When the person in the XQuery Working Group mentioned this, at least some others said, “We do that too”.

If i ever finished the stuff i was working on with adding XPath support to the text retrieval system i wrote, it would look at /a/b/c and start with whichever one occurred the fewest times, working outwards (which is what it does to match phrases).

Again, when traversing an infinite collection, "right to left" is not possible.

liamquin commented 1 year ago

Again, when traversing an infinite collection, "right to left" is not possible.

Doesn't this assume leftwards-infinite is the only possibility? And yet, there's no intrinsic reason why you can't have a generator that accepts negative values. E.g. a sequence of integers goes as far as you can see in either direction.

Leftwards-unbounded sequences can be useful for thinking about history, for example (e.g. revisions).

However, my comment about evaluating XPath expressions right to left was really to Mike Kay.

dnovatchev commented 1 year ago

Again, when traversing an infinite collection, "right to left" is not possible.

Doesn't this assume leftwards-infinite is the only possibility? And yet, there's no intrinsic reason why you can't have a generator that accepts negative values. E.g. a sequence of integers goes as far as you can see in either direction.

Leftwards-unbounded sequences can be useful for thinking about history, for example (e.g. revisions).

It seems you are mixing here left to right with "producing with increasing value"

This is not what left to right means. Something as:

-5, -10, -22, ... , -99999, ...

Is still left to right.

even this:

1, -1, 3, -3, 15, -20 , ... 789, -953, ...

is still left to right.

Left means from the start, right means towards the unreachable end.

All known sequences have some kind of generator (either a formula or a recurrence generator). Left is the first value that the generator produces, and all consequent values are "right" of this.

However, my comment about evaluating XPath expressions right to left was really to Mike Kay.

Even so, this was an interesting comment and probably a FAQ that needs to be explained correctly, thus thank you!

liamquin commented 1 year ago

I should have used an example. There's nothing inherently impossible with having (... -5, -4, -3, -2, -1, 0, 1, 2, 3, 4 ,5 ...) as a sequence.

I've seen languages deal with this by requiring a starting point and an enumeration like (0, -1, 1, -2, 2, ...) in which case yes it becomes only right-infinite. But there doesn't have to be a "start".

Whether this is useful i'm not sure; i've used programming languages that supported the concept, though.

dnovatchev commented 1 year ago

I should have used an example. There's nothing inherently impossible with having (... -5, -4, -3, -2, -1, 0, 1, 2, 3, 4 ,5 ...) as a sequence.

I've seen languages deal with this by requiring a starting point and an enumeration like (0, -1, 1, -2, 2, ...) in which case yes it becomes only right-infinite. But there doesn't have to be a "start".

Whether this is useful i'm not sure; i've used programming languages that supported the concept, though.

It could be as useful as watching a movie from a random point. Still, anything from the start of watching can be ordered / enumerated. And this is the "real start" that matters to the observer.

michaelhkay commented 1 year ago

There is no other way for infinite collections 😄

Not all collections are infinite.

ChristianGruen commented 1 year ago

What an ample discussion. I’m pretty sure I’ll repeat others’ observations:

This is generally not true. Just do your homework and read the available resources. fold-left cannot do short-cutting and needs special hints for when the "whole work is done" in order to halt further computation. Please, ask @ChristianGruen about this 😄

@dnovatchev Christian agrees with @rhdunn that short-cutting is possible, for both left and right folds; he just added it to his implementation.

In the discussion, plenty of different concepts and perspectives are mixed up, which leads to misunderstanding. Short-circuiting, short-cuttting, early-exit strategies and fixpoint computations, that’s all related, but not identical. Similarly, a) the specification, b) the proposed XPath solution and c) Haskell are mixed up.

Regarding b) and c), short-circuiting and foldr comes with costs. You have to ponder whether you want to exchange the time for iterating data against memory consumption (heap, stack) and initial overhead. The concept is certainly enticing, but there are reasons why most programming languages only support left folds.

No, the code in the Specification specifies the semantics of the function.

It doesn’t, Dimitre. The specification presents a recursive solution simply because it’s the way how it can be formulated in XPath/XQuery. It doesn’t prescribe the way to implement it. As I wrote, I’m not aware of any implementation that has chosen the recursive approach.

Again, when traversing an infinite collection, "right to left" is not possible.

If you reverse the infinite sequence 0 .. INF, right-to-left traversal will be the only option.

I know of at least one widely-used commercial processor that tends to evaluate XPath expressions from right to left, because this tends to work best with its index structure. When the person in the XQuery Working Group mentioned this, at least some others said, “We do that too”.

@liamquin We deploy both left-to-right and right-to-left strategies, depending on the function and the (runtime availability of the full) input data. Right-to-left access can also be hidden further below in the evaluation tree. An example is head(reverse($data)): We use a left-to-right iterator to retrieve the first item of the argument, but the argument is a right-to-left iterator for $data.

When I think about it, we should probably optimize our fn:fold-right implementation for iterative right-to-left access. Currently, it uses a right-to-left iterator over the resulting finger-trie representation of the (fully computed) sequence.

I think one can classify early-exit conditions of three kinds: […]

@michaelhkay Thanks for the fine summary. Regarding the question of which optimizations are worth doing, I would add that folds are used by just a small handful of developers, which are usually advanced enough to choose a language concept that suits best to a given problem – and folds won’t be the first choice if you plan not to iterate over the full input. One may encounter cases, though, in which a recursive approach is hard to grasp, or difficult to optimize for tail-calls. If implementations provide early exits for the most common patterns (personally I think kind (b) is much more common, but we might learn more about it), folds could get a viable choice, even more if the underlying implementation is iterative. fn:iterate-while often works best if the input is not a sequence.

rhdunn commented 1 year ago

One approach w.r.t. optimizations I've been thinking about is to have a special case class for the range operator to. I.e. m to n would produce something like a RangeSequence(m, n, step = 1) that is internally a lazy sequence (using a generator when the values are needed). That way, you gain the performance via the special cased implementations without the need for compiler magic -- like how subsequence(start, end) on a list in Java, etc. is typically optimized by creating an offset view of the underlying list wihout the need to copy the data.

That way, operations on the entire sequence (e.g. $s ! (. + 5)) would just transform the sequence -- e.g. s.add(5) resulting in a RangeSequence(m+5, n+5). This should be a safe transform and wouldn't need any special compiler magic to perform -- the processor would just call add on the sequence and the specialized implementation would handle this optimization.

You could even handle filtering operations that create slices -- e.g. (1 to 1000)[position() = 5 to 10] to create a subrange. -- The processor would just need to detect the sequence[position = range] pattern and call sequence.subsequence(range) instead of constructing a new sequence, or calling a more general sequence.filter(filter_function).

Expressions like array { 1 to 1000 } -- or more generally array construction on a RangeSequence -- could construct a RangeArray class. I also wonder if it makes sense to have something like a SequenceArray that is an array adapter of a sequence. Both of these would have the corresponding array-style error checking instead of the sequence-style behaviour. They would also have copy-on-write semantics so that a new array is constructed when adding or removing items to/from the array. This also has the advantage that array:members($array) is trivial in this case -- just return the internal sequence object.

Reverse is also trivial to implement, either by RangeSequence(n, m, step = -step) or by using a reverse iterator. -- I've used this trick when traversing node trees, where e.g. sibling/child iteration switches from nextSibling to prevSibling.

I've not yet tried implementing this approach, but it is something I want to get to at some point. Feel free to use these ideas in your own implementations if you want.

rhdunn commented 1 year ago

@dnovatchev I think I understand what you are saying.

For $f := function ($x, $y) { if ($x eq 0) then 0 else $x * $y }, at the point where you have:

  1. $x := 0 and
  2. $y := fn:fold-right(fn:tail($seq), $zero, $f)

you could (if using a recursive instead of an iterative fold-right) determine that the result of $f($x, $y) will be 0 without evaluating $y. It would not be easy, as it would not be something the processor could determine statically, so would have to be something at runtime and would only work for a recursive fold right implementation.

You also have to be careful about the order of operations -- in my subtraction example, the sign of the result flips with each value, and depends on the results being accumulated in reverse order. Even with the recursive implementation, the results will be expanded left to right, but evaluated ("folded") right to left. So, you have:

let $f := function ($x, $y) { $x - $y }
return fn:fold-right($f, 0, (1, 2, 3))

= $f(1, fn:fold-right($f, 0, (2, 3)))
= $f(1, $f(2, fn:fold-right($f, 0, (3))))
= $f(1, $f(2, $f(3, fn:fold-right($f, 0, ()))))
= $f(1, $f(2, $f(3, 0)))
= $f(1, $f(2, 3))
= $f(1, -1)
= 2
ChristianGruen commented 1 year ago

@rhdunn Thanks for your thoughts. Some feedback on BaseX:

That way, operations on the entire sequence (e.g. $s ! (. + 5)) would just transform the sequence -- e.g. s.add(5) resulting in a RangeSequence(m+5, n+5). This should be a safe transform and wouldn't need any special compiler magic to perform -- the processor would just call add on the sequence and the specialized implementation would handle this optimization.

That’s indeed something we already do (and why e.g. max((1 to 100000000000000) ! (. + 2)) is evaluated in constant time).

You could even handle filtering operations that create slices -- e.g. (1 to 1000)[position() = 5 to 10] to create a subrange. -- The processor would just need to detect the sequence[position = range] pattern and call sequence.subsequence(range) instead of constructing a new sequence, or calling a more general sequence.filter(filter_function).

Same here: A subsequence is created for e.g. (1, 3, 2, 4, 3, 6)[position() = 2 to 3]. For integer ranges ((1 to 10)[position() = 2 to 3]), we create a new range sequence.

Expressions like array { 1 to 1000 } -- or more generally array construction on a RangeSequence -- could construct a RangeArray class.

Internally, we use the same data structure for sequences and arrays, so the performance is basically the same… However (I think you indicated that before), as arrays are single items, we don’t provide lazy array iterators. Instead, for the moment, an array needs to be fully computed before it can be traversed. Things may change if people decide to use arrays more often.

dnovatchev commented 1 year ago

One approach w.r.t. optimizations I've been thinking about is to have a special case class for the range operator to. I.e. m to n would produce something like a RangeSequence(m, n, step = 1) that is internally a lazy sequence (using a generator when the values are needed). That way, you gain the performance via the special cased implementations without the need for compiler magic -- like how subsequence(start, end) on a list in Java, etc. is typically optimized by creating an offset view of the underlying list wihout the need to copy the data.

@rhdunn This is how the generators are defined in this proposal. Glad you like them!

dnovatchev commented 1 year ago

Again, when traversing an infinite collection, "right to left" is not possible.

If you reverse the infinite sequence 0 .. INF, right-to-left traversal will be the only option.

@ChristianGruen We already discussed this with @liamquin . Any sequential traversal (isomorphic to 1 ..INF) is actually left to right, that is the same thing.

If we hate the term "left to right" we can equally well choose to rename it and use only "right to left", but this is a cosmetic change that doesn't touch the actual semantics that remains the same.

ChristianGruen commented 1 year ago

Any sequential traversal (isomorphic to 1 ..INF) is actually left to right, that is the same thing.

Please note that this is an implementation issue. It’s perfectly possible to write iterators that traverse sequences from right to left. For -INF..0, this iterator would first return 0. For 0..INF, obviously, no result can be attained.

With XPath, such an iterator could return a result for fn:foot(xs:double('INF') to 0) (if we supported infinite sequences with the range expression) in constant time, by returning the last (maximum) value and without traversing the rest.

Disclaimer: I don't say we should introduce infinite sequences that way.

dnovatchev commented 1 year ago

Any sequential traversal (isomorphic to 1 ..INF) is actually left to right, that is the same thing.

Please note that this is an implementation issue. It’s perfectly possible to write iterators that traverse sequences from right to left. For -INF..0, this iterator would first return 0. For 0..INF, obviously, no result can be attained.

With XPath, such an iterator could return a result for fn:foot(xs:double('INF') to 0) (if we supported infinite sequences with the range expression) in constant time, by returning the last (maximum) value and without traversing the rest.

Disclaimer: I don't say we should introduce infinite sequences that way.

This is just another way of saying that the traversal is essentially "left to right"

michaelhkay commented 1 year ago

Reece wrote:

One approach w.r.t. optimizations I've been thinking about is to have a special case class for the range operator to. I.e. m to n would produce something like a RangeSequence(m, n, step = 1) that is internally a lazy sequence

Saxon certainly does that (which means that observations about Saxon performance of operations on the sequence 1 to N don't generalise to other sequences).

That way, operations on the entire sequence (e.g. $s ! (. + 5)) would just transform the sequence -- e.g. s.add(5) resulting in a RangeSequence(m+5, n+5).

Saxon doesn't do that, it wouldn't be difficult, but it's more special-case code for a case that might arise rather rarely in practice.

You could even handle filtering operations that create slices -- e.g. (1 to 1000)[position() = 5 to 10] to create a subrange.

Saxon does that (the expression is first converted to a call on subsequence()).

Expressions like array { 1 to 1000 } -- or more generally array construction on a RangeSequence -- could construct a RangeArray class. I also wonder if it makes sense to have something like a SequenceArray...

Yes, I've been wondering about ways to achieve more commonality between the Saxon representations of arrays and sequences.

Reverse is also trivial to implement, either by RangeSequence(n, m, step = -step) or by using a reverse iterator

Saxon does that. Some Sequence classes, including this one, have an iterator implementation that allows iteration in either direction, rather like Java's ListIterator.

ChristianGruen commented 1 year ago

For the sake of completion (see https://github.com/qt4cg/qtspecs/issues/670#issuecomment-1704432898), and as a contribution to solve all other possible computer science problems via this very issue, here’s another prime number implementation, based on the Sieve of Eratosthenes. It seems to perform better for larger counts (it runs 5 seconds for the first 100,000 values), but it has a higher memory consumption due to the auxiliary map:

declare namespace map = 'http://www.w3.org/2005/xpath-functions/map';
declare namespace math = 'http://www.w3.org/2005/xpath-functions/math';

declare function local:sieve-rec($count, $current, $result, $est-max, $no-primes) {
  if (count($result) >= $count) then (
    $result
  ) else if ($no-primes($current)) then (
    local:sieve-rec($count, $current + 1, $result, $est-max, $no-primes)
  ) else (
    local:sieve-rec($count, $current + 1, ($result, $current), $est-max, fold-left(
      $current to $est-max idiv $current,
      $no-primes,
      function($set, $i) { map:put($set, $current * $i, true()) }
    ))
  )
};
declare function local:sieve-rec($count) {
  local:sieve-rec($count, 2, (), xs:integer(($count + 2) * math:log10($count) * 3), map {})
};
local:sieve-rec(100000)

Once again, runtimes will vary, based on your query processor.

dnovatchev commented 12 months ago

For the sake of completion, here is a solution to the problem of producing the first N primes.

Some interesting things to note:

let $fold-right-sc-lifted-inner := function ($seqGen as function(xs:integer, map(*)) as array(*),
                                    $index as xs:integer,
                                    $seqState as map(*),
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $fGetPartial as function(*),
                                    $self as function(*)
                                   ) as item()*
                                {
                                  let $nextSeqResult := $seqGen($index, $seqState),
                                      $isEndOfSeq :=  $nextSeqResult(1),
                                      $seqItem := $nextSeqResult(2),
                                      $nextState := $nextSeqResult(3)
                                    return
                                      if($isEndOfSeq) then $zero
                                        else
                                          if(function-arity($fGetPartial($seqItem, $zero)) eq 0)
                                            then $fGetPartial($seqItem, $zero) ()
                                            else $f($seqItem, $self($seqGen, $index+1, $nextState, $zero, 
                                                    $f, $fGetPartial, $self))
                                },

    $fold-right-sc-lifted := function ($seqGen as function(xs:integer, map(*)) as array(*),
                                       $startIndex,
                                       $seqState as map(*),
                                       $zero as item()*,
                                       $f as function(item(), item()*) as item()*,
                                       $fGetPartial as function(*) 
                                      ) as item()*
                                      {
                                         $fold-right-sc-lifted-inner($seqGen, $startIndex, $seqState, $zero,  
                                                                     $f, $fGetPartial, $fold-right-sc-lifted-inner)
                                      },

   $NaN := xs:double('NaN'),
   $INF := xs:double('INF'),

   $fSeqM-to-N := function($ind as xs:integer, $indStart as xs:numeric, $indEnd as xs:numeric) as array(*)
                  {
                    if($ind lt  1 or $ind + $indStart -1 gt $indEnd)
                      then  array{true(), $NaN, map{}}
                      else array{false(), $indStart + $ind -1, map{}}
                  },

   $fGen-1-Inf := $fSeqM-to-N(?, 1, $INF),

    $pow2mod-inner := function($power as xs:decimal, $modulus as xs:decimal, $self as function(*)) as xs:decimal
                      {
                        if($power eq 1) then 2
                          else
                            for $half in $self($power idiv 2, $modulus, $self)
                             return
                               ($half * $half mod $modulus) * (1 + ($power mod 2)) mod $modulus
                      },
    $pow2mod := function($power as xs:decimal, $modulus as xs:decimal) as xs:decimal
                {
                  $pow2mod-inner($power, $modulus, $pow2mod-inner)
                },

   $fIsPrime := function($N as xs:integer) as xs:boolean
   {
     if($N = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41)) then true()
       else if( exists((3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41)[$N mod . eq 0]) )
             then false()
       else if($pow2mod($N -1, $N) ne 1) then false()
       else empty(((43 to xs:integer(round(math:sqrt($N)))) ) 
                                                  [$N mod . eq 0])
   },
   $fZipper := function($ind as xs:integer, $gen1 as function(xs:integer) as array(*), 
                        $gen2 as function(xs:integer, map(*)) as array(*), $genState as map(*)) as array(*)
            {
              let $res1 := $gen1($ind), $res2 := $gen2($ind, $genState)
               return
                 if($res1(1) or $res2(1))
                   then [true(), $NaN, map{}]
                   else [false(), [$res1(2), $res2(2)], $res2(3)]
            },

   $fGenPrimes := function($ind as xs:integer, $lastState as map(*))
   {
     let $lastPrime := $lastState?lastPrime,
         $endRange := if($lastPrime gt 1) then 2 * $lastPrime -1
                        else 2,
         $nextPrime :=  if($lastPrime = (1, 2)) then $lastPrime + 1 
                          else  ( ($lastPrime + 1)   to  $endRange) [. mod 6 = (1, 5)]  [$fIsPrime(.)] [1]
      return
        [false(), $nextPrime, map{'lastPrime': $nextPrime} ]
   },

   $fStep := function($member as array(*), $currentResult as item()*)
             {
               ($member(2), $currentResult)
             },

   $fStepMakePartialTemplate := function($n as xs:integer, $member as array(*), $y as item()*)
   {
     if($member(1) gt $n)
       then function() { () }
       else function($y) { $member(2), $y}
   },   

   $fTake := function($someSeqGen as function(xs:integer, map(*)) as array(*), $n as xs:integer, $genState as map(*))
             {
               let $zippedSeqGen := $fZipper(?, $fGen-1-Inf, $someSeqGen, ?),
                   $fStepMakePartial := $fStepMakePartialTemplate($n, ?, ?)
                 return
                   $fold-right-sc-lifted( $zippedSeqGen, 1, $genState, (), $fStep, $fStepMakePartial )
             }
 return
   (     

      $fTake($fGenPrimes, 750000, map{'lastPrime': 1})
   )
dnovatchev commented 7 months ago

For the sake of completion, here is a solution to the problem of producing the first N primes.

Some interesting things to note:

  • This uses a take n function that itself is implemented using fold-right (fold-lazy).
  • There is nothing like a "fixed-point" in the infinite sequence passed to the take function. The values for which short-circuiting needs to be done (and the short-circuiting hint itself) are dynamically generated by the take function (and passed to the fold-lazy function as the "hint-argument") using the value of its argument n.
  • This is the first example of a generator which has its own state, in this case $fGenPrimes, which has as a state the last prime number that was generated
  • An infinite generator ($fGen-1-Inf) uses another generator ($fSeqM-to-N) in order to produce its own sequence of values
  • An infinite generator ($fZipper) uses two other infinite generators ($fGen-1-Inf and $fGenPrimes) and zips their corresponding values together to produce its own values.

Here is a screenshot of producing the first 100 000 prime numbers. Time taken is 6.5 seconds, and no additional data structure is used, thus approximately the same memory will be used regardless of the wanted number of N primes to be generated. No wild guess is made about what sequence of how many integers we need to have in order for it to contain the first 100 000 primes:

image