Open rhdunn opened 1 year ago
Nice summary, thank you.
Such a generator could easily be used in iterative loops or recursive function calls. The following query attaches a result item to the generator:
iterate-while(
map:put(create-generator(), 'result', 0),
function($generator) {
exists($generator?value) and $generator?result < 10000
},
function($generator) {
let $next := $generator?next()
return map:put($next, 'result', $generator?result + $generator?value)
}
)?sum
In my example, I expect the generator to always return a map, as the intermediate result would otherwise get lost. Maybe we could also have a more
entry or function to tackle that?
iterate-while(
$generator,
fn { ?more },
fn { ?next() => map:put('seq', (?seq, ?value)) }
)?seq
It would also be nice if enriched map entries were implicitly inherited to the next generator.
Regarding fn:sequences
, if we had sequences with variable size, we could also allow infinity in range expressions (1 to xs:double('INF')
, etc.) and many other things. But that feels like a lot of effort, and we should certainly define some serious use cases before continuing that.
I find many desirable properties of generators missing in this otherwise nice attempt.
Therefore, I will be submitting a separate proposal for generators, that fills these voids.
One of course, is what @ChristianGruen duly noticed -- in the general case a generator needs a state (map) in order to continue its work.
@dnovatchev It would be helpful if you could list the desirable generator properties that are missing.
Note that the following works without having to keep state in the map:
xquery version "3.1";
(: https://en.wikipedia.org/wiki/Linear_congruential_generator :)
declare function local:lcg($seed, $a, $b, $m) {
let $value := ($a * $seed + $b) mod $m
return map {
"value": $value,
"next": function () {
local:lcg($value, $a, $b, $m)
}
}
};
(: ZX81 random number generator LCG values :)
let $r1 := local:lcg(100, 75, 74, math:pow(2, 16) + 1)
let $r2 := $r1?next()
let $r3 := $r2?next()
let $r4 := $r3?next()
let $r5 := $r4?next()
return ($r1?value, $r2?value, $r3?value, $r4?value, $r5?value)
I've not tried this for @ChristianGruen's example. However:
more as xs:boolean?
or has-next as xs:boolean?
field on the record -- if missing/empty it would be true. Note: if this is required, fn:random-number-generator
would need to be further modified to add this field.next as function(sequence-generator? := ()) as sequence-generator
. [1] -- Note: this takes an optional self value in order to be backward compatible with random-number-generator.[1] The current specification only allows default arguments on function declarations, so this does not currently work. It would need to be defined at least for inline function expressions -- allowing function ($self as sequence-generator? := ()) { ... }
. Allowing them on function tests as well would allow the record test for this to be accurately specified.
https://github.com/qt4cg/qtspecs/issues/716#issuecomment-1812146751 @rhdunn The discussion could be continued here (if you don’t mind… Otherwise, feel free to open a new issue):
With my
fn:sequence
proposal, you could write, e.g.:let $gen := fn:random-number-generator() (: or other lazy sequence :) for $value in fn:sequence($gen)[1 to 10] return math:sin($value)
I wonder if we can make this work. The following variant would certainly be much easier to achieve:
let $gen := fn:random-number-generator() (: or other lazy sequence :) for $value in fn:sequence($gen, 1, 10) return math:sin($value)
The way I see this working in practice for an implementer is that they would implement the iteration model/interface in the language the processor is running on:
__iter__
/__next__
in Python;The implementer would then adapt that to their model of an XDM sequence in the fn:sequence
function implementation.
Thinking about this some more, the iterator model is too klunky. Adopting something similar to generateSequence
would be a lot cleaner. I.e.:
declare function fn:sequence($next as function(item()*) as item()*, $seed as item()*) as item()*;
where returning an empty sequence would stop the sequence.
declare function fn:random-numbers($seed as xs:anyAtomicType? := ()) {
fn:sequence(fn { .?next() }, fn:random-number-generator($seed)) => fn { .?number }
};
declare function fn:fibonacci($first := 0, $second := 1) {
fn:sequence(fn { (.[2], .[1] + .[2] ) }, ($first, $second)) => fn { .[1] }
};
Note: This is adapted from the example in Kotlin's generateSequence docs.
Motivation
The motivations for this are to explore the creation of sequences where the next item in the sequence is determined by evaluating a function on the current state of a system. These sequence generators have an initial starting value and state. They can also stop when some condition is met.
The motivating example here is the fn:random-number-generator function, where:
let $rnd := fn:random-number-generator()
initializes a new random number sequence with the default seed as its state;let $value := $rnd?number
returns the current value of the sequence;let $rnd := $rnd?next()
returns the state and value of the next item in the sequence.This has all the properties needed for a forward (left-to-right) generating sequence. To make it a generalized generator sequence, the
number
field should be renamedvalue
.NOTE: The
fn:random-number-generator
function defines an infinite sequence as it has no termination/end of sequence condition.Sequence Generators as Record Types
Therefore, a forward generator sequence could look like this:
If calling
?next()
on a sequence above returns the empty sequence then there are no more values in that sequence.If the
generated-sequence()
returns the empty sequence then there are no items in the sequence.NOTE: This does not currently define reversible sequence support. A
reversed
property could be provided that returns asequence-generator
that operates on the sequence from right to left. I haven't figured out exactly how this should look, but reversed generator sequences may be better investigated as a separate issue.NOTE: An implementation can process this sequence iteratively if needed.
Analysis
While this allows generator sequence types to be created, they are -- like
fn:random-number-generator()
cumbersome to use. This does have the advantage of being backward compatible withfn:random-number-generator()
, though.The problem comes when trying to make these work like sequences. The subtype and function coercion rules should be doable. The other sequence operations like filtering are more complicated to define properly.
However, this has the same issues that allowing
array()
infn:*
functions has -- how do you differentiate the use cases where the user is working on a sequence of generators, or the generated sequences?Sequence Generator Function
The Kotlin language has a generateSequence function that takes a next function, an optional seed value or construction function, and returns a lazy sequence over that. -- Internally, it is building a Java iterator that produces values from calling the next function. The sequence will terminate when the next value is null.
I propose that -- in addition to the
sequence-generator
type above -- XPath defines the following function:This solves the issues in the Analysis section above, and is analogous to
array:values
. It is implementation defined how the sequence is constructed. -- This allows an implementer to appropriately map the generator to their internal sequence implementation in order to provide lazy evaluation and other operations.There should also be the following helper function for random numbers:
The user-defined sequences then become e.g.: