robrix / Traversal

Enumeration & iteration of collections in Swift.
MIT License
58 stars 1 forks source link

Flatten maps occur out of order #26

Closed robrix closed 9 years ago

robrix commented 9 years ago

A flatten map of >1-element reducibles ends up being bizarrely out of order in some sort of weird confluence of reversal and associativity mind-games :boom:

robrix commented 9 years ago

The actual symptom is that Stream(ReducerOf(…)) where the map produces >1 reducer will produce an incorrectly ordered Stream.

This appears to be the result of conflict between iteration done in Stream(ReducibleType) and the nested reduction performed by ReducerOf.reducer() in its combine closure: it calls Traversal.reduce() to reduce the inner reducers, but the combining function which it receives from Stream returns the element and ignores into, and everything gets very confusing.

robrix commented 9 years ago

The specific issue is that ReducerOf is not obeying the semantics of ReducibleType correctly: it isn’t calling recur precisely when it ought to, and it isn’t providing it with the precise state necessary to continue reduction from the next element (which I’ll term “the continuation” for the purposes of this explanation, by analogy with call/cc).

To illustrate, the reduction of [1, 2, 3] from 0 using + has a call stack that looks something like this:

(Recall that this is a left fold, i.e. the return value of the combine function is passed to the next iteration.)

Crucially, a flatten map of e.g. [[], [1, 2], [], [3], []] has to have the exact same call stack. That means that the ReducerOf must be able to produce the continuation for any point during its (internal) reduction of the nested collections. It needs to pass enough state forwards to:

Thus, the initial call to reduce must call the combining function with 1, and then ReducerOf must call recur with a continuation suitable for reducing from 2, i.e. after the first element in the second element of the outer collection.

We can see why failing to do so causes such interesting test failures. Producing a Stream performs the recurrences lazily using Memo, flattening the previously recursive call stack down to iteration:

Getting some combination of ordering & recurrence wrong will, trivially, produce an entirely incorrect Stream.

robrix commented 9 years ago

The actual control flow for ReducerOf is, naturally, more complicated. It involves the reducible which it itself is implemented over. For example:

let inner: Stream<Stream<Int>> = Stream([ Stream([ 1, 2 ]), .Nil, .unit(3)])
let reducer = ReducerOf(inner, id) // straightforward flatten map

A naïve implementation of reducer will simply reduce each element of inner passing in the initial value and the combining function. However, that results in a call stack that looks like this:

(Bolded lines indicate calls to the caller’s combining function; these are the points at which we have the values that must be passed to the caller’s recur function.)

Being nonlinear, we can’t simply memoize the continuation. We want to linearize this, such that the recurrences guard all further traversal:

/cc @mdiep Markdown rendering has omitted the previous line :point_up:

That looks like what we want. So how can we achieve this?

We need to pass the continuation to the flattening reduction of inner’s elements such that it is passed to recur.

robrix commented 9 years ago

We don’t have any control over the initial parameter or its type. We have extremely limited control over the combine parameter, but not its type. We have control over the collection parameter.

However, the continuation depends on the Result type, which is a type parameter of the reducer() method. So we can’t pass the continuation in as (part of) the collection parameter, because that would require the type to be parameterized by Result. We need quantifiers for this.

robrix commented 9 years ago

@jckarter notes that higher-ranked function types can be emulated by factoring the function into a method on a polymorphic type:

mdiep commented 9 years ago

Markdown rendering has omitted the previous line

You're hitting the depth limit for markdown.

robrix commented 9 years ago

Aha, thank you! :bow: Really weird that it does the bullet for it anyway.

robrix commented 9 years ago

This should not have been closed.

robrix commented 9 years ago

Streams capture the current continuation as well, in a sense, by memoization. This yields a function () -> T which performs the remaining computation. If we can e.g. capture the remaining work in a Memo and pass provide that as an argument to recur, then we don’t have to worry about integrating the Result type into the type of the collection argument.

robrix commented 9 years ago

Heh, and this time I forgot to close it from the pull request.

Fixed by #34.