hfour / yads

An array-like data structure with a monoidal cache
8 stars 4 forks source link

Investigate the possibility to further generalize folds #2

Open bor0 opened 6 years ago

bor0 commented 6 years ago

Explanation

As can be seen in #1, the current design only considers a subset of folding functions, since it was built (with partial folds in mind) upon a specific use case (see https://gist.github.com/spion/4b31ec396c4cbfebede55558f6238891).

This can be a bit limiting in some cases, and it'd be interesting to generalize this library further to accept something more than just Monoids.

Why?

Because it's fun, and maybe the use cases will appear afterwards :)

Reproduction steps

const HundredMinus: MonoidObj<number> = Object.freeze({
    operation: (res, acc) => res - acc,
    identity: 100,
    getCacheValue: (leaf) => leaf.data
});

const data = [1, 3, 5];
const tree = fromArray(data);

console.log(tree.getField(HundredMinus)); // prints 91, OK

insert(tree, 3, [7]);

printtree(tree);
/* prints:
     R___
    /    \
   a      b
  / \    / \
 1   3  5   7
*/

console.log(tree.getField(HundredMinus)); // prints -84 instead of 84

Further explanation

I would expect the code above to work similarly as if I were folding a list. An example:

> (foldl (lambda (el res) (- res el)) 100 '(1 3 5))
91
> (foldl (lambda (el res) (- res el)) 100 '(1 3 5 7))
84

Potential solution

Let's re-consider the case from #1:

      R___
     /    \
    a      b
   / \    / \
  e1  e2 e3  e4

For the operation - we have that pf(a) = e1 - e2 and pf(b) = e3 - e4, but pf(a) - pf(b) = e1 - e2 - e3 + e4, which is different from pf(R) = e1 - e2 - e3 - e4.

If, however, we had a way to (optionally) specify an order relation for the elements, we could fix this issue. In the example above, if we consider the orders e4 > e3 > e2 > e1, then we will have:

  1. pf(a) = e2 - e1
  2. pf(b) = e4 - e3
  3. pf(R) = e4 - e3 - e2 - e1

There seems to be some related discussion here.

spion commented 6 years ago

I know I gave it as an example, but minus is not really the best one, because you can achieve the same with the equivalent associative operation + and getFieldValue: leaf => 0 - leaf.data

Also, the identity element isn't really an identity anymore :)

The example non-associative fold is equivalent to 100 - tree.getField(associativeMinusMonoid) where

{
    operation: (res, acc) => res + acc,
    identity: 0,
    getCacheValue: (leaf) => 0 - leaf.data
}

I have a hunch that only folds that can be equivalented by a monoid fold (plus some extra optional operation) can be cached.

bor0 commented 6 years ago

The identity element might not be used at all when the datastructure has > 0 elements, resulting with 1 - 3 - 5 - 7 = -14

Why is this the case? Folds have an accumulator and initial/last value that they build up to. In which case would the identity not be used?

I have a hunch that only folds that can be equivalented by a monoid fold (plus some extra optional operation) can be cached.

This is something I thought of, too. If we can prove that, this issue can be closed :D