uncomplicate / fluokitten

Category theory concepts in Clojure - Functors, Applicatives, Monads, Monoids and more.
http://fluokitten.uncomplicate.org
Eclipse Public License 1.0

`fold` breaks on large vectors? #20

Closed ysmiraak closed 7 years ago

ysmiraak commented 7 years ago

With version 0.5.1 on clojure version 1.8.0, I encountered this behavior:

(fold (range 99))        => 4851
(fold (range 999))       => 498501
(fold (vec (range 99)))  => 4851
(fold (vec (range 999))) => 0

(fold (vec (range 512))) => 130816
(fold (vec (range 513))) => 0

It seems that fold breaks for vectors with more than 512 elements?
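(For context, a likely explanation of the 512 boundary: clojure.core.reducers/fold uses a default partition size of 512, so larger vectors take the parallel path, where the reducing function is called with no arguments to supply a seed for each chunk. Plain r/fold handles the same input, because (+) returns 0:

(require '[clojure.core.reducers :as r])

(r/fold + (vec (range 513))) => 131328

so the breakage presumably lies in how fold was invoking r/fold.)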

blueberry commented 7 years ago

Thank you for catching this bug. I found the source and will fix it in the next release, which is due in approximately a month.

ysmiraak commented 7 years ago

Thanks for letting me know. Looking forward to the update :)

blueberry commented 7 years ago

@ysmiraak 0.5.2 is on Clojars. Please let me know whether it works well for you now.

ysmiraak commented 7 years ago

Yes, it works well now. Thanks for the early release.

ysmiraak commented 7 years ago

I was reading the changes you made, trying to understand what happened, and I noticed that your fold for one collection calls r/fold. I'm quite excited, because this means I could replace reducers with fluokitten. However, while thinking about the differences between reduce, fold, and r/fold, I found this inconsistency:

This example from one of the guides, (fmap + {:a 1 :b 2} {:a 3 :b 4 :c 5} {:a 6 :b 7 :c 8}) => {:a 10, :b 13, :c 13}, made me think that I could replace merge-with with fmap.

(def m+ [{:a 1 :b 2} {:a 3 :b 4 :c 5} {:a 6 :b 7 :c 8}])

(apply (partial merge-with +) m+)     => {:a 10, :b 13, :c 13}
(reduce (partial merge-with +) m+)    => {:a 10, :b 13, :c 13}
(reduce (partial merge-with +) {} m+) => {:a 10, :b 13, :c 13}

(apply (partial fmap +) m+)     => {:a 10, :b 13, :c 13}
(reduce (partial fmap +) m+)    => {:a 10, :b 13, :c 13}
(reduce (partial fmap +) {} m+) => {:a 10, :b 13, :c 13}

And that replacement works for reduce, but not for fold.

(fold (partial merge-with +) m+)    => {:a 10, :b 13, :c 13}
(fold (partial merge-with +) {} m+) => {:a 10, :b 13, :c 13}

(fold (partial fmap +) m+)    => #function[uncomplicate.fluokitten.algo/function-fmap/fn--25003]
(fold (partial fmap +) {} m+) => IllegalArgumentException

The problem is that fold in 0.5.2 calls (r/fold reducef coll), which calls (reducef) with no arguments to obtain the identity element. While (merge-with +) produces nil, which later merge-with calls treat the same as {}, (fmap +) behaves like currying and returns a function.
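To make the difference concrete, here are the zero-arity calls that r/fold performs to get its seed (a small sketch; results described rather than printed):

((partial merge-with +)) => nil, which later merge-with calls treat like {}
((partial fmap +))       => a function (the curried form), not a map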

But I don't think fmap should be changed.

Instead of this definition in the algo namespace,

(defn collection-fold
  ...
  ([c f init]
   (f init (r/fold f c)))
  ...)

if we simply provide a reducef whose zero-arity returns the init value, as r/fold requires,

(letfn [(collection-fold [c f init]
          (r/fold (fn ([] init)
                    ([x y] (f x y)))
                  c))]
  (collection-fold m+ (partial fmap +) {}))
=> {:a 10, :b 13, :c 13}

then (fold (partial fmap +) {} m+) won't produce IllegalArgumentException anymore.

However, (fold (partial fmap +) m+) with no init value still produces a closure instead of the "expected" result, because (fmap +) is used here as init, which is not the identity element for m+.

But I think this is the right behavior, because anyone who calls (fold f x) must make sure that f is a monoid operator with its own identity element; otherwise, they should just call (fold f init x).

PS: The same problem exists for foldmap.

PPS: I was worried that, since the reducef is likely to be called many times, wrapping it to provide the init value could be inefficient. But I benchmarked it with criterium and found no difference at all. The wrapper is probably optimized away by the JIT, so it shouldn't matter for large collections; and for small collections, r/fold is less efficient than reduce anyway.
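A sketch of the kind of comparison described above, using criterium (the input and names here are illustrative, not the original benchmark):

(require '[criterium.core :refer [quick-bench]]
         '[clojure.core.reducers :as r])

(def v (vec (range 1000000)))

;; bare reducef vs. a wrapper whose zero-arity supplies the init value
(quick-bench (r/fold + v))
(quick-bench (r/fold (fn ([] 0) ([acc x] (+ acc x))) v))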

ysmiraak commented 7 years ago

On second thought, maybe (fold f init x) should not call r/fold at all?

Because the init value is not necessarily the identity element of some monoid, and r/fold is only designed to work with monoids.

When f is not a monoid operator, maybe the behavior of fold should fall back to reduce?

blueberry commented 7 years ago

Thank you for investigating this; these are all valid and valuable insights. I optimized this code thoroughly a long time ago, so I'd need to think about the details of more exotic uses (such as this one). If you have time, it would be great if you could provide test cases that demonstrate the desired and undesired behavior, so that it is clear to me when I get to this. Also, thank you very much for benchmarking this - highly appreciated!

The tricky point is that all these things also need to work with functions as monads, and currying functions as monads (jvm namespace). Maybe some more tests and examples are needed. And definitely some more investigation of these edge cases.

The previous fix addressed the more immediate issue: a bug caused by incorrect usage of r/fold.

blueberry commented 7 years ago

When f is not a monoid operator, maybe the behavior of fold should fall back to reduce?

The problem is: how to know whether f is a monoid operator for the monoid that is inside the collection? On top of that, I suppose that every function could be a monoid operator for some monoid that we can imagine. The only function that will for sure be a monoid operator for a given monoid is op (provided that there are no bugs in its implementation :), but even op needs to know which monoid is the context we are working with. An additional hurdle is that the requirement that (f) should give the id element is an implementation convention of r/fold, which I am not sure is required generally.
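(For reference, a sketch of op and id as I understand fluokitten's defaults: numbers form a monoid under + and vectors under concatenation. The numeric case is the one the discussion below relies on.)

(require '[uncomplicate.fluokitten.core :refer [op id]])

(op 1 2)       => 3
(id 1)         => 0
(op [1] [2 3]) => [1 2 3]
(id [1 2 3])   => []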

ysmiraak commented 7 years ago

The tricky point is that all these things also need to work with functions as monads, and currying functions as monads (jvm namespace). Maybe some more tests and examples are needed.

I'm just starting to learn category theory, so it'll take some time for me to understand how these things should work with monads. In fact, I'm trying to learn CT with fluokitten. I really like the API design and how it makes an effort to blend in with Clojure. I'll try to collect some test cases as I learn, but for now, here is what I think about the problem:

I suppose that every function could be a monoid operator for some monoid that we can imagine.

I think a tree-like fold cannot be said to always operate within a monoid: consider (reduce conj [] (range 9)).

conj wouldn't be a monoid operator because it's not associative: (conj (conj [] 0) 1) is not the same as (conj [] (conj 0 1)).
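Evaluating both groupings makes this concrete:

(conj (conj [] 0) 1) => [0 1]
(conj [] (conj 0 1)) => ClassCastException, since 0 is not a collection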

The problem is: how to know whether f is a monoid operator for the monoid that is inside the collection?

I guess we simply can't. Clojure has three kinds of folds:

  1. monoid-fold: (r/fold f coll);
  2. left-fold: (reduce f coll);
  3. tree-fold: (reduce f init coll).

Each one is more general than the previous one.
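Concretely, the three call shapes look like this (a small sketch, not part of the original comment):

(require '[clojure.core.reducers :as r])

(r/fold + (vec (range 10))) => 45        ; 1. (+) supplies the identity element
(reduce + (range 10))       => 45        ; 2. seeded by the first element
(reduce conj [] (range 3))  => [0 1 2]   ; 3. explicit init value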

And your fold is even more general, allowing f to be any Lisp function instead of a binary operator. I really love this, but I guess integrating the Lisp formalism with CT is uncharted territory, so essentially it comes down to how you wish your fold to behave:

a. As the canonical monoid-fold. f has to be a monoid operator, and the arity overloading is simply sugar.

b. As a swiss-army-knife fold: combining different kinds of folds through arity overloading.

Clearly, plan a is easier to get right, because it's simply the sugared version of Haskell's fold. Plan b would be experimental but exciting; it seems redundant, considering we already have reduce and r/fold, but if it can be well-defined, it could replace both, and that would be a great success in integrating Lisp with CT. The challenge, however, as you said, is how to make it work properly with the other parts of your library.

An additional hurdle is that the requirement that (f) should give the id element is an implementation convention of r/fold, which I am not sure is required generally.

I agree. But I also agree with the implementation of r/fold, which requires the identity element to be bundled with either reducef or combinef, because that (sort of) guarantees correctness when the computation is parallelized. If r/fold allowed an init value to be specified, people might not realize that the init has to be the identity element.
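(A sketch of why bundling the identity with the reducing function matters in the parallel case: a non-identity seed is counted once per chunk, so the result would depend on how r/fold happens to split the work.)

(require '[clojure.core.reducers :as r])

;; simulating a non-identity "init" via the zero-arity convention
(r/fold (fn ([] 100) ([acc x] (+ acc x))) (vec (range 1000)))
;; => 499500 plus 100 per chunk, rather than the 499600 a single init would give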

That is why I said that maybe (fold f init x) should not call r/fold, unless you prefer plan a, in which case your fold requires f to be a monoid operator and init to be the identity element.

blueberry commented 7 years ago

conj wouldn't be a monoid operator because it's not associative:

If I imagine a monoid that consists of empty vectors, conj is very much associative. That is the catch: it all has to work for the trivial edge cases, too.

init does not have to be the identity element. If it did, it would be unnecessary. The intended use case is the same as in reduce: (fold + 4 [1 2 3]) => 10. The only requirement is that init belongs to the same monoid as the context of the collection (which is also a monoid :)

blueberry commented 7 years ago

Maybe the simplest solution would be to add the [x f] arity to the fold protocol, and let the monoid in question (in this case, collection) decide how to handle that. Then we wouldn't have this problem of how to find out what the id element is: every implementation would be able to discover it using its internal knowledge. What do you think about that?
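(A purely hypothetical sketch of the idea; HypotheticalFoldable and fold* are made-up names, not fluokitten's actual protocol.)

(require '[uncomplicate.fluokitten.core :refer [op id]])

(defprotocol HypotheticalFoldable
  (fold* [x] [x f] [x f init]))

(extend-protocol HypotheticalFoldable
  clojure.lang.IPersistentVector
  (fold*
    ([x] (fold* x op))
    ([x f] (fold* x f (id (first x))))   ; the implementation discovers the id itself
    ([x f init] (reduce f init x))))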

ysmiraak commented 7 years ago

Maybe the simplest solution would be to add the [x f] arity to the fold protocol, and let the monoid in question (in this case, collection) decide how to handle that.

Wouldn't this make (fold * [1 2 3]) return 0? Because (-> [1 2 3] first id) is 0.
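Evaluated directly (a small sketch):

(require '[uncomplicate.fluokitten.core :refer [id]])

(id (first [1 2 3])) => 0
(reduce * 0 [1 2 3]) => 0   ; seeding a * fold with 0 collapses the result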

blueberry commented 7 years ago

Hm, yes, I forgot about that :) It seems I made those decisions for a reason back when I wrote that code :)

blueberry commented 7 years ago

I think I've found an elegant and general solution to this. I'll implement it when I get back.

blueberry commented 7 years ago

Changed in the latest 0.6.0-SNAPSHOT.