lehins / massiv

Efficient Haskell Arrays featuring Parallel computation
BSD 3-Clause "New" or "Revised" License
385 stars 23 forks source link

generateM is broken for monads encoding nondeterminism #24

Closed Shimuuar closed 6 years ago

Shimuuar commented 6 years ago

Illustration is quite simple:

λ> import qualified Data.Massiv.Array as A
λ> xs = (A.generateM A.Seq 3 (\i -> [0..i]) :: [A.Array A.U Int Int])
λ> xs
[(Array U Seq (3)
  [ 0,0,0 ]),(Array U Seq (3)
  [ 0,0,1 ]),(Array U Seq (3)
  [ 0,0,2 ]),(Array U Seq (3)
  [ 0,1,0 ]),(Array U Seq (3)
  [ 0,1,1 ]),(Array U Seq (3)
  [ 0,1,2 ])]
λ> xs
[(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ]),(Array U Seq (3)
  [ 0,1,2 ])]

As it could be seen all vectors share same buffer so it becomes overwritten. I thing only possible fix is to strengthen Monad constraint to PrimMonad and use it to thread state token.

Shimuuar commented 6 years ago

Yes proposed fix seems to fix problem for list monad but it relies on particular order of list traversal and thus very fragile. So array updates and copying of buffer are interleaved just in correct way so for example with A.generateM A.Seq 3 (\i -> [0..1]) we'll have following sequence of evaluation:

buf[0] = 0
buf[1] = 0
buf[2] = 0
freeze!
buf[2] = 1
freeze!
buf[1] = 1
buf[2] = 0
freeze!
buf[2] = 1
freeze!
...

Thus buffer updates do not interfere with each other since they're done in stack-like manner. But should we change order of traversal of possible lists it's no longer the case! Should we take for example Omega monad fix will break. It's not proper monad since associativity only holds up to permutations of resulting list. Still I think massiv should break on such monads

λ> A.generateM A.Seq 3 (\i -> Omega [0..1]) :: Omega (A.Array A.P Int Int)
Omega {runOmega = [(Array P Seq (3)
  [ 0,0,0 ]),(Array P Seq (3)
  [ 1,0,0 ]),(Array P Seq (3)
  [ 1,0,0 ]),(Array P Seq (3)
  [ 1,1,0 ]),(Array P Seq (3)
  [ 1,1,0 ]),(Array P Seq (3)
  [ 1,1,1 ]),(Array P Seq (3)
  [ 1,1,1 ]),(Array P Seq (3)
  [ 1,1,1 ])]}

All in all it seems that only solution that works for any monad will require accumulation of results in some persistent data structure as opposed to ephemeral (aka mutable) buffer and corresponding performance hit. Another possibility is to constrain monad enough so we'll have only single sequence of events (IO, ST, StateT, Reader, Identity, what else?) However it's not clear to me how to attain that

Shimuuar commented 6 years ago

I thought about this problem a bit more and this problem requires systematic treatment. Basically what we're doing is working in STT monad transformer:

newtype STT m a = STT (State# s → m (State# s, a))

In this formulation problem becomes much more obvious: in monads supporting nondeterminism state token could be reused which will cause problems with mutable state. So it's only safe to use this trick with monads where token will be used at most once: state, writer, reader, error. Question is how to express it

lehins commented 6 years ago

I'll close this issue in favor of #52, since previous buggy implementation has been disabled and needs a complete rewrite.

lehins commented 5 years ago

I think I found a solution to this problem, but I am still not 100% confident. I posted a minimal example on stackoverflow: Is it safe to interleave manual realWorld# state passing with an arbitrary Monad. @Shimuuar, I'd really appreciate your input if you get a chance to take a look at it.

Regardless of the answer, doing mapM through a list (converting an array to a list, mapM over the list and convert it back to array) seems to result in decent performance thanks to fusion, so that we'll be a fallback implementation

Shimuuar commented 5 years ago

I think it should work. Essentially you build a closure which will write into supplied buffer in the end. Very similar to expressing foldr via foldl. One question is performance since it's not too different from using lists as intermediate. You still build persistent data structure only in form of closures instead of lists

I said it should work but I'm not 100% sure. Maybe there's some really perverted monad where this approach will break

lehins commented 5 years ago

That was precisely my thinking. I guess, we'll not know for 100% until we put it out into the wild and let the world find out, if there is such perverted monad (applicative) out there :) As far as performance goes, I did benchmark it against using lists and it is about 2 times faster, but only for ghc-8.0 and 8.2, not the latest ones. Important part is that it is never slower.

lehins commented 5 years ago

@Shimuuar Thanks for checking up on it!