haskell / vector

An efficient implementation of Int-indexed arrays (both mutable and immutable), with a powerful loop optimisation framework .
Other
365 stars 139 forks source link

Weaken constraint on mapM #132

Open andrewthad opened 8 years ago

andrewthad commented 8 years ago

The type signatures of all of the monadic mapping functions (mapM, imapM, etc) have a Monad constraint on m. However, this can be weakened to Applicative. I do not know if this causes degradation of performance for certain Monads though.

RyanGlScott commented 8 years ago

I imagine using Monad here is intentional, since the mapM function from Traversable deliberately uses a Monad constraint. You're right that this could be weakened on base-4.8 or later, but for backwards compatibility reasons it'd be prudent to keep around variants that use Monad rather than Applicative, since there is still some old code floating around that declares Monad instances but not Applicative ones. Similarly for imapM, forM, and sequence.

That being said, Data.Traversable does contain Applicative counterparts to all of these functions (sequenceA vs. sequence, traverse vs. mapM, for vs. forM). It doesn't look like the vector library implements them directly (aside from this inefficient implementation), so it might be prudent to add them.

Shimuuar commented 8 years ago

It would require changes in stream fusion framework. Currently streams use Monad for sequencing of effects

andrewthad commented 8 years ago

Thanks for the feedback from both of you. It seems like weakening the constraints on the existing functions would be bad for multiple reasons. Would introducing something like itraverse, which I am under the impression would be incompatible with the stream fusion framework, be alright? That is, is there a rule in place that everything that operates on immutable vectors must have a streaming rewrite, or do some function lack this?

cartazio commented 8 years ago

It might not be possible because of the current fusion rep. If you can weaken the internals to get by with applicative it may be possible

On Thursday, July 28, 2016, Andrew Martin notifications@github.com wrote:

Thanks for the feedback from both of you. It seems like weakening the constraints on the existing functions would be bad for multiple reasons. Would introducing something like itraverse, which I am under the impression would be incompatible with the stream fusion framework, be alright? That is, is there a rule in place that everything that operates on immutable vectors must have a streaming rewrite, or do some function lack this?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132#issuecomment-235897160, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwp35xul8A7tVfmazMUY6meEOHwiKks5qaLEwgaJpZM4JXL0F .

andrewthad commented 8 years ago

@cartazio Sorry I wasn't clear. What I was asking is if it's alright to have a function that operates on immutable vectors that does not fuse.

cartazio commented 8 years ago

Boxed vectors have traverse already right? What would this applicative allow for the other formats? I'm a tad slow this morning so please expound.

On Thursday, July 28, 2016, Andrew Martin notifications@github.com wrote:

@cartazio https://github.com/cartazio Sorry I wasn't clear. What I was asking is if it's alright to have a function that operates on immutable vectors that does not fuse.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132#issuecomment-235907498, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwroly50Zz64ZrGgS0ygzkISfkws0ks5qaLmvgaJpZM4JXL0F .

andrewthad commented 8 years ago

The original reason I brought this up was because the indexed traversal imapM has a Monad constraint. It would be nice to have itraverse, which would weaken that to Applicative.

RyanGlScott commented 8 years ago

To be clear, I think there's two separate issues here:

  1. The current fusion representation is not amenable to implementing traverse :: Applicative m => (a -> m b) -> Vector a -> m (Vector b) (a.k.a. mapM with its Monad constraint relaxed to Applicative) natively, and as a consequence we can't implement itraverse natively either. To see why, consider the current definition of mapM in Data.Vector.Fusion.Streaming.Monadic:

    mapM :: Monad m => (a -> m b) -> Stream m a -> Stream m b
    mapM f (Stream step t) = Stream step' t
    where
      step' s = do
                  r <- step s
                  case r of
                    Yield x s' -> liftM  (`Yield` s') (f x)
                    Skip    s' -> return (Skip    s')
                    Done       -> return Done

    You'd need to somehow figure out how to make this Applicative. I can't see a way to do this (that doesn't involve changing the internals of Step/Stream somehow, which I don't feel qualified to do).

  2. vector doesn't export an standalone traverse/itraverse functions. To be honest, I don't see this as a problem, since the Traversable instance for Vector gives you an (inefficient) traverse, and it's straightforward to adopt that implementation for other Vectors. Similarly, you can get an (inefficient) itraverse via the TraversableWithKey Vector instance from vector-instances (TraversableWithKey comes from keys) or the TraversableWithIndex instance for Vector from lens.
sjakobi commented 4 years ago

Thanks for explaining the situation @RyanGlScott! :)

Are there any other array-like data structures that would offer an efficient itraverse? The TraverseWithKey instance for Array doesn't look so great either…

lehins commented 4 years ago

@sjakobi I am not sure why traverse doesn't get added to vector, I personally don't see anything preventing it from happening. Current implementation of mapM goes through a list, so the same can be done for traverse. Here is how mapM is implemented

mapM f = unstreamM . Bundle.mapM f . stream

unstreamM s = do
                xs <- MBundle.toList s
                return $ unstream $ Bundle.unsafeFromList (MBundle.size s) xs

So here is a traverse that does the same thing and has the same performance as mapM. Sure, it probably breaks fusion, but so does indexing into a vector.

traverseVector :: (Applicative f, Vector v1 a, Vector v2 b) => (a -> f b) -> v1 a -> f (v2 b)
traverseVector f v =
  unstream . Bundle.unsafeFromList (MBundle.size s) <$>
  Traversable.traverse f (unId (MBundle.toList s))
  where
    s = stream v

Side note - I'd submit a PR with this, but development process is pretty weird in vector, don't even know which branch to submit a PR to.

To answer your question

Are there any other array-like data structures that would offer an efficient itraverse?

That all being said:

cartazio commented 4 years ago

Err side note, I’ll have time to track master etc back into sync this holiday season. This fall has had a number of other commitments. It’s a valid problem and I’ve not had time to rejigger stuff since the bug fix release I did last time.

I’d like to avoid providing operations that break fusion all things being equal for operations that get used in point free / compose operator oriented code.

As for the primitive package, primitive has zero fusion aside from one rule which should be reverted in the near future.

If we’re adding operators that don’t fuse, then documenting how to have users reason about fusion is really important. But I don’t feel that there’s enough bandwidth or current literacy on fusion engineering for that to be a low hanging fruit.

I do hope to spend some time this winter experimenting with ways to make the rewrite rules tools for ghc more reliable / easy to reason about with vector in mind. But that veers into research engineering which is always unpredictable.

On Fri, Nov 29, 2019 at 7:41 PM Alexey Kuleshevich notifications@github.com wrote:

@sjakobi https://github.com/sjakobi I am not sure why traverse doesn't get added to vector, I personally don't see anything preventing it from happening. Current implementation of mapM goes through a list, so the same can be done for traverse. Here is how mapM is implemented

mapM f = unstreamM . Bundle.mapM f . stream

unstreamM s = do xs <- MBundle.toList s return $ unstream $ Bundle.unsafeFromList (MBundle.size s) xs

So here is a traverse that does the same thing and has the same performance as mapM. Sure, it probably breaks fusion, but so does indexing into a vector.

traverseVector :: (Applicative f, Vector v1 a, Vector v2 b) => (a -> f b) -> v1 a -> f (v2 b) traverseVector f v = unstream . Bundle.unsafeFromList (MBundle.size s) <$> Traversable.traverse f (unId (MBundle.toList s)) where s = stream v

Side note - I'd submit a PR with this, but development process is pretty weird in vector, don't even know which branch to submit a PR to.

To answer your question

Are there any other array-like data structures that would offer an efficient itraverse?

That all being said:

  • you can't have a Traversable (or FoldableWithKey for that matter) instance, because it is defined forall elements, but elements in vector have restrictions on them (i.e. Unbox, Prim and Storable)
  • traversing is significantly (x50) slower with Applicative than in a PrimMonad which can mutate an array while doing traversing

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132?email_source=notifications&email_token=AAABBQWLVUAGQXI6U7MAP6TQWGZD7A5CNFSM4CK4XUC2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFPWKSY#issuecomment-559899979, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAABBQRLCTOLWSDNBBE3JR3QWGZD7ANCNFSM4CK4XUCQ .

lehins commented 4 years ago

@cartazio If you wanna write it in a way such that it fuses similarly to how mapM does, then here is something that should work.

traverseVector f =
  (\s -> unstream . Bundle.unsafeFromList (MBundle.size s) <$>
  Traversable.traverse f (unId (MBundle.toList s))) . stream

As far as the input goes it should fuse (i.e stream . unstream rule), but for the output, I don't think any fusion is possible, that is why mapM output isn't fused either, I believe. Consider traversing with the function (a -> Maybe b), then you can't know if a new stream will be produced or will it be a Nothing until you applied that function to the last element of the input vector.

As far as primitive and massiv, the approach for traversal avoids any lists, but that doesn't make it faster or anything, it's just a different way of implementing it. I noted them just to answer the @sjakobi's question about other array libraries.

cartazio commented 4 years ago

Thanks! I really appreciate your remarks on this and other matters. I’ve not been the best at following up on some matter of late, and your participation really helps ! :)

On Sat, Nov 30, 2019 at 8:59 AM Alexey Kuleshevich notifications@github.com wrote:

@cartazio https://github.com/cartazio If you wanna write it in a way such that it fuses similarly to how mapM does, then here is something that should work.

traverseVector f = (\s -> unstream . Bundle.unsafeFromList (MBundle.size s) <$> Traversable.traverse f (unId (MBundle.toList s))) . stream

As far as the input goes it should fuse (i.e steam . unstream rule), but for the output, I don't think any fusion is possible, that is why mapM output isn't fused either, I believe. Consider traversing with the function (a -> Maybe b), then you can't know if a new stream will be produced or will it be a Nothing until you applied that function to the last element of the input vector.

As far as primitive and massiv, the approach for traversal avoids any lists, but that doesn't make it faster or anything, it's just a different way of implementing it. I noted them just to answer the @sjakobi https://github.com/sjakobi's question about other array libraries.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132?email_source=notifications&email_token=AAABBQT5VF4U2KGEZA5OYJLQWJWU5A5CNFSM4CK4XUC2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFQHWUQ#issuecomment-559971154, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAABBQT3BKD7PY7FILSQILDQWJWU5ANCNFSM4CK4XUCQ .