Open andrewthad opened 8 years ago
I imagine using Monad
here is intentional, since the mapM
function from Traversable
deliberately uses a Monad
constraint. You're right that this could be weakened on base-4.8
or later, but for backwards compatibility reasons it'd be prudent to keep around variants that use Monad
rather than Applicative
, since there is still some old code floating around that declares Monad
instances but not Applicative
ones. Similarly for imapM
, forM
, and sequence
.
That being said, Data.Traversable
does contain Applicative
counterparts to all of these functions (sequenceA
vs. sequence
, traverse
vs. mapM
, for
vs. forM
). It doesn't look like the vector
library implements them directly (aside from this inefficient implementation), so it might be prudent to add them.
It would require changes in stream fusion framework. Currently streams use Monad for sequencing of effects
Thanks for the feedback from both of you. It seems like weakening the constraints on the existing functions would be bad for multiple reasons. Would introducing something like itraverse
, which I am under the impression would be incompatible with the stream fusion framework, be alright? That is, is there a rule in place that everything that operates on immutable vectors must have a streaming rewrite, or do some function lack this?
It might not be possible because of the current fusion rep. If you can weaken the internals to get by with applicative it may be possible
On Thursday, July 28, 2016, Andrew Martin notifications@github.com wrote:
Thanks for the feedback from both of you. It seems like weakening the constraints on the existing functions would be bad for multiple reasons. Would introducing something like itraverse, which I am under the impression would be incompatible with the stream fusion framework, be alright? That is, is there a rule in place that everything that operates on immutable vectors must have a streaming rewrite, or do some function lack this?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132#issuecomment-235897160, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwp35xul8A7tVfmazMUY6meEOHwiKks5qaLEwgaJpZM4JXL0F .
@cartazio Sorry I wasn't clear. What I was asking is if it's alright to have a function that operates on immutable vectors that does not fuse.
Boxed vectors have traverse already right? What would this applicative allow for the other formats? I'm a tad slow this morning so please expound.
On Thursday, July 28, 2016, Andrew Martin notifications@github.com wrote:
@cartazio https://github.com/cartazio Sorry I wasn't clear. What I was asking is if it's alright to have a function that operates on immutable vectors that does not fuse.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132#issuecomment-235907498, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwroly50Zz64ZrGgS0ygzkISfkws0ks5qaLmvgaJpZM4JXL0F .
The original reason I brought this up was because the indexed traversal imapM
has a Monad
constraint. It would be nice to have itraverse
, which would weaken that to Applicative
.
To be clear, I think there's two separate issues here:
The current fusion representation is not amenable to implementing traverse :: Applicative m => (a -> m b) -> Vector a -> m (Vector b)
(a.k.a. mapM
with its Monad
constraint relaxed to Applicative
) natively, and as a consequence we can't implement itraverse
natively either. To see why, consider the current definition of mapM
in Data.Vector.Fusion.Streaming.Monadic
:
mapM :: Monad m => (a -> m b) -> Stream m a -> Stream m b
mapM f (Stream step t) = Stream step' t
where
step' s = do
r <- step s
case r of
Yield x s' -> liftM (`Yield` s') (f x)
Skip s' -> return (Skip s')
Done -> return Done
You'd need to somehow figure out how to make this Applicative
. I can't see a way to do this (that doesn't involve changing the internals of Step
/Stream
somehow, which I don't feel qualified to do).
vector
doesn't export an standalone traverse
/itraverse
functions. To be honest, I don't see this as a problem, since the Traversable
instance for Vector
gives you an (inefficient) traverse
, and it's straightforward to adopt that implementation for other Vector
s. Similarly, you can get an (inefficient) itraverse
via the TraversableWithKey Vector
instance from vector-instances
(TraversableWithKey
comes from keys
) or the TraversableWithIndex
instance for Vector
from lens
.Thanks for explaining the situation @RyanGlScott! :)
Are there any other array-like data structures that would offer an efficient itraverse
? The TraverseWithKey
instance for Array
doesn't look so great either…
@sjakobi I am not sure why traverse
doesn't get added to vector
, I personally don't see anything preventing it from happening. Current implementation of mapM
goes through a list, so the same can be done for traverse. Here is how mapM
is implemented
mapM f = unstreamM . Bundle.mapM f . stream
unstreamM s = do
xs <- MBundle.toList s
return $ unstream $ Bundle.unsafeFromList (MBundle.size s) xs
So here is a traverse
that does the same thing and has the same performance as mapM
. Sure, it probably breaks fusion, but so does indexing into a vector.
traverseVector :: (Applicative f, Vector v1 a, Vector v2 b) => (a -> f b) -> v1 a -> f (v2 b)
traverseVector f v =
unstream . Bundle.unsafeFromList (MBundle.size s) <$>
Traversable.traverse f (unId (MBundle.toList s))
where
s = stream v
Side note - I'd submit a PR with this, but development process is pretty weird in vector
, don't even know which branch to submit a PR to.
To answer your question
Are there any other array-like data structures that would offer an efficient itraverse?
massiv
has an itraverseA
for multidimensional arrays.primitive
that is a dependency of vector has an 'itraversePrimArray`That all being said:
Traversable
(or FoldableWithKey
for that matter) instance, because it is defined forall
elements, but elements in vector have restrictions on them (i.e. Unbox
, Prim
and Storable
)Applicative
than in a PrimMonad
which can mutate an array while doing traversingErr side note, I’ll have time to track master etc back into sync this holiday season. This fall has had a number of other commitments. It’s a valid problem and I’ve not had time to rejigger stuff since the bug fix release I did last time.
I’d like to avoid providing operations that break fusion all things being equal for operations that get used in point free / compose operator oriented code.
As for the primitive package, primitive has zero fusion aside from one rule which should be reverted in the near future.
If we’re adding operators that don’t fuse, then documenting how to have users reason about fusion is really important. But I don’t feel that there’s enough bandwidth or current literacy on fusion engineering for that to be a low hanging fruit.
I do hope to spend some time this winter experimenting with ways to make the rewrite rules tools for ghc more reliable / easy to reason about with vector in mind. But that veers into research engineering which is always unpredictable.
On Fri, Nov 29, 2019 at 7:41 PM Alexey Kuleshevich notifications@github.com wrote:
@sjakobi https://github.com/sjakobi I am not sure why traverse doesn't get added to vector, I personally don't see anything preventing it from happening. Current implementation of mapM goes through a list, so the same can be done for traverse. Here is how mapM is implemented
mapM f = unstreamM . Bundle.mapM f . stream
unstreamM s = do xs <- MBundle.toList s return $ unstream $ Bundle.unsafeFromList (MBundle.size s) xs
So here is a traverse that does the same thing and has the same performance as mapM. Sure, it probably breaks fusion, but so does indexing into a vector.
traverseVector :: (Applicative f, Vector v1 a, Vector v2 b) => (a -> f b) -> v1 a -> f (v2 b) traverseVector f v = unstream . Bundle.unsafeFromList (MBundle.size s) <$> Traversable.traverse f (unId (MBundle.toList s)) where s = stream v
Side note - I'd submit a PR with this, but development process is pretty weird in vector, don't even know which branch to submit a PR to.
To answer your question
Are there any other array-like data structures that would offer an efficient itraverse?
- massiv https://github.com/lehins/massiv has an itraverseA https://www.stackage.org/haddock/nightly-2019-11-29/massiv-0.4.4.0/Data-Massiv-Array.html#v:itraverseA for multidimensional arrays.
- Hell, even primitive that is a dependency of vector has an 'itraversePrimArray` https://www.stackage.org/haddock/nightly-2019-11-29/primitive-0.7.0.0/Data-Primitive-PrimArray.html#v:itraversePrimArray
That all being said:
- you can't have a Traversable (or FoldableWithKey for that matter) instance, because it is defined forall elements, but elements in vector have restrictions on them (i.e. Unbox, Prim and Storable)
- traversing is significantly (x50) slower with Applicative than in a PrimMonad which can mutate an array while doing traversing
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132?email_source=notifications&email_token=AAABBQWLVUAGQXI6U7MAP6TQWGZD7A5CNFSM4CK4XUC2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFPWKSY#issuecomment-559899979, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAABBQRLCTOLWSDNBBE3JR3QWGZD7ANCNFSM4CK4XUCQ .
@cartazio If you wanna write it in a way such that it fuses similarly to how mapM
does, then here is something that should work.
traverseVector f =
(\s -> unstream . Bundle.unsafeFromList (MBundle.size s) <$>
Traversable.traverse f (unId (MBundle.toList s))) . stream
As far as the input goes it should fuse (i.e stream . unstream
rule), but for the output, I don't think any fusion is possible, that is why mapM
output isn't fused either, I believe. Consider traversing with the function (a -> Maybe b)
, then you can't know if a new stream will be produced or will it be a Nothing
until you applied that function to the last element of the input vector.
As far as primitive
and massiv
, the approach for traversal avoids any lists, but that doesn't make it faster or anything, it's just a different way of implementing it. I noted them just to answer the @sjakobi's question about other array libraries.
Thanks! I really appreciate your remarks on this and other matters. I’ve not been the best at following up on some matter of late, and your participation really helps ! :)
On Sat, Nov 30, 2019 at 8:59 AM Alexey Kuleshevich notifications@github.com wrote:
@cartazio https://github.com/cartazio If you wanna write it in a way such that it fuses similarly to how mapM does, then here is something that should work.
traverseVector f = (\s -> unstream . Bundle.unsafeFromList (MBundle.size s) <$> Traversable.traverse f (unId (MBundle.toList s))) . stream
As far as the input goes it should fuse (i.e steam . unstream rule), but for the output, I don't think any fusion is possible, that is why mapM output isn't fused either, I believe. Consider traversing with the function (a -> Maybe b), then you can't know if a new stream will be produced or will it be a Nothing until you applied that function to the last element of the input vector.
As far as primitive and massiv, the approach for traversal avoids any lists, but that doesn't make it faster or anything, it's just a different way of implementing it. I noted them just to answer the @sjakobi https://github.com/sjakobi's question about other array libraries.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/132?email_source=notifications&email_token=AAABBQT5VF4U2KGEZA5OYJLQWJWU5A5CNFSM4CK4XUC2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFQHWUQ#issuecomment-559971154, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAABBQT3BKD7PY7FILSQILDQWJWU5ANCNFSM4CK4XUCQ .
The type signatures of all of the monadic mapping functions (mapM, imapM, etc) have a
Monad
constraint onm
. However, this can be weakened toApplicative
. I do not know if this causes degradation of performance for certainMonad
s though.