Closed ocharles closed 1 year ago
The version in Data.Functor
can be no better than that one. Data.Traversable
can do rather more, in principle, using something like Data.Biapplicative.traverseBia
.
Ok, I am fine with the version I proposed. Are you saying it is not suitable for Data.Functor
?
I'm just saying that there are more and more options the more is known about the structure. The bare minimum version would fit very well in Data.Functor
, but I don't think I'd personally find it very useful.
It is also worth noting the existence of munzip
in Control.Monad.Zip
, which is almost always the same as the functor-based definition. Except it uses liftM
internally and inefficiently for historical reasons. (I was able to come up with a couple of monads for which it actually helped to have it in the class, but they are decidedly weird.) I'm not calling for concrete action around this fact, just raising awareness.
unzip
could be a member of Functor
type-class, for the same reason (<$)
is, there can be more efficient implementations: [a]
and NonEmpty a
come to mind immediately.
And as there is default implementation, it shouldn't be too intrusive change either.
I would move the function, and only then worry about making it a class member. Baby steps.
@phadej, I'm not sure making it a method magically solves the problems. The strictness and other properties of the most "obvious" unzip depend on the type, which makes a class method hard to reason about. Furthermore, one may want less-obvious unzips in certain cases. Ignoring the strictness of the outermost pair (because it's boring), there's just one unzip that makes sense for every Functor
, which is also boring. There are two additional ones that make sense for every Traversable
. Some types have sensible unzips that are none of these, such as lists (but, perhaps surprisingly, not Data.Sequence
sequences). Note that while the lazy Traversable
unzip has less space leak potential than the Functor
one, it will actually use significantly more memory in certain call site contexts.
@treeowl it doesn't need to solve all the problems. Is there drawbacks of unzip
being a class member?
No ordinary person will be able to choose from three+ different unzip
s for lists.
@phadej, I'm concerned for the poor programmer who has to implement the method. From a practical perspective, I think most applications/call sites will want one of three things.
Another concern: I don't actually think non-traversable functors necessarily have sensible unzips. For example, you can use the simple fmap
thing to "unzip" an IO
action, but that doesn't strike me as something very reasonable to do.
@treeowl
but that doesn't strike me as something very reasonable to do.
So are you against adding general
unzip :: Functor f => f (a,b) -> (f a, f b)
unzip xs = (fst <$> xs, snd <$> xs)
as it can be used with IO
?
unzipWith :: (c -> (a, b)) -> f c -> (f a, f b)
Is slightly more general, and composes better. (Think a variant for Compose
).
Perhaps it would help for me to supply a bit more motivation for why I want this? Here is a use of Data.List.NonEmpty.unzip
in my current work project (none of which are used on NonEmpty
lists):
let (onFailedToSaveImage, onSuccessfullySavedImage) =
unzip $
inputs.onSavedImage <&> \e ->
split $
e <&> \(image, res) -> do
case res of
Left e -> Left e
Right () -> Right image
Where the thing being unzip
ed here (inputs.onSavedImage ..
) is :: Cameras (Event SomeException, Event ())
. Cameras
is a two-tuple as a functor, and Event
is from reactive-banana
.
This isn't much, but it felt wrong to import Data.List.NonEmpty
for this, and I didn't want to depend on any other libraries or have to define unzip
to make this code work.
@phadej I don't think I'm against that... but I have doubts about a class method that doesn't make much sense for a pretty large category of instances. It makes me nervous, anyway. I'd be substantially less nervous about adding one for Traversable
.
But Traversable
isn't quite right either, is it, because that requires an ordering. Perhaps morally it belongs in some class between? Or does unzipping IO
make more sense than I'm giving it credit for?
Cameras
is a two-tuple as a functor
Do you mean type Camera a = (Something, a)
and you are using unzip :: (a, (b, c)) -> ((a, b), (a, c))
? Otherwise I don't understand what you mean by "two-tuple as a functor".
I would like to see more evidence that unzip
is used at general Functor
types before considering this for inclusion in base
(notwithstanding that the general version shouldn't have been added to Data.List.NonEmpty
in the first place).
Do you mean type Camera a = (Something, a) and you are using unzip :: (a, (b, c)) -> ((a, b), (a, c))? Otherwise I don't understand what you mean by "two-tuple as a functor".
No, sorry. I mean:
data Cameras a = Cameras{ upCamera, downCamera :: a }
I see, so you are using unzip
at a type isomorphic to unzip :: ((a, b), (a, b)) -> ((a, a), (b, b))
. This fits with @treeowl's hypothesis that it is sensible to require at least Traversable
. Perhaps it should be Traversable
and Applicative
. Then it's a sort of "transpose" operation, I think.
Other definitions of this function:
Thanks! Definitions are somewhat persuasive. Actual usages (particularly at the fully-general type) would be even more persuasive.
unzipWith :: (c -> (a, b)) -> f c -> (f a, f b)
Is slightly more general, and composes better. (Think a variant for Compose).
More general in that it can be specialized to unzip = unzipWith id
, but actually more specific when it comes to available instances.
There are types that can be unzipped but not mapped over. For instance, if there are invariants on the values or on the relationship between them.
data Monotonic a = Ord a => Monotonic (Vector a)
monotonic :: Vector a -> Maybe (Monotonic a)
monotonic = ...
unzip :: (Ord a, Ord b) => Monotonic (a,b) -> (Monotonic a, Monotonic b)
unzip (Monotonic a) = bimap Monotonic Monotonic (Vector.unzip a)
instance TypeError ('Text "Cannot perform arbitrary point-wise operations") => Functor Monotonic where
fmap = error "Impossible"
Ah hmm, that's actually a wrong example since Ord (a,b)
allows for the b
side to be non-monotonic.
Maybe a better example would be something like data RNF a = NFData a => RNF a
.
The idea being that a fully evaluated pair must have fully evaluated components, and we can't map over it via Functor
for the same reasons as Set
.
@alexfmpe Sure. But as you mention yourself, these types are not Functor
. So unzip x = (fst <$> x, snd <$> x)
default implementation wouldn't work for them. And having Unzip f
as a superclass of Functor
won't fly, as it breaks almost everything.
There is one potential complication with making it a Traversable
method: the default lazy implementation can't be written in Report Haskell without unsafeCoerce
—it seems to need either that or GADTs. The default fully eager implementation is likely only desirable in most circumstances for rigid-spined types (maps, hash maps, arrays, and such).
@treeowl which Traversable
default lazy implementation. You never wrote it down in this issue (nor described how it's different (& better?) than unzip x = (fst <$> x, snd <$> x)
@phadej ,
unzipWithLazy :: Traversable t => (a -> (b, c)) -> t a -> (t b, t c)
unzipWithLazy = Data.Biapplicative.traverseBia
The above assumes the Biapplicative (,)
instance in the master branch of bifunctors
. In the current Hackage version you'd need a bit of newtype
trickery to get the right behavior.
The advantage over the simple fmap
thing: for lazy-spined types, the two results are built "in parallel", (likely) avoiding leaking first components into second components or vice versa.
@treeowl is that the same as
-- | 'unzip' transforms a list of pairs into a list of first components
-- and a list of second components.
--
-- >>> unzip []
-- ([],[])
-- >>> unzip [(1, 'a'), (2, 'b')]
-- ([1,2],"ab")
unzip :: [(a,b)] -> ([a],[b])
{-# INLINE unzip #-}
-- Inline so that fusion `foldr` has an opportunity to fire.
-- See Note [Inline @unzipN@ functions] in GHC/OldList.hs.
unzip = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
when specialized t = []
? It's difficult to understand what traverseBia
does, and in fact bifunctors
tries to rewrite it (and bunch of other specializations respectively) to
-- Rewrite rules for traversing a few important types. These avoid the overhead
-- of allocating and matching on a Mag.
{-# RULES
"traverseBia/list" forall f t. traverseBia f t = traverseBiaList f t
#-}
traverseBiaList :: Biapplicative p => (a -> p b c) -> [a] -> p [b] [c]
traverseBiaList f = foldr go (bipure [] [])
where
go x r = biliftA2 (:) (:) (f x) r
That looks like specialized unzip
from Data.OldList
, doesn't it?
I fail to grasp a benefit of having unzipWithLazy = Data.Biapplicative.traverseBia
over unzipWith
as a member of Functor
. There aren't that many lazy-spined types, that special implementations for them cannot be written by hand, without overly-complicating things.
All this discussion just makes me think: let's just add what @ocharles proposed, and let you & me discuss options as our own proposals if we think if just having unzip
in Data.Functor
(as it's already in Data.List.NonEmpty
) is not good enough.
The original proposal is very simple. I suggest @ocharles to trigger a vote.
I don't actually think non-traversable functors necessarily have sensible unzips.
unzip :: (c -> (a,b)) -> (c -> a, c -> b)
seems perfectly sensible (even canonical) to me, despite (->) c
not being traversable.
@phadej , it's not quite the same, no. The traverseBia
one is lazier; unzip [undefined, undefined]
will produce ([undefined, undefined], [undefined, undefined])
rather than undefined
. I believe that semantically we have
unzipUsingTraverse xs = xs `seq` unzipUsingFmap xs
but they're operationally different. I agree this discussion should be continued elsewhere.
Similarly, there are lots of other representable functors that are not traversable that have sensible unzips:
Moore c (a,b) -> (Moore c a, Moore c b)
Mealy c (a, b) -> (Mealy c a, Mealy c b)
etc.
Then there are compositions of traversable things and representable things that are neither, which have sensible unzips, etc. etc. etc.
Functor
captures that sort of broader space. Admittedly if you move an overloadable method into the Functor
class you are making everyone pay for it and hoisting users on the horns of a dilemma of "do I increase the requirements on my Functor
instance or do I accept a suboptimal 'unzip'?"
@ekmett , I didn't think about the extra constraint issue. That's pretty nasty.
There's also Unzip (Event t)
instance in reflex
, with unzip = splitE
: https://hackage.haskell.org/package/reflex-0.8.2.1/docs/Reflex-Class.html#v:splitE
And that's actually the only use-case why I'm not removing Unzip
class from semialign
(or restructuring the hierarchy there, semialign
works now, and there is very little motivation to change it).
Btw. Prelude
exports unzip :: [(a,b)] -> ([a], [b])
. If unzip
is added to Functor
, it would make sense to export the Functor
version instead, but if it's just defined as a unzip x = (fst <$> x, snd <$> x)
then the Prelude
version should stay list-specific (a pity).
This sounds like this should be funzip
or something to avoid clash on unqualified import of Data.Functor unless you move it into the class where it can be overridden to suit.
There are indeed plenty of unqualified Data.Functor imports: https://hackage-search.serokell.io/?q=import%5Cs*Data.Functor%24
Hard to say how many of them use unzip
though.
If Data.OldList
unzip is changed as
-let unzip = foldr (\ (a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
+let unzip = foldr (\ ~(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
(which would be own issue)
Then
>>> bimap length length $ unzip [undefined, undefined]
(2,2)
it would be as lazy as traverseBia
variant.
I'm not competent enough to comment what are implications of that change.
@ocharles, if you will be triggering a vote, shall we collaborate on an MR to close #86 simultaneously? I find it hard to imagine there will be people in favour of one but not the other.
There's also Unzip (Event t) instance in reflex, with unzip = splitE: https://hackage.haskell.org/package/reflex-0.8.2.1/docs/Reflex-Class.html#v:splitE
Another example is Lindsey Kuper's LVish, where the monad would admit an unzip that creates a "future" for the computation, and returns two different computations that each force the future and then apply fst or snd to the result.
@mixphix @ocharles As a point of information, in case it changes the design you come up with, I give fair warning that I will vote against any proposal that breaks existing consumers of Data.List.NonEmpty.unzip
without a deprecation cycle. One such cycle (and the one I wrote about at Opaleye's API breakage policy) is
n
:
Data.List.NonEmpty.unzipNonEmpty
, specialised to the NonEmpty
type.Data.Functor.unzip
at general typeData.List.NonEmpty.unzip
to say that it will be deprecated in GHC version n+1
and to use Data.List.NonEmpty.unzipNonEmpty
or Data.Functor.unzip
n+1
:
{-# DEPRECATE #-}
Data.List.NonEmpty.unzip
n+2
:
Data.List.NonEmpty.unzip
Data.List.NonEmpty.unzipNonEmpty
(in which case document Data.List.NonEmpty.unzipNonEmpty
as obsolete and that Data.List.NonEmpty.unzip
should be preferred instead)I am open to alternative forms of deprecation cycle but we need something that ensures users get a deprecation warning before their code breaks. Is this very heavyweight? Yes. Is this very annoying for those of us who want to move fast and fix things? Also yes. Is it essential if we want Haskell to be taken even minimally seriously in the backwards compatibility stakes? Even more yes.
(Procedural note: I don't believe the CLC can actually vote for 1, 2 and 3 at the same time in practice. I suspect we'd only be able to vote on 1, and then later 2, and then later still 3.)
I find that deprecation cycle worse than e.g. Data.List
monomorphisation process.
unzipNonEmpty
which is going to be obsolete.The Data.List
monomorphization way of "Just make patches for everything which breaks (in Stackage)" is considerably less polluting, and in this case shouldn't even need much work. I'm sure @mixphix is happy to gauge the impact.
I haven't given you a proposal yet, but maybe it's time for me to get to that. The TL;DR of my proposal would just be to add unzip
to Data.Functor
though. It's a purely additive change
@ocharles Sorry for playing a devils advocate.
See comments https://github.com/haskell/core-libraries-committee/issues/88#issuecomment-1247050108 and https://github.com/haskell/core-libraries-committee/issues/88#issuecomment-1247054109
There is a chance of a name clash. Probably rare, but given that Data.Functor
is likely imported unqualified, I'd say it should be assessed. (Do you hide unzip
from Prelude
in your work project?)
Do you hide unzip from Prelude in your work project?
Yes. Perhaps more extremely, I am explicit with my imports (never unqualified), and when I have to hide things from Prelude I actually go as far as NoImplicitPrelude
and import everything I need (that is, I don't use hiding
and I don't use unqualified imports. Thank you HLS!)
There is a chance of a name clash. [..] I'd say it should be assessed.
How does one do such an assessment? Should I make the change and rebuild Hackage?
@ocharles
How does one do such an assessment? Should I make the change and rebuild Hackage?
s/Hackage/Stackage/
See https://github.com/Bodigrim/clc-stackage That is for GHC-9.2. So yes, build a GHC-9,2 with your change, and then build that project. It should work with cabal build -w /path/to/your/ghc/_build/stage1/bin/ghc
.
I find that deprecation cycle worse than e.g. Data.List monomorphisation process.
You're welcome to find it as you like. I'm just making my position clear in advance to the proposers to avoid unwelcome surprises.
@ocharles while it is your right as a proposer, I'd strongly suggest against triggering a vote after two days only. However, you are not obliged or expected to react to all suggestions in the thread. Once ready, please prepare a GHC MR with desired changes.
I usually do not require impact analysis for additive changes, but it strengthens the proposal and eliminates doubts. Building https://github.com/Bodigrim/clc-stackage takes a day or two on my laptop, but besides that is a fairly automatic process. See instructions inside the cabal file.
@phadej
But as you mention yourself, these types are not Functor. So unzip x = (fst <$> x, snd <$> x) default implementation wouldn't work for them. And having Unzip f as a superclass of Functor won't fly, as it breaks almost everything.
Right. I guess what I was trying to get at is that neither fmap
nor unzip
are stronger than each other per-se (at least if we take invariants into account), but unzipWith :: (c -> (a, b)) -> f c -> (f a, f b)
allows both
fmap f = fst . unzipWith (dup . f)
unzip = unzipWith id
though whether that fmap
is lawful depends on the laws we add to unzipWith
. I assume unzip
would trivially be lawful.
Now, a default implementation of unzip
in terms of fmap
does allow for
unzipWith f = unzip . fmap f
so in the specific case of functor-based unzipping, it seems going with unzip
or unzipWith
is equivalent.
For a Functor
class method (which I'm nervous about), unzipWith
seems rather better. For a Traversable
method (less nervous), one option is
unzipUsing :: Applicative f => (a -> f (b, c)) -> t a -> f (t b, t c)
A less conservative thing would be to move Biapplicative
to base
and make traverseBia
a Traversable
method.
@ocharles if you made you mind, please raise a draft MR, so that we know what exactly we are voting on.
@Bodigrim Will do, thanks for the nudge.
Edit: to clarify, do you mean raising a draft MR against base
with my proposed change?
@Bodigrim Done: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9434
Currently
Data.List.NonEmpty
defines:But this is a very strange place for such a generally useful function.
I'd like to propose that
Data.Functor
gains this function. I don't feel strongly on whether it should be a new definition, nor what should happen withData.List.NonEmpty.unzip
, only that after this proposal I can do: