Closed mixphix closed 1 year ago
I would want to change the implementation as well, so as not to be gratuitously lazy (and memory leaky).
unzip ((a, b) :| abs) = (a :| as, b :| bs)
where
(as, bs) = Data.List.unzip abs
Many of the other functions in that module are defined using lazy pattern matching ~(x :| xs)
. Why pick unzip
to be more strict than its relatives?
Many things in that module may be lazier than they should be. I haven't looked at it recently. The current unzip
seems pretty awful, but maybe there should be a separate proposal to stricten everything up?
Data.List.NonEmpty
API is extremely lazy, but let's leave this for another proposal.
> Data.List.NonEmpty.map id undefined `seq` ()
()
I'm hesitant to remove this function completely, though. It's already here and useful. I wonder if it would be better to export this function from Data.Functor
? A Hackage search shows a few duplicates, indicating a level of utility.
I opened https://github.com/haskell/core-libraries-committee/issues/88 today (completely independently of this - weird timing!). That proposal discusses having Data.Functor.unzip
Haha, well in any case, they seem to be in tandem with each other :)
overgeneralized functions
I don't understand why this is a bad thing. With the exception of functions whose sole purpose is to restrict types (asTypeOf
), why would you ever want a less general function when you could have an equivalent more general one instead?
As @treeowl mentioned, strictness is one factor. Another is performance, though here it may not apply: for specific types one might be able to write it more efficiently than the general abstraction could perform, since the latter mustn't assume anything about the representation.
Since I was being a bit vague above.... For []
, say, we have
traverseBia :: Biapplicative f => (a -> f b c) -> [a] -> f [b] [c]
traverseBia _ [] = bipure [] []
traverseBia f (x : xs) = biliftA2 (:) (:) (f x) (traverseBia f xs)
In particular,
traverseBia :: (a -> (b, c)) -> [a] -> ([b], [c])
The Biapplicative
instance for (,)
is still a mess on Hackage (not matching the Bifunctor
instance), but in the next release they'll both be lazy. One could also write a custom pair type with stricter versions. Anyway, we'll have
bipure = (,)
biliftA2 f g ~(a, b) ~(c, d) = (f a c, g b d)
Inlining,
traverseBia :: (a -> (b, c)) -> [a] -> ([b], [c])
traverseBia _ [] = ([], [])
traverseBia f (x : xs) = (fx1 : r1, fx2 : r2)
where
(fx1, fx2) = f x
(r1, r2) = traverseBia f xs
Walking one of the result lists will cause the other one to be realized, with selector thunks for elements (assuming the optimizer doesn't do anything horrible). Memory leak averted, despite remaining very lazy. It'll still be quite a bit lazier than the hand-written unzip
, but I think it's about the only reasonable way to write a lazy unzip
for an arbitrary Traversable
. Of course, if you want a totally eager unzip, you can do that too, by using a custom pair type with strict Bifunctor
and Biapplicative
instances.
To sum up my position, I think Data.Functor
, Data.Traversable
, and Data.NonEmpty
are all good places for unzip
, each with a different implementation.
FWIW, I would be fine to have all three.
Also,
why would you ever want a less general function when you could have an equivalent more general one instead?
More specific types can lead to more specific type errors.
I think this is a corollary of #22/#76/... When that is done, this could be done as well.
To sum up my position, I think
Data.Functor
,Data.Traversable
, andData.NonEmpty
are all good places forunzip
, each with a different implementation.
While I see the merits of each of these versions of unzip
on their own, having them all seems like a namespacing nightmare.
I described the conditions under which I'm prepared to vote in favour of this breaking change on a related discussion
Since this is a breaking change, an impact assessment is due. It's quite possible that someone exported Data.List.NonEmpty.unzip
and used it with another Functor
.
It's quite possible that someone exported Data.List.NonEmpty.unzip and used it with another Functor.
Here is a real world example in non-public code: https://github.com/haskell/core-libraries-committee/issues/88#issuecomment-1246483695
@mixphix are you still interested in this? Could we make some progress towards impact assessment?
I'd like to take this proposal over, if you're stuck/need help @mixphix. Is the next step still the impact assessment?
Yes, impact assessment.
If you wouldn't mind, @ocharles, I'd be grateful to hand this proposal over to you.
You got it! I'll try and put together an impact assessment in the next few days
If I did everything correctly, I believe I've completed the impact assessment. I used https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9757 to build a new version of GHC, then built clc-stackage
without needing to patch any libraries*.
So, to reiterate: the proposed change doesn't seem to break anything in clc-stackage
. I'd like to submit the proposal (precisely what's in the above MR) to the committee.
Thanks, @ocharles. I'll trigger the vote once a new cast of CLC is selected in February.
@Bodigrim Good to go with a vote now?
Dear CLC members, the proposal is to restrict Data.List.NonEmpty.unzip
from unzip :: Functor f => f (a, b) -> (f a, f b)
to a more monomorphic unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b)
. The impact assessment found no Stackage packages affected by this change.
Strategically the proposed change is in line with the monomorphisation of Data.List
, approved earlier.
CLC has approved addition of polymorphic unzip
to Data.Functor
in #88, but this change happened to late to be included in base-4.18
to be shipped with GHC 9.6.
Before we trigger a vote, I would like to hear (non-binding) opinions from new members of CLC. CC @chshersh @angerman @parsonsmatt @hasufell
It's a gratuitous breaking change, so -1 from me, as outlined in a previous comment.
[EDIT: oh sorry, my opinion wasn't actually requested at this time, but FWIW that's my opinion, and my vote]
What worries me in the current plan is that if anywhere out there exists a consumer of polymorphic unzip
from Data.List.NonEmpty
, they won't be able to maintain backward compatibility without CPP, because Data.Functor.unzip
has not been in base
for long enough (in fact, it is not in base
yet ;).
https://github.com/haskell/core-libraries-committee/issues/88#issuecomment-1247093730 describes a possible approach to gradual replacement, but I think inroducing Data.List.NonEmpty.unzipNonEmpty
only to deprecate it later is too grotesque for the task at hand.
If there is an appetite for a smoother transition, I'd suggest the following:
Data.Functor.unzip
(already approved, @ocharles please land the MR) and adds a warning to Data.List.NonEmpty.unzip
, suggesting to use Data.Functor.unzip
instead.Data.List.NonEmpty.unzip
to f ~ NonEmpty
and removes the warning.This is slightly antithetical to the purpose of the proposal, because until GHC 9.10 users would not have access to a monomorphic unzip
, but this is no worse than status quo.
What does someone do if they want to write code using polymorphic unzip
that works across three versions, 9.4 to 9.8, let alone back to 8.10 (which is still widely used in practice)? I don't see any strong reason to force monomorphization of unzip
. I suggest we just keep it there with a deprecation warning. It's not doing anyone any harm.
Reading through this thread, I find the justification for this change rather weak. It also comes with not deprecation as far as I can see. This thread got me wondering how a proper deprecation warning for this might look. We really only want to inform users of the "overgeneralised" use case, not those who use the function in its proposed monomorphised form. Lacking a better (usage dependent) deprecation, I'd argue that this should first be implemented with a regular deprecation warning, and the suggestion on how to migrate, before being restricted. Again ideally we'd have some deprecation based on actual use.
With https://github.com/ghc-proposals/ghc-proposals/pull/575, you could have use-dependent deprecation:
unzip :: UnzipDeprecation f => Functor f => f (a,b) -> (f a, f b)
unzip xs = (fst <$> xs, snd <$> xs)
class UnzipDeprecation f
instance UnzipDeprecation f {-# DEPRECATED "..." #-}
instance {-# OVERLAPPING #-} UnzipDeprecation NonEmpty
Edit: However, see Oleg's comment here and Tom's followup https://github.com/ghc-proposals/ghc-proposals/pull/575#issuecomment-1435741892 i.e. this wouldn't actually be backwards compatible, though if one uses INCOHERENT
instead of OVERLAPPING
it might be
Ok! That is some creative idea!
I think the main justification is that we don't really want that implementation.
Data.List.NonEmpty
API is extremely lazy, but let's leave this for another proposal.> Data.List.NonEmpty.map id undefined `seq` () ()
My point is that this should not be left for another proposal.
The only interesting reason to change the type signature is to have a more specialized implementation. That would be a real motivation.
Without that, it's not enough for me and will likely get a -1.
If someone comes up with an interesting implementation that we are excited about, then changing the type signature becomes feasible. Doing it prematurely is a no-go for me and is just a cosmetic change.
So I kinda agree with @treeowl here. In case we do want a different implementation, I'd possibly argue for renaming the function and never renaming it back... completely removing Data.List.NonEmpty.unzip
forever. But I could possibly be persuaded otherwise wrt deprecation strategy.
My point is that this should not be left for another proposal.
The extreme laziness of Data.List.NonEmpty
is tracked in #107.
My point is that this should not be left for another proposal.
The extreme laziness of
Data.List.NonEmpty
is tracked in #107.
Yes. My point is that my +1 on function signature change would be dependent on whether we accept a concrete implementation change. And that can be done as part of that ticket.
I would suggest Data.Functor.unzip
be backported to older compilers, which could help with the depreciation/migration story.
This is a tough one 🥴
We (CLC) need to come up with a policy for introducing breaking changes. It's not the first proposal that wants to change existing things. And it's not going to be the last. Our lives would become much easier if we provide a prepared answer to the question "What's the deprecation schedule and timeframe for any breaking change?"
We don't have such a helpful document now, so I'll provide my thoughts on this proposal alone.
I'm against this proposal as is because it's a breaking change without any deprecation considerations. Sure, the CLC Stackage impact assessment didn't indicate any breakages. But Stackage is not all Haskell.
It's a good change. It's bizarre that Data.List.NonEmpty
has a single polymorphic function. Besides, as discussed before, making the type more specific enabled writing a more optimal implementation.
In the context of already accepted and implemented #22/#76, making a fuss about making unzip
monomorphic sounds unreasonable.
I would accept this proposal with a deprecation schedule. Sure, it would be cool to have fancy deprecation tools to drive the change. But we don't have anything atm. So my alternative to what @Bodigrim suggested:
GHC 9.8
: Data.Functor.unzip
is implemented, Data.List.NonEmpty.unzip
is deprecated, mentioning in the deprecation message that this function type will change in GHC 9.14GHC 9.10
: NothingGHC 9.12
: NothingGHC 9.14
: Data.List.NonEmpty.unzip
can be made monomorphic and the deprecation warning can be removedThis schedule ensures enough time to adjust to the change and allows supporting the three latest major GHC versions without CPP.
We could become super conservative and just deprecate Data.List.NonEmpty.unzip
without ever touching it. Honestly, I really doubt that this function is used in performance-critical code (or any code at all, tbh). So there's no real benefit in changing it, we could just make Haskell more stable and reliable by not changing this function at all. I don't think that having a single deprecated function in Data.List.NonEmpty
forever is that big of a deal.
But Haskell is always going to change things, and I'm not prepared to push for extreme stability right now.
I feel like unzip :: (Functor f) => f (a, b) -> (f a, f b)
is a "bad operation" in the same way that List.lookup
is a "bad operation" - the default implementation is definitely inefficient, requiring two passes over the input, and it's probably not what you want.
List.unzip
and NonEmpty.unzip
make sense - they can be done with a single pass over the input list, and it's somewhat common to zip
and unzip
on these structures. But I wouldn't want to unzip
a Map k (a, b)
, or unzip
an IO (a, b)
, or unzip
a Maybe (a, b)
.
So, I think I am mildly -1 on Data.Functor.unzip
- if we want to zip/unzip as a polymorphic abstraction, I'd rather see a class Zip
that managers that, rather than bolting the functionality on to Functor
, where it will always be pessimal.
For the same reason, I'm +1 on Data.List.NonEmpty.unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b)
.
Given that the clc-stackage
didn't find any breakage, and that this is a polymorphic function exposed from a module designed to be imported qualified, I'd expect most uses of this to be NonEmpty.unzip
or similar. I'd be fine with a faster migration path - possibly even a single release, 9.8 with a WARNING
notice ("this is going to be monorphized in the future"), and then 9.10 monomorphizing it.
@ocharles since you took over the proposal, what would you like to do? We can vote on the original proposal, but judging from the discussion above this is likely to fail. I think that changing the proposal to a warning now and monomorphisation later (say, in GHC 9.14) has much better chances to succeed.
I would like to vote with the original proposal. Personally with my stackage evaluation I don't see any significant breakage, but the CLC are of course free to see things differently. I don't think I have the resources to do much more than that though, and took this from @mixphix to try and keep things moving forward
I'm obviously in favour of the proposal as written; preferably with @parsonsmatt's fast-tracked warning/restriction two-release plan, but I'll gladly accept any schedule for the sake of a more sensible type.
@Bodigrim could you remove "awaits-impact-assessment", as that has been done (https://github.com/haskell/core-libraries-committee/issues/86#issuecomment-1397145338)
I'll gladly accept any schedule for the sake of a more sensible type.
I'm going to use your permission and put for a vote a plan from https://github.com/haskell/core-libraries-committee/issues/86#issuecomment-1435993821 as it's likely to gain more support.
Dear CLC members, let's vote on the following plan:
Data.List.NonEmpty.unzip
with {-# DEPRECATED unzip "This function will be made monomorphic in GHC 9.14, consider switching to Data.Functor.unzip" #-}
. The latter is already there.Data.List.NonEmpty.unzip
to monomorphic unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b)
and remove the deprecation warning.@tomjaguarpaw @chshersh @hasufell @mixphix @angerman @parsonsmatt
+1 from me. This is a very conservative migration plan for a virtually unused function, eventually bringing in more consistency.
+1 - would also +1 a more aggressive schedule
I don't quite understand point 1. If a user of Data.List.NonEmpty.unzip
uses it on a NonEmptyList
they shouldnt use Data.Functor.unzip
. Should they just disable the depreciation warning? If so, I think we should say that. I say this because I would expect most users are using it on NonEmptyLists
, and the more general type is the exception. The impact assessment seems to corroborate this.
I'm ok with this proposal, but it doesn't sit great because legitimate uses will have to turn the warning off, but that's (at least) module wide, and while you get a warning about the deprecation, you get no such warning to stop ignoring deprecations when you reach 9.14.
As there is no suggestion for an implementation change, I find the motivation insufficient and vote -1.
+1
The deprecation schedule indeed may look too conservative. But I think it would be huge success if we could enforce and follow the same schedule for every breaking change.
I also wouldn't block this improvement on the grounds of "not doing enough". If someone in the future wants to create a proposal to change the implementation of unzip
for NonEmpty
, they're more than welcome to do so 😌
+1
However, I really think we should also add Data.List.NonEmpty.unzipNonEmpty
with a specialized implementation at the same time, and then later do Data.List.NonEmpty.unzip = unzipNonEmpty
.
I'm sympathetic to @hasufell's point of view, but I still think that it's better to bring it to the attention of users that Data.List.NonEmpty.unzip
is currently not a specialized implementation.
The proposal to strengthen to unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b)
will not specialize the implementation in any way. There are other places for similar unzip
functions to live that may have different implementations, but are functionally the same when specialized to NonEmpty
. The names can happily coexist.
Currently we have
unzip :: Functor f => f (a, b) -> (f a, f b)
exported fromData.List.NonEmpty
. The implementation can remain the same, but I propose that the type should be restricted tounzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b)
for analogy withData.List
as well as to avoid exporting overgeneralized functions from specific modules. Really, why should we be exporting something that works with anyFunctor
fromData.List.NonEmpty
?