haskell / core-libraries-committee

95 stars 16 forks source link

Explicitly define `length`, `elem`, etc. in `Foldable` instance for `Data.Functor.Compose` #57

Closed cdsmith closed 1 year ago

cdsmith commented 2 years ago

Currently, the Foldable instance for Data.Functor.Compose looks like this:

instance (Foldable f, Foldable g) => Foldable (Compose f g) where
    foldMap f (Compose t) = foldMap (foldMap f) t

All other methods besides foldMap get default implementations, which delegate to foldMap. I propose that the following should be added:

    length (Compose t) = getSum (foldMap' (Sum . length) t)
    elem x (Compose t) = getAny (foldMap' (Any . elem x) t)
    sum (Compose t) = getSum (foldMap' (Sum . sum) t)
    product (Compose t) = getProduct (foldMap' (Product . product) t)
    ... and so on

The inner container in a Compose may have a more efficient implementation for other methods of Foldable. In case it does, that should be used instead of falling back to foldMap as happens now.

This applies to all members of Foldable, which presumably are in the class precisely because they can have more efficient implementations. It seems particularly troublesome for length and elem, which are likely to have much more efficient implementations for commonly used containers.

treeowl commented 2 years ago

I think it should be foldMap'.

cdsmith commented 2 years ago

Yes, you are right. I edited my proposal to use foldMap' for length.

cdsmith commented 2 years ago

On further reflection, I realized that essentially the same situation applies to elem and other members, so I have updated this issue to also include them.

Bodigrim commented 2 years ago

@cdsmith could you please put up a draft MR?

cdsmith commented 2 years ago

@Bodigrim Is libraries/base at https://gitlab.haskell.org/ghc/ghc the right place to do so?

Bodigrim commented 2 years ago

Yes

cdsmith commented 2 years ago

I probably cannot put up a draft MR in a timely fashion. I spent some time this evening trying to figure out how to get that checked out and building. Among the things that stumped me were how to get HLS working (it complains about missing a file related to hadrian), how to build just base (instead of rebuilding all of GHC and everything), where to put tests (I can barely find any tests in libraries now)... hopefully there's someone else interested in this that is already familiar with developing in base, and for whom the hours it would take me to figure this stuff out would not be required.

Bodigrim commented 2 years ago

@cdsmith you do not even need to build GHC yourself, CI will do it for you; and at this stage tests are not obligatory.

Bodigrim commented 2 years ago

@cdsmith unless there is a progress within two weeks, I'll close this as abandoned. If anyone else has bandwidth to create an MR, feel free to do so.

cdsmith commented 2 years ago

I will try to work on this over the weekend. It would be a shame to lose what seems like a clear improvement, and I should get better at working in the GHC tree anyway.

Bodigrim commented 2 years ago

@cdsmith ping. Feel free to advertise this job on irc / twitter / discourse / reddit, but if there is no MR very soon, I'll close. https://github.com/haskell/core-libraries-committee/blob/dcaf3f217d286e046494d7c48deb7ca8d46f0c1d/PROPOSALS.md?plain=1#L54-L57

cdsmith commented 2 years ago

Is there a more traditional issue tracker for base somewhere where I can file this, then? I'm not trying to force anyone else to implement anything, but if this were any other library, I'd think it's worth having an open issue. I cannot seem to find such a thing for base.

Bodigrim commented 2 years ago

Dunno, you can probably open an issue at GHC bug tracker, linking this discussion.

What's your main friction point here? You can copy the source of Data.Functor.Compose into your own Compose.hs, write instance Foldable and check locally that the file produced is syntactically correct. Then implant it blindly into GHC source tree, raise an MR and wait for CI results. You don't need to build GHC locally.

Bodigrim commented 2 years ago

I posted a call for help: https://www.reddit.com/r/haskell/comments/xec5xj/explicitly_define_length_elem_etc_in_foldable/

ramirez7 commented 2 years ago

I can take a stab at implementing this. Compose has served me well for years, so I'd love to give it some love :v:

Bodigrim commented 2 years ago

@ramirez7 please go ahead with an MR.

ramirez7 commented 2 years ago

Here is the MR. It is the code in this proposal verbatim.

@cdsmith I had never made any modifications to GHC before (only built forks from source), so I wanted to share what I learned that made developing this change in-repo easy:

cdsmith commented 2 years ago

@ramirez7 Thank you so much! I had intended to get to this, but it was becoming a source of a lot of stress for me, so you have no idea how grateful I am that you checked it off my list for me. :)

Bodigrim commented 2 years ago

Thanks @ramirez7, much appreciated.

@phadej raised a good point about definitions for partial methods. I’m ambivalent here: I’m ready to approve the proposal as is, but also fine with leaving partial methods to default implementations. What are others thinking?

Ericson2314 commented 2 years ago

Good question. It's very hard to tell what the ramification are!

(Regardless of the fate of head and tail, it would be nice to warn or deprecate those methods in lieu off Foldable1!)

ramirez7 commented 2 years ago

I think minimum and maximum should definitely leverage inner optimizations for Compose. Those frequently have big asymptotic differences.

I could go either way on the 1s. I implemented them mostly for completeness and as an exercise.

ramirez7 commented 2 years ago

Do I think minumumMay and maximumMay should be the real class methods? Sure - but that would be a different proposal entirely 😁

phadej commented 2 years ago

My point is: don't spend effort on minimum etc in Foldable. Rather improve minimum etc in Foldable1.

(FWIW, Foldable1 had tests: https://github.com/phadej/foldable1/blob/master/test/Tests.hs, but introducing that to base had lost that battery :( )

cdsmith commented 2 years ago

Is there anything wrong with @ramirez7's implementations? If not, then there's not really any effort remaining to be spent, and there's definitely an opportunity to benefit with, for example, minimum in a Compose f Set. Doing the same for Foldable1 sounds like a great idea.

Bodigrim commented 2 years ago

I tend to agree with @cdsmith here: since @ramirez7 has already spent a considerable effort on implementing partial members of Foldable and they are strictly better than default implementations, let's go. Implementing instance Foldable1 Compose would be a topic for another proposal.

Dear CLC members, before we vote on this, please take a moment to look at the MR: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9001/diffs, as it offers an opportunity for a technical review and nitpicking. @tomjaguarpaw @chessai @cgibbard @mixphix @emilypi

phadej commented 2 years ago

has already spent a considerable effort on implementing partial members of Foldable

Sunken-cost fallacy.

There is no tests nor benchmarks.

Consider Compose [] [], and foldl1.

There is good foldr:

foldr f b (Compose fga) = foldr (\ga acc -> foldr f acc ga) b fga

foldr1 is proposed to be defined as

    foldr1 f (Compose fga) =
      fromMaybe (error "foldr1: empty structure") $
        foldr (\ga acc -> if null ga then acc else Just $ let a = foldr1 f ga in maybe a (a `f`) acc) Nothing fga

is it really better than the default

    foldr1 :: (a -> a -> a) -> t a -> a
    foldr1 f xs = fromMaybe (errorWithoutStackTrace "foldr1: empty structure")
                    (foldr mf Nothing xs)

Note that inner foldr for inner [] already does if null ga then acc else Just $ foldr1 ... conditional, so IMO the specialized version is just an extra code which people have to test. (which they don't).

phadej commented 2 years ago

Do I think minumumMay and maximumMay should be the real class methods? Sure

These couldn't have efficient implementations for any container. minimum and maximum can e.g. for Set. I think that's the reasoning while minimum and maximum are Foldable methods but *By aren't.

sjoerdvisscher commented 2 years ago

@phadej, @ramirez7 said *May not *By.

mixphix commented 1 year ago

This one seems to have slipped under the radar. It seems we never actually ran a vote!

chessai commented 1 year ago

There is no tests nor benchmarks.

I would also like to see some before voting on the MR.

Consider Compose [] [], and foldl1.

There is good foldr:

foldr f b (Compose fga) = foldr (\ga acc -> foldr f acc ga) b fga

foldr1 is proposed to be defined as

    foldr1 f (Compose fga) =
      fromMaybe (error "foldr1: empty structure") $
        foldr (\ga acc -> if null ga then acc else Just $ let a = foldr1 f ga in maybe a (a `f`) acc) Nothing fga

is it really better than the default

    foldr1 :: (a -> a -> a) -> t a -> a
    foldr1 f xs = fromMaybe (errorWithoutStackTrace "foldr1: empty structure")
                    (foldr mf Nothing xs)

Note that inner foldr for inner [] already does if null ga then acc else Just $ foldr1 ... conditional, so IMO the specialized version is just an extra code which people have to test. (which they don't).

I agree that there's not really a reason to approximate the wheel here, and we should just remove the unnecessary specialised version

Bodigrim commented 1 year ago

@ramirez7 I left suggestions in the MR. Please signal if you prefer someone to take over.

Bodigrim commented 1 year ago

In the interest of reaping some fruits of the long discussion here and as an exception to general practices, I rebased the MR of @ramirez7 in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10145 and applied suggestions from comments above. I don't have any more time to spend on this, so I'd like to trigger a vote as is. I'm fine if someone votes -1 on the basis of insufficient effort, I'm neither the original proposer nor the original implementer.

Dear CLC members, let's finish this up one way or another. The proposal is to define methods of instance Foldable (Compose f g) explicitly so that they can benefit from efficient implementations of relevant methods in underlying Foldable f and Foldable g instead of going all the way through foldMap as the default implementation does. The MR is https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10145, and an earlier discussion of the chosen implementation can be found in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9001. This is a backwards-compatible change.

CC @tomjaguarpaw @chshersh @mixphix @parsonsmatt @angerman @hasufell


+1 from me.

hasufell commented 1 year ago

I'm fine if someone votes -1 on the basis of insufficient effort, I'm neither the original proposer nor the original implementer.

So who will take the MR over the finish line?

Bodigrim commented 1 year ago

If MR !10145 gets approved, I’ll take it over the finish line myself. The only thing it lacks is a changelog.

hasufell commented 1 year ago

If MR !10145 gets approved, I’ll take it over the finish line myself. The only thing it lacks is a changelog.

In that case I'm +1

mixphix commented 1 year ago

+1

chshersh commented 1 year ago

+1

Thanks, @Bodigrim, for pushing the proposal over the finish line! 👏🏻


In the future, I suggest to label proposals as abandoned and close them if the original author doesn't have the capacity for implementing the MR and steering the proposal till the end.

I don't think that implementation of proposals should be a CLC responsibility. I'm okay to make this one an exception because a lot of work already was put into it and only CHANGELOG entry is required (and Andrew kindly offered his valuable time to finish the PR).

tomjaguarpaw commented 1 year ago

+1

Bodigrim commented 1 year ago

Thanks all, 5 votes out of 7 are enough for approval.

chshersh commented 1 year ago

I'm trying to summarise the state of this proposal as part of my volunteering effort to track the progress of all approved CLC proposals.

Field Value
Authors @cdsmith, @ramirez7, @Bodigrim
Status merged
base version 4.19.0.0
Merge Request (MR) https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10145
Blocked by nothing
CHANGELOG entry present
Migration guide not needed

Please, let me know if you find any mistakes 🙂