haskell / core-libraries-committee

95 stars 15 forks source link

Re-export build from GHC.List #29

Closed phadej closed 2 years ago

phadej commented 2 years ago

This would resolve https://gitlab.haskell.org/ghc/ghc/-/issues/19127 issue

GHC.Exts.build should be exported from a module marked {-# LANGUAGE Safe #-} (or Trustworthy)

Otherwise it forces all downstream users to smash {-# LANGUAGE Trustworthy #-} on what is perfectly safe (inferrable) modules.

GHC.List seems a natural place for build.

An alternative is a new module, like GHC.List.Fusion which would be marked Trustworthy, that would be maybe more correct, and not step on Report land, but Data.List has left that already.

MR: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7321

treeowl commented 2 years ago

What about augment?

phadej commented 2 years ago

Same applies to augment, I just weren't even aware about it.

phadej commented 2 years ago

Side note, Data.List monomorphization would be great to get hands on the foldr the rules refer to. Luckily I never needed to refer to that one directly. (Defining rules is forbidden in Safe modules, but IIRC the imported ones still fire?)

treeowl commented 2 years ago

Unfortunately, there's a semantic problem I read about somewhere (I don't remember where). The trouble is seq, which partially breaks the parametricity underlying the validity of foldr/build fusion. Here's a simple example. Suppose we have

q :: (Int -> b -> b) -> b -> b
q c n = c 3 $! n

grum :: [Int]
grum = foldr (:) undefined (build q)

Now build q = [3], so grum = 3 : undefined. But when GHC applies the fusion rule, we get

grum
  = q (:) undefined
  = (:) 3 $! undefined
  = undefined

That's not great behavior for something exposed as Safe Haskell.

phadej commented 2 years ago

I don't understand, where does build is used in that example? Was it supposed to be grum = foldr (:) undefined (build q)?

treeowl commented 2 years ago

@phadej Sorry, yes, I've corrected it.

phadej commented 2 years ago

So one more thing Safe Haskell doesn't play well with. Blargh.

It's still annoying to use Trustworthy just because of build. I guess if I have to chose between Safe Haskell and list fusion, I'll choose the latter.

treeowl commented 2 years ago

It is quite annoying. I didn't really follow any of the local module proposal stuff. Will it become possible to do things like this?

{-# language Safe #-}
module Foo where

f = gump 12

module Bar {-# language Trustworthy #-} where {
  gump :: ...
}
treeowl commented 2 years ago

BTW, I think this is a problem with overly polymorphic seq, not with build.

treeowl commented 2 years ago

It would be nice to have a sort of SafeButFor pragma that explains why a module is not Safe. It would let you indicate language features (RULES, INCOHERENT) as well as the names of Unsafe imports. As long as you use import qualified and don't overlap your as names too much, this should let GHC help you audit the module.

phadej commented 2 years ago

@treeowl do you mean that unrestricted seq, $!, ... shouldn't be allowed in Safe Haskell, (but only their type-class restricted variants)? And that would make this proposal be safe then? Interesting.

treeowl commented 2 years ago

@phadej Yes, that's a good way to put it. I wouldn't expect or want the Eval (or whatever) constraint to be a class, of course; it should be compiler magic like Coercible. But I don't think the powers that be want to go there, so it's just a dream.

treeowl commented 2 years ago

@phadej aside from this corner-case polymorphism business, being able to force functions leads to unpleasant wonkiness around eta expansion through case (see -fpedantic-bottoms and -fno-state-hack). Much better if users know they should wrap up a function (or IO action) in Solo or similar if they want to force some things around it.

nomeata commented 2 years ago

would it make sense to put it into Data.List.Fusion?

phadej commented 2 years ago

Why we need to hide build and augment from beginners?

I don't think that writing toList-like functions using build to make them good producers requires a person to be an expert in Haskell.

nomeata commented 2 years ago

The higher rank type alone makes it rather advanced. It certainly requires a higher level of expertise then, say, sort or intercalate. Right now you have to reach to GHC.Exts to get it, and nobody seemed to mind that.

Also there voiced who say that list fusion is misguided anyways, because it’s fragile, and if one wants streaming, one should reach for a dedicated streaming library. That’s another good reason to keep it out of the more common Data.List; maybe that opinion will prevail in the end and then we will be glad to have list fusion kept separate.

I am myself not sure if it is useful to nudge people to just use build everywhere where they are writing toList and hope that magic happens. You have to understand GHC’s inliner well to set up list fusion properly, possibly with rewriting stuff back if it didn’t fuse, and ideally some tests (e.g. inspection testing).

The more I think about it, the more I like the idea of giving list fusion a proper home (Data.List.Fusion). “Using list fusion” is a different beast than “using lists”.

phadej commented 2 years ago

Right now you have to reach to GHC.Exts to get it, and nobody seemed to mind that.

Nobody minds about Safe Haskell, you mean. That's a fair observation.

tomjaguarpaw commented 2 years ago

@treeowl I too dream of type class seq, for what it's worth. For one thing, it would allow us to be genuinely strictness-polymorphic in a zero-cost way, doing away with the Data.Map.Lazy/Data.Map.Strict distinction, for example.

treeowl commented 2 years ago

@tomjaguarpaw I don't think that's the same issue at all, but please explain why it could be.

Bodigrim commented 2 years ago

I would rather suggest GHC.List or Data.List.Fusion instead of Data.List. The latter is often imported unqualified, clashing with packages, which define build entity locally.

tomjaguarpaw commented 2 years ago

@treeowl Here's a toy example of a strict-in-values (and keys) associative map.

{-# LANGUAGE BangPatterns #-}

import Prelude hiding (lookup)

newtype StrictMap k v = StrictMap [(k, v)]

insert :: k -> v -> StrictMap k v -> StrictMap k v
insert k v (StrictMap m) = k `seq` v `seq` StrictMap ((k, v) : m)

lookup :: Eq k => k -> StrictMap k v -> Maybe v
lookup _ (StrictMap []) = Nothing
lookup k (StrictMap ((k', v):rest))
  | k == k' = Just v
  | otherwise = lookup k (StrictMap rest)

You can straightforwardly derive a lazy-in-values map from it by just putting the values in an extra box:

data Lazy a = Lazy a

newtype LazyMap k v = LazyMap (StrictMap k (Lazy v))

But this approach is bad because you pay for the extra indirection with time, space and memory fragmentation. Can we do better? Yes, we can define our own version of seq that is more flexible:

class MySeq a where
  myseq :: a -> b -> b

newtype Strict a = Strict a
newtype Lazy a = Lazy a

instance MySeq (Strict a) where
  myseq = seq

instance MySeq (Lazy a) where
  myseq _ b = b

newtype MyStrictMap k v = MyStrictMap [(k, v)]

insert :: MySeq v => k -> v -> MyStrictMap k v -> MyStrictMap k v
insert k v (MyStrictMap m) = k `seq` v `myseq` MyStrictMap ((k, v) : m)

and then strict and lazy maps are just two versions of the same thing:

newtype LazyMap k v   = LazyMap   (MyStrictMap k (Lazy v))
newtype StrictMap k v = StrictMap (MyStrictMap k (Strict v))

Now, I think this approach is actually sufficient to remove the duplication between Data.Map.Lazy and Data.Map.Strict in many cases (though I am no expert with the internals, so please correct me if I am wrong). What type class seq would achieve is the ability to use bang patterns and write

insert :: (Seq k, Seq v) => k -> v -> MyStrictMap k v -> MyStrictMap k v
insert !k !v (MyStrictMap m) = MyStrictMap ((k, v) : m)

But I suppose a putative OverloadedBangPatterns extension would achieve a similar end.

What do you think? This is getting far from the topic so if you would like to continue discussing then perhaps start a discussion on the GHC proposals repo (since this repo doesn't have discussions enabled).

phadej commented 2 years ago

Can CLC act on this proposal. Either accept, reject or suggest a change?

Bodigrim commented 2 years ago

I'd vote yes for exporting build from GHC.List, but not from Data.List. This way we accomplish the goal (even while I personally believe that Safe Haskell is a mis-feature), but avoid potential name clashes and do not promote this internal utility to a blessed, user-facing API. @cgibbard @tomjaguarpaw @emilypi @chessai @cigsender non-binding opinions please?

tomjaguarpaw commented 2 years ago

I agree that exporting build from GHC.List is a good first step. It resolves the original problem that @phadej reported, because GHC.List is Trustworthy. It seems reasonable to defer the question of exporting it from Data.List to another discussion.

mixphix commented 2 years ago

I concur that GHC.List makes the most sense as the module from which build gets exported!

chessai commented 2 years ago

+1 to exporting from GHC.List

Bodigrim commented 2 years ago

Can CLC act on this proposal. Either accept, reject or suggest a change?

@phadej Given the feedback above, I suggest to modify the proposal to export build from GHC.List instead of Data.List. If you are happy to proceed, we can trigger a vote as soon as an updated MR is available.

phadej commented 2 years ago

I updated this PR and GHC MR to re-export build and augment from GHC.List.

tomjaguarpaw commented 2 years ago

I vote in favour of this MR.

Bodigrim commented 2 years ago

I'm +1 as well.

Dear CLC members, could you please vote on https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7321/diffs, re-exporting build and augment from GHC.List? @cigsender @emilypi @cgibbard @chessai

cgibbard commented 2 years ago

+1

mixphix commented 2 years ago

+1

emilypi commented 2 years ago

+1

Bodigrim commented 2 years ago

Thanks all, with 5 votes out of 6 in favor and none opposed, the proposal is approved.

chshersh commented 1 year ago

I'm trying to summarise the state of this proposal as part of my volunteering effort to track the progress of all approved CLC proposals.

Field Value
Authors @phadej
Status merged
base version 4.17.0.0
Merge Request (MR) https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7321
Blocked by nothing
CHANGELOG entry present
Migration guide not needed

Please, let me know if you find any mistakes 🙂