Add {-# WARNING #-} to Data.List.{head,tail}

Bodigrim commented 2 years ago

Haddocks for Data.List (technically, GHC.List) warn against head and tail on the ground of their partiality.

I propose to promote these warnings to the pragma level as per MR !9290:

{-# WARNING head "This is a partial function, it throws an error on empty lists. Use pattern matching or Data.List.uncons instead. Consider refactoring to use Data.List.NonEmpty." #-}
{-# WARNING tail "This is a partial function, it throws an error on empty lists. Replace it with drop 1, or use pattern matching or Data.List.uncons instead. Consider refactoring to use Data.List.NonEmpty." #-}

I do not propose any further steps such as deprecation or removal of these functions. This is deliberately as conservative as possible. See https://github.com/haskell/core-libraries-committee/issues/70 for a wider discussion of a wider proposal.

Why only head and tail? Because these are the functions for which the widest range of replacements exists, almost always allowing for a safe, concise and local fix (see examples below). E. g., for init / last there is currently no such replacement (one must push for addition of Data.List.unsnoc first), and things like !! and maximum are even worse.
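A minimal sketch of the replacements named in the warning text (pattern matching, Data.List.uncons, drop 1, Data.List.NonEmpty). The helper names firstOr, safeHead, safeTail and firstOfNonEmpty are hypothetical and exist only for illustration:

```haskell
import Data.List (uncons)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Pattern matching: the empty case is handled explicitly (hypothetical helper).
firstOr :: a -> [a] -> a
firstOr def []      = def
firstOr _   (x : _) = x

-- uncons returns Nothing on an empty list instead of throwing.
safeHead :: [a] -> Maybe a
safeHead = fmap fst . uncons

-- drop 1 agrees with tail on non-empty lists and returns [] on [].
safeTail :: [a] -> [a]
safeTail = drop 1

-- With NonEmpty, non-emptiness is part of the type, so head is total.
firstOfNonEmpty :: NonEmpty a -> a
firstOfNonEmpty = NE.head
```

Each of these is a local change: only the call site (or, for NonEmpty, the type of the list) has to be touched.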

Why {-# WARNING #-} and not {-# DEPRECATED #-}? Because deprecation implies a future removal, and ambitions of this proposal are much smaller. It's already enshrined in base that these functions deserve a warning, we just promote its visibility, which should be less controversial.

The impact of the change is that users of head and tail will receive a GHC warning message. This is not an error and does not prevent compilation, thus it is not a breaking change. Users are recommended to follow the suggestion, or to disable the warning with -Wno-warnings-deprecations (which is a sensible thing to do, for example, in a test suite), but they are also free to do nothing at all. Old packages will continue to work.

To avoid any confusion, -Wno-warnings-deprecations suppresses {-# WARNING #-} and {-# DEPRECATED #-}, but not any other GHC warnings. Those who have enabled -Werror can pass -Wwarn=warnings-deprecations to downgrade this particular group back from errors to warnings. GHCi users can put :set -Wno-warnings-deprecations into their .ghci config.
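As a sketch of the per-module opt-out, assuming the pragmas are attached exactly as proposed here (an uncategorised {-# WARNING #-}), a module that wants to keep using head could look like this. The function firstArgument is a hypothetical example:

```haskell
{-# OPTIONS_GHC -Wno-warnings-deprecations #-}
-- Under the proposal as written, this silences WARNING and DEPRECATED
-- pragma messages for this module only; other warning groups are unaffected.

-- Hypothetical helper that deliberately keeps using the partial head.
firstArgument :: [String] -> String
firstArgument args = head args
```

The same flag can be given once per component in a build configuration instead of per module, at the cost of hiding every other WARNING and DEPRECATED message there as well.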

There is a concern that -Wno-warnings-deprecations disables all {-# WARNING #-} and {-# DEPRECATED #-}, whatever the source. However, the current Haskell ecosystem rarely makes much use of them, so I believe it is still a palatable compromise between seeing no warnings and making no changes.

Hardcore fans of head and tail, who are not satisfied with disabling warnings, are welcome to create a local file or even release a package, providing, say, Data.List.Partial, containing original definitions of head and tail without {-# WARNING #-}. I'm however opposed to introducing such Data.List.Partial into base itself: we won't be able to root it out ever.
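A minimal sketch of what such a local file might contain. These are simplified definitions written for illustration, not the ones from base, and the error messages are made up; in practice they would live in a module such as Data.List.Partial:

```haskell
-- Hypothetical contents of a local module such as Data.List.Partial.
import Prelude hiding (head, tail)
import GHC.Stack (HasCallStack)

-- Partial on purpose; carries no WARNING pragma.
head :: HasCallStack => [a] -> a
head (x : _) = x
head []      = error "Data.List.Partial.head: empty list"

-- Partial on purpose; carries no WARNING pragma.
tail :: HasCallStack => [a] -> [a]
tail (_ : xs) = xs
tail []       = error "Data.List.Partial.tail: empty list"
```

Client modules would then hide the Prelude versions and import these instead.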

GHC proposals https://github.com/ghc-proposals/ghc-proposals/pull/454 and https://github.com/ghc-proposals/ghc-proposals/pull/541 propose extensions to the GHC warnings mechanism. Unfortunately, neither of them is approved or has a committed implementor, and this status does not seem likely to change soon, so it would be wrong to speculate on their precise nature. If and when they become a part of GHC, one can indeed ask for a review of {-# WARNING #-} pragmas.

How would you rewrite instance MonadFix [] without head and tail?

instance MonadFix [] where
    mfix f = case fix (f . head) of
               []    -> []
               (x:_) -> x : mfix (tail . f)

I'd rewrite it this way:

instance MonadFix [] where
    mfix f = case fix (take 1 >=> f) of
               []    -> []
               (x:_) -> x : mfix (drop 1 . f)

How would you rewrite this snippet?

case product xs of
  1 -> foo
  n -> bar n (head xs)

Besides options in https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1243587955, one can do this:

case (xs, product xs) of
  ([], _)    -> foo
  (_, 1)     -> foo
  (x : _, n) -> bar n x

or (if you insist on exactly two clauses):

case xs of
  x : _ | n <- product xs, n /= 1 -> bar n x
  _ -> foo
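To make the two rewrites easy to try out, here they are wrapped as functions. foo and bar are hypothetical stand-ins (the original snippet does not show them), and the element type is fixed to Int only for the sake of the example:

```haskell
-- Hypothetical stand-ins for the foo and bar of the snippet.
foo :: String
foo = "foo"

bar :: Int -> Int -> String
bar n x = "bar " ++ show n ++ " " ++ show x

-- Three-clause rewrite: the empty list and product 1 both lead to foo.
threeClauses :: [Int] -> String
threeClauses xs = case (xs, product xs) of
  ([], _)    -> foo
  (_, 1)     -> foo
  (x : _, n) -> bar n x

-- Two-clause rewrite using a pattern guard.
twoClauses :: [Int] -> String
twoClauses xs = case xs of
  x : _ | n <- product xs, n /= 1 -> bar n x
  _ -> foo
```

Both versions agree with the original on every input, because product [] is 1 and so the original never reaches head on an empty list.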

How would you rewrite this snippet?

head $ filter (`notElem` hashes) $ map showt [0::Int ..]

I'd use a proper library for infinite lists aka streams: Stream, streams or infinite-list. E. g., Stream provides total head :: Stream a -> a and filter :: (a -> Bool) -> Stream a -> Stream a, so the snippet can be rewritten in a total way as

import qualified Data.Stream as S
S.head $ S.filter (`notElem` hashes) $ S.map showt $ S.iterate (+1) (0 :: Int)

infinite-list can make it even neater offering (0...) syntax to replace [0..].
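For readers without any of these packages at hand, the idea can be sketched with a tiny hand-rolled stream type. This is not the API of Stream, streams or infinite-list; all names below are hypothetical, and show stands in for showt:

```haskell
-- A stream has no empty case, so taking its head cannot fail.
data Stream a = Cons a (Stream a)

sHead :: Stream a -> a
sHead (Cons x _) = x

sMap :: (a -> b) -> Stream a -> Stream b
sMap f (Cons x xs) = Cons (f x) (sMap f xs)

sIterate :: (a -> a) -> a -> Stream a
sIterate f x = Cons x (sIterate f (f x))

-- Keeps searching until the predicate holds; it does not terminate
-- if no element ever satisfies the predicate.
sFilter :: (a -> Bool) -> Stream a -> Stream a
sFilter p (Cons x xs)
  | p x       = Cons x (sFilter p xs)
  | otherwise = sFilter p xs

-- The snippet from above, with show in place of showt.
firstFree :: [String] -> String
firstFree hashes =
  sHead $ sFilter (`notElem` hashes) $ sMap show $ sIterate (+ 1) (0 :: Int)
```

The pattern match in sHead is exhaustive, which is the point: the remaining risk is non-termination of the filter, not an exception on an empty list.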

cdsmith commented 1 year ago

Part of the issue is the idea of a "warning" can signal different things to different people, while a "notice" has perhaps less connotation. For example if we have future plans for some function in base that is not yet deprecated, a "hey! there's future plans, we're telling you now" message is arguably not a warning -- it's just a polite heads up.

Interesting thought, but as long as a "notice" continues to be printed every time you compile the code, it's effectively a warning by another name. It continues to have all the bad effects of warnings: making the build noisier, which makes it easier to miss important messages, and more likely that people will just stop paying attention to the build output entirely. What's needed is not a more polite word, but a way for someone to acknowledge and dismiss the communication. Adding {-# OPTIONS_GHC -Wno-... #-} is a perfectly adequate way to resolve some issues (though even then, I often wish such a thing wasn't module-wide), but it's just too big a hammer to have to disable all deprecations and warning pragmas in order to acknowledge a notice about partial functions. In this respect, I think the path @simonmar proposed is perfect.

googleson78 commented 1 year ago

I'd like to again point out that a proposal addressing both

* custom warnings and the ability to disable specific custom warnings
* increased granularity in regards to disabling custom warnings. I think it's important to have this at the expression level at least, because in the same module you could have a "shut up, I explicitly acknowledge it's right" use of head and an accidental one that you don't get warned about

exists - https://github.com/ghc-proposals/ghc-proposals/pull/454.

michaelpj commented 1 year ago

Not to weigh in on the discussion about what to do about head/tail, but just to comment on partial functions: the use of partial functions in HLS (and its dependencies!) has been a huge pain. It leads to utterly undiagnosable crashes of the entire application for end-users. It's frankly embarrassing to read the bug reports. I used to be relaxed about partial functions but this experience has radicalized me!

A specific warning class would be nice, though.

Ericson2314 commented 1 year ago

@gbaz writes

however, we can give a "default behavior" by calling e.g. maximum (defValue : xs) and in some cases that even reads conceptually cleaner than providing an explicit default handler.

I agree that does read cleaner, but I am hoping with Foldable1 we can do maximum1 (defValue :| xs) (name to be bikeshed), which is the best of both worlds!

Bodigrim commented 1 year ago

@Ericson2314 Data.Foldable1 already provides not only total maximum, but also a total head. Stay tuned for GHC 9.6 :)
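A small sketch of the "default value" idiom with Data.Foldable1, assuming a base version that ships this module (4.18, the one released with GHC 9.6, or later). The helper names maximumOr and firstOf are hypothetical:

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.Foldable1 as F1

-- The default takes part in the comparison, as in maximum (defValue : xs),
-- but the non-emptiness is visible in the type, so no exception is possible.
maximumOr :: Ord a => a -> [a] -> a
maximumOr defValue xs = F1.maximum (defValue :| xs)

-- Total head for any non-empty container; here specialised to NonEmpty.
firstOf :: NonEmpty a -> a
firstOf = F1.head
```

Note that maximumOr 5 [3, 1] yields 5, not 3: the default is a lower bound rather than a fallback used only for the empty list.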

Boarders commented 1 year ago

Having a good version of Foldable1 in base will be fantastic :) - I definitely think that has contributed to the misuse of partial functions.

goldfirere commented 1 year ago

This thread unfolded while I was at ICFP (and very distracted) and then sick with covid (and rather ill). I am now at home, mostly well, and catching up.

I will attempt to summarize what I see:

Of the posters on this thread, it seems more people dislike partial functions in base than like having them there -- but a significant minority still like them.
The nature of the proposed warnings do indeed have the flavor of "We don't like these functions, so you shouldn't use them". Critically the "we" and "you" are different people! This attitude can be problematic ("we know better than you, and actually we secretly think we know your preferences better than you do") or helpful ("we have experienced life with partial functions, and we hope to help you avoid the same fate"). Why are the "we" and "you" different people? Because it seems generally easy for a project / company to have a policy banning head and tail -- no proposal needed. So this is all about telling other people what to do.
There is worry that if we can't do this, we can't ever reduce the partiality of base.

Lost among all the discussion is the suggestion of creating a new type of partiality warning, and then using that. Going forward with such a plan sidesteps the problems here: those who want partial functions can have them, those that don't can enable a warning about them, and we can make progress on this issue. It does mean that we'd need a new feature in GHC before this problem can be solved, which is a downside. But the upside is a happier community (I posit). That sounds worthwhile.

tomjaguarpaw commented 1 year ago

Thanks @goldfirere. Hope you recover fully soon. A few questions.

The nature of the proposed warnings do indeed have the flavor of "We don't like these functions, so you shouldn't use them".

I'm sorry if it comes across that way. As a proponent of this proposal I wish they would have the flavour "We don't like these functions, so we shouldn't have to have them imported in every Haskell module we ever write[^1]". Do you have a suggestion for how proponents can explain things going forwards so the "flavour" is the (much more palatable) latter?

it seems generally easy for a project / company to have a policy banning head and tail

Yes, and it's also easy for a project / company to have a policy of using a custom prelude that provides head and tail without warning. How should the CLC decide which of these two easy solutions is the one that base encourages by default? I genuinely see few reasoning principles in play here. It sounds like you are advocating for the reasoning principle "preserve the status quo if change risks upsetting a portion of the community" here. Is that correct? Are there other reasoning principles the CLC ought to be using?

Lost among all the discussion is the suggestion of creating a new type of partiality warning

It wasn't lost to me. In fact I asked two important questions in response to it but I don't think they were answered. Perhaps my questions or the answers themselves got lost.

We have only so much labour available. Who is going to write this putative documentation or tooling? https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1245661365

It certainly seems plausible. What would be your time estimate for it to progress through the stages of someone to propose this GHC feature to ghc-proposals, refine it based on steering committee feedback, wait for a vote, and then wait for someone to implement it? https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250222652

[^1]: Barring NoImplicitPrelude and friends.

nomeata commented 1 year ago

It certainly seems plausible. What would be your time estimate for it to progress through the stages of someone to propose this GHC feature to ghc-proposals, refine it based on steering committee feedback, wait for a vote, and then wait for someone to implement it?

Rough personal estimate, but given that it has come up a few times, also as a solution for other things, and the design is hopefully not too tricky, I expect a proposal to pass. So maybe 6 months until it hits master, give or take? Certainly a small amount of time compared to how long we had head in Prelude.

tomjaguarpaw commented 1 year ago

Sounds great! In which case I suggest the following small change to this proposal:

Merge the warnings to head/tail as proposed here, but as soon as the functionality for fine-grained warnings hits GHC HEAD change the warnings on head/tail to fine-grained warnings that can be selectively disabled. If Joachim's prediction is correct then no one will ever see the coarse-grained warnings.

michaelpj commented 1 year ago

The nature of the proposed warnings do indeed have the flavor of "We don't like these functions, so you shouldn't use them".

I actually do have this position. It's slightly more nuanced: do whatever you want in your own application, but if you're putting a library on Hackage then I would really rather you didn't use partial functions. Or that there was some way for me to avoid libraries that do. Usage of partial functions affects everyone who uses a module, transitively. So I do genuinely want the ecosystem to change its behaviour. I realise this is "telling other people what to do" and potentially unfriendly.

simonmar commented 1 year ago

Merge the warnings to head/tail as proposed here

Do we really want to do that given that those of us who use -Werror and have existing code that will break will have to use -Wno-warnings-deprecations? There are other options but they're all bad, I strongly suspect this is what we would probably end up doing.

(why are the other options all bad? Well, not using -Werror is much worse, using -Wwarn=warnings-deprecations is just an annoyance since people ignore messages that scroll past at build time (hence -Werror). Inlining head is a terrible idea, defining your own head is equally terrible, and changing code to avoid using head is not something the person upgrading GHC for the whole company really wants to do.)

simonmar commented 1 year ago

I actually do have this position. It's slightly more nuanced: do whatever you want in your own application, but if you're putting a library on Hackage then I would really rather you didn't use partial functions. Or that there was some way for me to avoid libraries that do. Usage of partial functions affects everyone who uses a module, transitively. So I do genuinely want the ecosystem to change its behaviour. I realise this is "telling other people what to do" and potentially unfriendly.

But you don't presumably want to rule out packages that use recursion? What about vector indexing or div, or error itself? This is what I'm still perplexed about, we seem to be drawing a line somewhere to say that head is bad but some of the other kinds of partiality are not so bad. Can we be clear about where this line is drawn and the rationale for it?

tomjaguarpaw commented 1 year ago

Merge the warnings to head/tail as proposed here

Do we really want to do that given that those of us who use -Werror and have existing code that will break will have to use -Wno-warnings-deprecations?

The point of my message is that, if there is a volunteer who can implement fine-grained warnings in GHC quickly enough, you won't have to use -Wno-warnings-deprecations. Fine-grained warning support will land in the same release of GHC as the warnings for head/tail, so they can be fine-grained.

But perhaps @simonmar you have a different estimate than @nomeata on how quickly implementing fine-grained warnings in GHC would take. In which case, would you mind answering the question I put to you in https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250222652:

What would be your time estimate for it to progress through the stages of someone to propose this GHC feature to ghc-proposals, refine it based on steering committee feedback, wait for a vote, and then wait for someone to implement it?

Can we be clear about where this line is drawn and the rationale for it?

I made a stab at it in https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250319578. Could you please let me know where it is lacking so I can try to refine it?

simonmar commented 1 year ago

I made a stab at it in https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250319578. Could you please let me know where it is lacking so I can try to refine it?

Ok sure. I did read that comment and I was still perplexed. There are a bunch of easy ones (head, maximum etc.) but I deliberately pointed out the tricky ones. How do you do division safely without div? Well you have to introduce a non-zero numeric type or something. How do you do safe vector indexing? You need to use a length-parameterised vector type, and if you've read some ICFP papers over the last few years you know how difficult that is! And how do you do recursion? Being explicit about recursion might indeed be a good idea, but you said that you actually want to avoid partiality in your own code and the code that you depend on, so being explicit about recursion doesn't get you closer to that goal.

And you said that you're not against error. But code that uses error is partial, so why is that OK?

ocharles commented 1 year ago

And you said that you're not against error. But code that uses error is partial, so why is that OK?

If I were to guess it's because the real problem here is the uselessness of the error message that comes from head. error itself doesn't have that problem, because it entirely depends on what you do with it. But I may be putting words into others' mouths, so best see what they have to say!

josephcsible commented 1 year ago

This is what I'm still perplexed about, we seem to be drawing a line somewhere to say that head is bad but some of the other kinds of partiality are not so bad. Can we be clear about where this line is drawn and the rationale for it?

I don't view this as drawing a line. I view it as starting with the worst and most easily replaceable partial functions. We'd only be drawing a line if we said these were the only partial functions that we ever wanted to deprecate.

And you said that you're not against error. But code that uses error is partial, so why is that OK?

Not the person you're replying to, but my opinion is that it's because you're explicitly asking for partiality rather than it being from a case that you might not have considered. It's the same reason that we don't want holes in the type system, but unsafeCoerce is fine to keep.

googleson78 commented 1 year ago

I also have a perspective to offer on "why the easy ones only" which I believe may be shared with some of the other participants.

The reason for "the easy ones first" for me is the exact same reason as to why we use types in the first place even though we (currently) can't rule out all errors using types alone - because they reduce errors none the less. If we can't even tackle the "easy" problems first, what chance do we stand of ruling out all problems?

Incidentally(?), taking the "all or nothing" road here also sounds very close to the rhetoric used by people who argue against types in the first place (along with some other arguments coming from that same place) -

"types can't rule out logic errors" ~ "removing head still leaves other partiality"
"I know this code is right" ~ "there's another check which makes head safe"
"I can write code faster without types" ~ "writing case instead of head is a big code modification"

tomjaguarpaw commented 1 year ago

I made a stab at it in https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250319578. Could you please let me know where it is lacking so I can try to refine it?

Ok sure. I did read that comment and I was still perplexed.

Could you elaborate on what you're perplexed by? It seems to me that the questions you ask in your recent comment were already answered in my original attempt but perhaps I'm misunderstanding.

There are a bunch of easy ones (head, maximum etc.)

Right, and in that comment I said I am against the easy ones (i.e. ones where partiality is (relatively) easy to avoid).

but I deliberately pointed out the tricky ones. How do you do division safely without div? Well you have to introduce a non-zero numeric type or something. How do you do safe vector indexing? ...

Indeed, and in that comment I said I'm not against those.

And how do you do recursion? Being explicit about recursion might indeed be a good idea, but you said that you actually want to avoid partiality in your own code and the code that you depend on, so being explicit about recursion doesn't get you closer to that goal.

I personally prefer to use total recursion combinators (map, foldl', tree walks for custom data types etc.) rather than write recursive functions directly, since the latter is more likely to lead to a mistake. But questions of recursion don't seem to impinge on the design of base. Could you clarify?

And you said that you're not against error. But code that uses error is partial, so why is that OK?

I don't draw the line at partial![^1] Beyond that, error is OK by me because it's essentially the only way that we can stub out impossible code paths. I would suggest it's only used if hitting it means that some internal invariant has been violated and all bets are off anyway.
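A sketch of what "stubbing out an impossible code path" can look like. The function runLengths is a hypothetical example; the invariant it relies on (Data.List.group never yields an empty group) is stated in the error message so that a violation is at least diagnosable:

```haskell
import Data.List (group)

-- Run-length encode a list. The [] case cannot occur because group only
-- produces non-empty groups; error marks that internal invariant explicitly.
runLengths :: Eq a => [a] -> [(a, Int)]
runLengths = map summarise . group
  where
    summarise (x : xs) = (x, 1 + length xs)
    summarise []       =
      error "runLengths: Data.List.group produced an empty group (invariant violated)"
```

Using Data.List.NonEmpty.group instead would remove the impossible case altogether, which is the kind of cheap replacement the cost-benefit argument is about.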

In short, there is a cost-benefit analysis. When it comes to avoiding partial functions the benefit is avoiding the run time exception. The cost of avoiding partial functions is the cost of replacing them with something else. head/tail/maximum, for example, have easy total replacements. div, / and array indexing don't. That's why I draw the line between them, in my own code.

This is my own opinion, not CLC policy, but I hope it helps explain principles for where a line could be drawn.

@ocharles @josephcsible and @googleson78 have also interpreted my point of view fairly accurately.

[^1]: I had erroneously said I drew the line at partial but you rightly pointed out that I don't. My later message was a clarification.

Bodigrim commented 1 year ago

I think @googleson78 really nailed it, thanks!

(why are the other options all bad? Well, not using -Werror is much worse, using -Wwarn=warnings-deprecations is just an annoyance since people ignore messages that scroll past at build time (hence -Werror). Inlining head is a terrible idea, defining your own head is equally terrible, and changing code to avoid using head is not something the person upgrading GHC for the whole company really wants to do.)

@simonmar You can enable -Wwarn=warnings-deprecations when upgrading GHC, then gradually fix warnings.

Could you elaborate with some data why other opportunities are terrible? We are yet to see a compelling example, where head is an indication of good coding style.

simonmar commented 1 year ago

I don't draw the line at partial!

OK great, my apologies. This mini-thread was in response to @michaelpj above https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1260739206 who said "I would really rather you didn't use partial functions", and I was trying to understand exactly which partial functions fall into that category.

I mean, I do understand the motivation here: we went through the whole avoid-partial-functions-like-the-plague journey in one project at work, where there were solid practical reasons for it. But on the other hand, Haskell is riddled with partiality so at best what we're doing is covering over a few of the more common pitfalls, and then it's a judgement call about which pitfalls you want to avoid and whether it's worth it.

I personally prefer to use total recursion combinators (map, foldl' ...

We should be careful with foldl'... or is it repeat that's at fault here? foldl' (+) 0 (repeat 1)

tomjaguarpaw commented 1 year ago

We should be careful with foldl'... or is it repeat that's at fault here?

It may or may not surprise you that I prefer to avoid using lazy lists, especially infinite ones.

simonmar commented 1 year ago

Could you elaborate with some data why other opportunities are terrible? We are yet to see a compelling example, where head is an indication of good coding style.

You can grep the codebase I currently work on if you like: https://github.com/facebookincubator/Glean

When I looked there were ~25 occurrences of head. I don't claim any of them represent "good programming style". In fact I fixed one the other day!

Some of them are just obviously correct and clear, though. e.g.

  return $ head $ filter (`notElem` hashes) $ map showt [0::Int ..]

michaelpj commented 1 year ago

I apologise, I came on too strongly. I realise that in my mind I had started to say "partial functions" when I meant something more like "egregiously partial functions". head, fromJust and friends annoy me because a) they're common and b) it's usually not that hard to avoid the partiality. Something like div is the opposite: it's not as common, and it's harder to avoid (but maybe if I wrote more numeric code I'd be more annoyed by errors from div!).

Generally I agree with @googleson78 : it's a matter of degree and cost-benefit tradeoff. I think head and a few others are comparatively cheap to avoid and seem to be disproportionately represented amongst the actual failures I've seen in the wild.

On the topic of error... I'd actually be fine with giving error the partiality warning. If I have to set -Wno-partial on the file where I use it, I'm okay with that, it forces me to be explicit that I'm doing something potentially sketchy, much like how I feel when I have to set -Wno-orphans or whatever.

ParetoOptimalDev commented 1 year ago

Some of them are just obviously correct and clear, though. e.g.
  return $ head $ filter (`notElem` hashes) $ map showt [0::Int ..]

To be honest, it took me longer than I'm comfortable admitting to understand why that was safe :sweat_smile:

Hm, it looks like Data.List.NonEmpty makes this difficult sadly because Data.List.NonEmpty.filter turns things back into a list.

Assuming we define NE.filter :: (a -> Bool) -> NE.NonEmpty a -> NE.NonEmpty a we could have:

return $ NE.head $ NE.filter (`notElem` hashes) $ NE.fromList [0::Int ..]

I'm guessing you may not see an advantage to writing it that way since NE.fromList is still partial. I see a few advantages (ordered by importance):

By using NE.fromList we communicate this list should be NonEmpty
We segregate the partiality as far "inwards" as possible in a fashion reminiscent of how many Haskellers prefer IO at the edges of their applications
NE.fromList uses HasCallStack

I think we could also likely implement enumFromTo and, with even more effort, the [0..] syntax sugar for Data.List.NonEmpty, so that you could import Prelude hiding (head, filter) and just use the NonEmpty variants as written above.

Boarders commented 1 year ago

@ParetoOptimalDev That type signature of NE.filter looks awfully optimistic regarding the predicate passed.

alexfmpe commented 1 year ago

I'm guessing you may not see an advantage to writing it that way since NE.fromList is still partial.

There's 0 :| [1::Int ..] but the real problem is filter (const False) can't return NonEmpty, as stated above.
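A short sketch of both observations, using only Data.List.NonEmpty from base. The names naturals, rejected and firstFree are hypothetical, and show stands in for showt:

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE
import Data.Maybe (listToMaybe)

-- Non-empty by construction, no partial NE.fromList needed.
naturals :: NonEmpty Int
naturals = 0 :| [1 ..]

-- NE.filter returns an ordinary list, since a predicate may reject everything.
rejected :: [Int]
rejected = NE.filter (const False) (1 :| [2, 3])

-- The result of filtering is therefore a list again, and the caller has to
-- decide what an empty result means; here it becomes Nothing.
firstFree :: [String] -> Maybe String
firstFree hashes =
  listToMaybe $ NE.filter (`notElem` hashes) $ NE.map show naturals
```

So NonEmpty removes the partiality of building the input, but not the question of what to do when the filter leaves nothing behind.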

return $ head $ filter (`notElem` hashes) $ map showt [0::Int ..]

Assuming showt :: Show a => a -> Text, that looks odd complexity-wise. It seems that if the resulting number is n, then up to O(n * #hashes) comparisons are done. Why not store hashes as Set Int instead, then iterate on toList hashes until a free number is found, and finally update the Set? Shouldn't be more than O(#hashes)? Though some interval tree would probably be much better (might not matter for a particular, fixed, use case).
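A sketch of the Set-based idea, assuming the containers package is available and that the identifiers are kept as Int rather than Text. The function firstFree is hypothetical:

```haskell
import qualified Data.Set as Set

-- Walk the used numbers in ascending order, starting the candidate at 0.
-- Visits each element of the set at most once.
firstFree :: Set.Set Int -> Int
firstFree used = go 0 (Set.toAscList used)
  where
    go n [] = n
    go n (h : hs)
      | h < n     = go n hs        -- e.g. negative entries, irrelevant here
      | h == n    = go (n + 1) hs  -- candidate is taken, try the next one
      | otherwise = n              -- gap found before h
```

The caller would then insert the returned number into the set; no head is involved because the empty case has an obvious answer.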

IME, when I've felt the urge to reach for head and co, it's often because I'm not leveraging structure that would remove the need for a partial function, in which case I'm likely to be better served by modelling that structure and probably get other benefits besides safety.

josephcsible commented 1 year ago

We should be careful with foldl'... or is it repeat that's at fault here? foldl' (+) 0 (repeat 1)

I'd argue that the problem there isn't with either function, but rather with the type system itself, which doesn't have a way of distinguishing between inductive and coinductive data. And that's obviously way harder to fix than working towards getting rid of head.

simonmar commented 1 year ago

Assuming showt :: Show a => a -> Text, that looks odd complexity-wise.

Yes absolutely - the code can certainly be a lot more efficient. In the author's defense, this code isn't performance-sensitive, it comes from an interactive tool and length hashes is small. The real question is, could it be written in a way that is just as concise and clear but doesn't use head? Maybe it should use an explicit lazy infinite stream type, which could have a safe head defined for it. But you lose the ability to use built-in list syntax.

By the way, if you want another source of examples of uses of head, try the GHC source code. A quick grep just now turned up a lot of occurrences, picking one that looked interesting:

            iterateUntilUnchanged f eq a b
                = head $
                  concatMap tail $
                  groupBy (\(a1, _) (a2, _) -> eq a1 a2) $
                  iterate (\(a, _) -> f a b) $
                  (a, panic "RegLiveness.livenessSCCs")

which is another example of head-of-lazy-list-of-results. And just for fun it uses tail too (but the tail is trivially fixable by using NonEmpty.groupBy).
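A sketch of how that snippet might look with Data.List.NonEmpty.groupBy, outside of GHC's code base. The type signature is inferred for this standalone version, error replaces GHC's panic, and the remaining head is turned into a pattern match whose [] branch is unreachable because iterate yields an infinite list:

```haskell
import qualified Data.List.NonEmpty as NE

iterateUntilUnchanged
  :: (a -> b -> (a, c)) -> (a -> a -> Bool) -> a -> b -> (a, c)
iterateUntilUnchanged f eq a b =
  case concatMap NE.tail grouped of
    r : _ -> r
    []    -> error "iterateUntilUnchanged: unreachable, input list is infinite"
  where
    -- Groups are non-empty by type, so NE.tail is total.
    grouped =
      NE.groupBy (\(a1, _) (a2, _) -> eq a1 a2) $
      iterate (\(a', _) -> f a' b) $
      (a, error "iterateUntilUnchanged: seed result is never inspected")
```

If the iteration never stabilises, this loops forever, just as the original does.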

tomjaguarpaw commented 1 year ago

You can grep the codebase I currently work on if you like: https://github.com/facebookincubator/Glean

Thanks! That's very helpful for a case study. To put my code where my mouth is, I produced some example changes that could be applied to make the code "head free"[^1]. They fall under the following classes:

Data.List.NonEmpty.group/groupBy could be used to guarantee non-emptiness
splitOnNE :: Text -> Text -> NE.NonEmpty Text (which doesn't exist in text but should) could be used to guarantee non-emptiness
There's a list pattern match anyway, so it's easy to avoid head
I don't know why the invariant should hold, so I couldn't do anything better than pattern match and throw an error on [] (or use listToMaybe, or Data.List.NonEmpty.nonEmpty, or ...). Perhaps someone who knows the invariants better could enforce them in the type system elsewhere.
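To illustrate the first class of change, here is a minimal sketch of how Data.List.NonEmpty.group removes the need for head on each group. The function dedupe is a hypothetical example, not code from Glean:

```haskell
import qualified Data.List.NonEmpty as NE

-- Collapse runs of equal adjacent elements. With Data.List.group this is
-- typically written as map head . group; NE.group makes the head total.
dedupe :: Eq a => [a] -> [a]
dedupe = map NE.head . NE.group
```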

I guess these changes took at most one minute each. I haven't type checked, compiled or tested them.

I found the uses of head with the following hacky regex. Perhaps I missed some.

git grep "[( ]head " $(find . -iname \*.hs)

[^1]: I also got rid of one last and one tail because they were easy under the circumstances.

simonmar commented 1 year ago

Thanks! That's very helpful for a case study. To put my code where my mouth is, I produced some example changes that could be applied to make the code "head free". They fall under the following classes:

* `Data.List.NonEmpty.group/groupBy` could be used to guarantee non-emptiness

* `splitOnNE :: Text -> Text -> NE.NonEmpty Text` (which doesn't exist in `text` but should) could be used to guarantee non-emptiness

* There's a list pattern match anyway, so it's easy to avoid `head`

* I don't know why the invariant should hold, so I couldn't do anything better than pattern match and throw an `error` on `[]` (or use `listToMaybe`, or `Data.List.NonEmpty.nonEmpty`, or ...). Perhaps someone who knows the invariants better could enforce them in the type system elsewhere.

Great. So I did a rough count and there were about 8 cases that were fixed (no partiality remains) vs. 18 instances that were replaced with error.

I mean, sure it's easy to get rid of head if you just inline it. But I don't want to do that! The code is more verbose, and with head we already get an error message that points to the call site. All of these errors are of the form "some internal invariant has been violated", none of them are user-facing errors, so an error message that points to the call site is all you need. Inlining head just seems strictly worse to me.

ocharles commented 1 year ago

All of these errors are of the form "some internal invariant has been violated", none of them are user-facing errors, so an error message that points to the call site is all you need. Inlining head just seems strictly worse to me.

I guess the recurring question is why don't you want to truly encode these invariants? For example, I see Tom's fixes suggest changing a call to head occNameStr to instead match on occNameStr and have an explicit error case. The invariant here is that occNameStr is never empty - so why not state that? Why is occNameStr a [Char] in the first place? My personal take is that whenever there is an invariant, it's worth a new type. I don't insist that the type is necessarily correct by construction (for example, I might have newtype OccNameString = OccNameString String), only that it exposes an API that respects the invariants (so I could still have type_ :: OccNameString -> Char or something, even if I use [Char] under the hood).

I think this argument often gets shot down with "but then we need super dependent types to do it all!". I have rarely found that - you usually need super dependent types if you want to be extremely general. If you have specific invariants, a smart constructor and module exposing only a few select functions usually does the job.
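A rough sketch of what I mean, assuming a dedicated module whose export list hides the data constructor. `mkOccNameString` and `occNameHead` are hypothetical names used for illustration, not existing GHC functions.

```haskell
-- In a real module the export list would omit the OccNameString
-- constructor, so mkOccNameString is the only way to build a value.
newtype OccNameString = OccNameString String
  deriving (Eq, Show)

-- Smart constructor: the non-emptiness check happens once, here.
mkOccNameString :: String -> Maybe OccNameString
mkOccNameString "" = Nothing
mkOccNameString s = Just (OccNameString s)

-- Safe for callers: the empty case is unreachable as long as values
-- are only built through mkOccNameString.
occNameHead :: OccNameString -> Char
occNameHead (OccNameString (c : _)) = c
occNameHead (OccNameString []) =
  error "OccNameString: empty string (invariant broken inside the module)"
```

The representation is still a plain `String`, so this is not correct by construction; the invariant is enforced by the module boundary rather than by the type itself.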

At this point though this thread is mostly trending towards "well I would simply not write it that way", which is going to be quite personal and subjective. I'm not entirely sure you're going to reach an agreement.

simonmar commented 1 year ago

I guess the recurring question is why don't you want to truly encode these invariants?

Don't get me wrong, I've no objection at all to encoding these invariants. It's just not at the top of my todo list. (I'm not sure why adding to this thread happens to be at the top of my todo list, but I don't claim to prioritise rigorously!).

All I'm asking here is that there's a way to disable the warnings without collateral damage. -Wno-warnings-deprecations means I lose other warnings that I might be interested in. -Wwarn=warnings-deprecations isn't an option (non-fatal build-time warnings just aren't useful in large codebases with many developers; I can expand on why later if anyone is interested). Inlining head is worse in my opinion, and I don't really want to be forced to do that all over the place when I upgrade GHC. I just don't think the benefit justifies the cost here.

tomjaguarpaw commented 1 year ago

All I'm asking here is that there's a way to disable the warnings without collateral damage.

I'd like that too! In relation to that, could you please answer these two questions, which I have asked a few times. Sorry if I have missed the answer in the stream of comments but if so would you mind reanswering?

We have only so much labour available. Who is going to write this putative documentation or tooling? https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1245661365

It certainly seems plausible. What would be your time estimate for it to progress through the stages of someone to propose this GHC feature to ghc-proposals, refine it based on steering committee feedback, wait for a vote, and then wait for someone to implement it? https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250222652

tomjaguarpaw commented 1 year ago

... and to perhaps be slightly more direct about it: if someone wants this proposal to be improved by the addition of fine-grained warnings to GHC, can they please start the process of proposing that to GHC right now? Otherwise it sounds like saying "I'm in favour of this proposal ... but only in an imaginary world".

adamgundry commented 1 year ago

Perhaps it's been lost in all the noise, but there is already a proposal in this direction (https://github.com/ghc-proposals/ghc-proposals/pull/454) although it currently includes more type-level features than strictly needed here, and it might make sense to do something less ambitious (https://github.com/ghc-proposals/ghc-proposals/pull/454#issuecomment-1245044707). I'd appreciate feedback from interested parties on the proposal thread.

ParetoOptimalDev commented 1 year ago

@ParetoOptimalDev That type signature of NE.filter looks awfully optimistic regarding the predicate passed.

Do you mean where I left in (const True), an artifact of where I was testing the code compiled in ghci to ensure I didn't recommend something that couldn't work?

That was a simple mistake from a comment left with a headache right before bed.

It feels pretty bad faith to focus on that part specifically.

Boarders commented 1 year ago

@ParetoOptimalDev That type signature of NE.filter looks awfully optimistic regarding the predicate passed.

Do you mean where I left in (const True), an artifact of where I was testing the code compiled in ghci to ensure I didn't recommend something that couldn't work?

That was a simple mistake from a comment left with a headache right before bed.

It feels pretty bad faith to focus on that part specifically.

No, I mean there can be no such filter function for NonEmpty list since the predicate could be false for all inputs. A point I made earlier in the thread is that in order to use NonEmpty as a type-safe alternative to head one sometimes needs to convert from a predicate passed to filter or a predicate on a list to a constructive proof. As noted by @mixphix this typically leads to better code and can have performance benefits, but it is not a mechanical transformation and instead requires thought.
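To illustrate with functions that already exist in base: `NE.filter` returns an ordinary list, and `NE.nonEmpty` turns that list back into a `Maybe (NonEmpty a)`, which forces the caller to handle the "nothing matched" case. The helper names `filterNE` and `firstMatch` below are hypothetical.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- NE.filter :: (a -> Bool) -> NonEmpty a -> [a]
-- The result may be empty, so the honest return type is a Maybe.
filterNE :: (a -> Bool) -> NonEmpty a -> Maybe (NonEmpty a)
filterNE p = NE.nonEmpty . NE.filter p

-- First element satisfying the predicate, if there is one.
firstMatch :: (a -> Bool) -> NonEmpty a -> Maybe a
firstMatch p = fmap NE.head . filterNE p
```

Getting from `Maybe a` to a plain `a` is the part that requires a real argument for why a match must exist, which is why the transformation is not mechanical.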

ParetoOptimalDev commented 1 year ago

There's 0 :| [1::Int ..] but the real problem is filter (const False) can't return NonEmpty, as stated above.

That's why I first wrote:

Assuming we define NE.filter :: (a -> Bool) -> NE.NonEmpty a -> NE.NonEmpty a we could have:

Or the full code:

-- foo.hs
import Prelude hiding (head, filter)
import qualified Data.List.NonEmpty as NE

-- NOTE: deliberately using unsafe `NE.fromList`, in practice we'd want to write it safely
filter' :: (a -> Bool) -> NE.NonEmpty a -> NE.NonEmpty a
filter' x y = NE.fromList $ NE.filter x y

main :: IO ()
main = print $ NE.head $ filter' (const True) $ NE.fromList [0::Int ..]

-- running it:
-- $ runhaskell foo.hs
-- 0

ParetoOptimalDev commented 1 year ago

This is no better than the use of head. NE.fromList is a partial function.

That's why I wrote the comment above of:

NOTE: deliberately using unsafe NE.fromList, in practice we'd want to write it safely

Boarders commented 1 year ago

If you want to add new partial functions to Data.List.NonEmpty you should make a separate proposal, I would be against it.

tomjaguarpaw commented 1 year ago

There have been a few accusations of bad faith, or similar, in this thread. Can I suggest that we all default to assuming the best of each other, that we are all collaborating to try to improve our language, that the debate is held with good intentions, and that perceived offence is more likely due to mismatched communication expectations than actual ill-will?

ParetoOptimalDev commented 1 year ago

If you want to add new partial functions to Data.List.NonEmpty you should make a separate proposal, I would be against it.

I do not want to add partial functions to Data.List.NonEmpty.

ocharles commented 1 year ago

It looks to me like you're talking past each other a bit. @ParetoOptimalDev I think what @Boarders is trying to tell you is that there is no way to write a filter :: (a -> Bool) -> NonEmpty a -> NonEmpty a that is not partial. filter (const False) for any list has to return an empty list, it can't return a NonEmpty a. Unless I'm misunderstanding you, you seem to be saying that such a function is possible - at least this is what I infer from "in practice we'd want to write it safely". If I'm wrong, perhaps the "it" here refers to something other than NonEmpty.filter?

simonmar commented 1 year ago

We have only so much labour available. Who is going to write this putative documentation or tooling? https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1245661365

I don't think that question was directed at me (or was it? Maybe I missed something). The comment you were replying to was https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1245607961 by @Boarders

Just in case you were asking me - no I'm not volunteering to write more docs! I'm here to explain why I don't think this proposal is a good idea to accept in its current form, and to suggest a way that would achieve the goals without the downsides, that's all.

It certainly seems plausible. What would be your time estimate for it to progress through the stages of someone to propose this GHC feature to ghc-proposals, refine it based on steering committee feedback, wait for a vote, and then wait for someone to implement it? https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1250222652

My apologies, I thought @nomeata had answered this well enough (https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1260518496). About 6 months? If it helps, I volunteer to shepherd the proposal through the committee, I don't imagine it would be very controversial.

There have been a few accusations of bad faith, or similar, in this thread

If that was me, I'm sorry. No bad faith intended. Perhaps some too-hasty reading and typing.

ParetoOptimalDev commented 1 year ago

Yes, I think you are right. It does seem that we can't safely create NonEmpty.filter :: (a -> Bool) -> NE.NonEmpty a -> NE.NonEmpty a.

Maybe we could have something like filter that always returns at least the initial NonEmpty list.

Or, this approach might not be best and there's a better way to rewrite the example given without using head that isn't too inconvenient:

return $ head $ filter (`notElem` hashes) $ map showt [0::Int ..]

ocharles commented 1 year ago

Or, this approach might not be best and there's a better way to rewrite the example given (https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1261427242) without using head that isn't too inconvenient:

The "correct" type here is an infinite non-empty stream:

return $ S.head $ S.filter (`notElem` hashes) $ S.map showt $ S.enumFrom (0 :: Int)

@simonmar objected to this earlier because it gives up list syntax.
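A self-contained sketch of the idea, using a hand-rolled `Stream` type as a stand-in for whatever the `S` module would be, and `show` in place of `showt` so that it needs only base. All names here are hypothetical.

```haskell
-- An infinite stream: there is no empty constructor, so headS is total.
data Stream a = Cons a (Stream a)

enumFromS :: Enum a => a -> Stream a
enumFromS x = Cons x (enumFromS (succ x))

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Cons x xs) = Cons (f x) (mapS f xs)

-- No empty case to handle, but this loops forever if nothing matches.
filterS :: (a -> Bool) -> Stream a -> Stream a
filterS p (Cons x xs)
  | p x = Cons x (filterS p xs)
  | otherwise = filterS p xs

headS :: Stream a -> a
headS (Cons x _) = x

-- First decimal string, counting up from 0, that is not already taken.
freshName :: [String] -> String
freshName hashes =
  headS (filterS (`notElem` hashes) (mapS show (enumFromS (0 :: Int))))
```

The partiality of `head` is gone, but note that it has been traded for possible non-termination in `filterS`; here that is harmless because `hashes` is finite and the candidates are all distinct.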

tomjaguarpaw commented 1 year ago

My apologies, I thought @nomeata had answered this well enough (https://github.com/haskell/core-libraries-committee/issues/87#issuecomment-1260518496). About 6 months? If it helps, I volunteer to shepherd the proposal through the committee, I don't imagine it would be very controversial.

Super, thanks!

If that was me, I'm sorry. No bad faith intended. Perhaps some too-hasty reading and typing.

No need to worry, it's only happened a small number of times, and none of them were from you! Thanks to everyone for continuing to debate in a constructive manner :)

simonmar commented 1 year ago

return $ S.head $ S.filter (`notElem` hashes) $ S.map showt $ S.enumFrom (0 :: Int)

@simonmar objected to this earlier because it gives up list syntax.

It's a weak objection; this version is OK.

ParetoOptimalDev commented 1 year ago

There have been a few accusations of bad faith,

I was trying to avoid accusation of bad faith by saying:

feels like bad faith

Rather than:

is bad faith

My goal included:

assuming the best of each other

But it felt important to communicate that the good-faith assumption didn't feel reciprocated, without making an explicit accusation of bad faith.
