haskell / core-libraries-committee

95 stars 16 forks source link

Proposal: Remove method (/=) from class Eq #3

Closed nomeata closed 3 years ago

nomeata commented 3 years ago

@Bodigrim suggested I take the proposal that I recently submitted to libraries@ here, to play guinea pig for this process.

What

I suggest to change

class Eq a where (==), (/=) :: a -> a -> Bool

to

class Eq a where (==) :: a -> a -> Bool

and turn (/=) into a normal function.

Why

Why not (and rebuttals)

Most of these are taken from the libraries thread; rebuttals are mine.

djduke commented 3 years ago

I'm -1 on this change. I've >10 years experience teaching Haskell and its never been an issue with teaching. On the contrary Eq is a simpler case studies to introduce laws, rather than wait until students encounter Applicative etc I've sometimes found it simpler to think terms of dfining /=, I've never worried about performance from this issue, ther eare usually lower hanging fruits elsewhere.

isovector commented 3 years ago

I'm +1. This is a fantastic simplifying change. There's no reason not to do this, other than extremely trivial breakages. And I would really like to live in a world where we don't let the burden of already-written code get in the way of turning Haskell into the language we want it to be.

cartazio commented 3 years ago

this breaks code for no reason. Lets not touch this till we have better support for warning about redundantly defined type class methods such as having a "XOR" warning minimal pragma change such as I suggest in https://mail.haskell.org/pipermail/libraries/2021-October/031490.html

mixphix commented 3 years ago

I'm +1 on this. x /= y = not (x == y) can and should be defined outside of the class.

As it stands, Haskell's type class system does not have a framework for "class laws": at best, they are guidelines for how types fitting that class should behave. There is no way within a class' definition to guarantee that only those types with lawful implementations should be bestowed the class. The burden lies on the library designer to help guide their users away from footguns; unlawful instances are some of those. By removing (/=) from the class definition and forcing users only to write one method, the opportunity for bugs to arise diminishes and they are easier to spot when they do.

Also, who really writes Eq instances by hand in production code, anyway? Be honest. The vast, vast majority of users derive Eq for the datatypes they define. And by making Eq a single-method typeclass, this deriving gets even easier. You still get (/=) for free, it just so happens it's not in the class definition.

Thanks for raising this issue, @nomeata!

Bodigrim commented 3 years ago

To facilitate internal CLC discussions, I’d like to gather some additional input:

  1. If we were to design Haskell from scratch today, would we prefer one or two members of Eq? Are there arguments to prefer two today?
  2. Does this change affect educational materials or text books? Could anyone bring specific examples?
  3. Are proponents of the change willing to facilitate ecosystem’s migration by raising PRs to affected libraries? Are maintainers willing to accept such PRs? CC @phadej for postgresql-simple, @paul-rouse for mysql-simple, @treeowl for containers.
  4. What is an impact of this change on GHC and compilation times? CC @AndreasPK.
nomeata commented 3 years ago

Does this change affect educational materials or text books? Could anyone bring specific examples?

Concrete example: In a tutorial of my own I explain the concept of default methods when I introduce the Eq class. This section would change, and default implementations only explained later when they first come up. (Unsurprisingly,) I would welcome that change; it would allow me introduce default methods with a type class where they make more sense (e.g. Ord).

phadej commented 3 years ago

1 If we were to design Haskell from scratch today, would we prefer one or two members of Eq? Are there arguments to prefer two today?

I'd probably flip a coin

2 Does this change affect educational materials or text books? Could anyone bring specific examples?

Hutton's Programmin gin Haskell says in 3.9 basic classes section

Eq - equality types

... using following two methods

(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool

However, the presentation in that chapter just tells these functions exist, it doesn't yet speak about defining own instances.

I don't think that would trap future readers of an old edition book who use newer GHC

3 Are proponents of the change willing to facilitate ecosystem’s migration by raising PRs to affected libraries?

See below. Though I'm not proponent nor opponent of this change, I just want to dismiss the narrative that a lot stuff would break. I didn't believe it. Judge yourself.

Are maintainers willing to accept such PRs?

I'd accept a PR, though I'd just do it myself during the usual new-GHC-support work. (In my case, to have a comparison example, removal of Option generates/d more work).


I built Stackage LTS-18.8 set (which I had around for other experiment). Out of ~2600 packages (snapshot is slightly larger, but I left out some packages needing native dependencies etc.) the following ~45 needed some changes:

packages: attoparsec-0.13.2.5/
packages: backprop-0.2.6.4/
packages: basement-0.0.12/
packages: basic-prelude-0.7.0/
packages: butcher-1.3.3.2/
packages: clash-lib-1.4.3/
packages: clash-prelude-1.4.3/
packages: clientsession-0.9.1.2/
packages: compdata-0.12.1/
packages: compensated-0.8.3/
packages: conduit-1.3.4.1/
packages: DBFunctor-0.1.2.1/
packages: dimensional-1.4/
packages: eventstore-1.4.1/
packages: fb-2.1.1/
packages: geojson-4.0.2/
packages: ghc-lib-parser-8.10.7.20210828/
packages: haskell-gi-base-0.25.0/
packages: HaskellNet-0.6/
packages: hOpenPGP-2.9.5/
packages: hybrid-vectors-0.2.2/
packages: intro-0.9.0.0/
packages: io-streams-1.5.2.1/
packages: loc-0.1.3.10/
packages: mixed-types-num-0.5.9.1/
packages: multistate-0.8.0.3/
packages: numbers-3000.2.0.2/
packages: numeric-prelude-0.4.3.3/
packages: postgresql-libpq-0.9.4.3/
packages: postgresql-simple-0.6.4/
packages: prelude-compat-0.0.0.2/
packages: primitive-addr-0.1.0.2/
packages: protolude-0.3.0/
packages: relude-0.7.0.0/
packages: sbv-8.15/
packages: semirings-0.6/
packages: singletons-2.7/
packages: snap-core-1.0.4.2/
packages: sqlite-simple-0.4.18.0/
packages: string-class-0.1.7.0/
packages: tfp-1.0.2/
packages: type-level-natural-number-2.0/
packages: type-natural-1.1.0.0/
packages: unboxing-vector-0.2.0.0/
packages: uncertain-0.3.1.0/
packages: unification-fd-0.11.1/
packages: vector-0.12.3.0/

(and mysql-simple which I didn't build).

Most changes are oneliners. Either removing explicit /= definition, or adding /= import.

I can upload patches somewhere, if anyone is interested to take a closer look at them. (I won't submit them upstream, though some are probably good even without the change going in, i.e. removing unnecessary definitions of /=).

Few packages are more interesting:


clash-prelude

This is interesting application. I'd ask if the change makes things less convenient.

The /= is defined to be some "primitive" (which is NOINLINE, and in future OPAQUE I suppose) for which Clash looks in the Core. That's my understanding.


alternative or special purpose preludes:

protolude, basic-prelude, haskell-gi-base, relude needed an amendment to keep the re-export list intact. Similarly basement, though it also needed a fix in itself.


GHC (like) code: clash-lib and ghc-lib-parser, "I have seen this code".


complicated /= example

https://hackage.haskell.org/package/numbers package illustrates the "are == and /= compatible, or do they differ?" problem.

data Interval a = I a a

ival :: (Ord a) => a -> a -> Interval a
ival l h | l <= h = I l h
         | otherwise = error "Interval.ival: low > high"

instance (Ord a) => Eq (Interval a) where
    I l h == I l' h'  =  l == h' && h == l'
    I l h /= I l' h'  =  h < l' || h' < l

https://hackage.haskell.org/package/DBFunctor package has:

-- | Definition of the Relational Data Type. This is the data type of the values stored in each 'RTable'.
-- This is a strict data type, meaning whenever we evaluate a value of type 'RDataType', 
-- there must be also evaluated all the fields it contains.
data RDataType = 
        RInt { rint :: !Integer }
      | RText { rtext :: !T.Text }
      | RUTCTime { rutct :: !UTCTime}
      | RDate { 
                rdate :: !T.Text
               ,dtformat :: !Text  -- ^ e.g., "DD\/MM\/YYYY"
            }
      | RTime { rtime :: !RTimestamp  }
      | RDouble { rdouble :: !Double }
    -- RFloat  { rfloat :: Float }
      | Null

-- | We need to explicitly specify equation of RDataType due to SQL NULL logic (i.e., anything compared to NULL returns false):
-- @
-- Null == _ = False,
-- _ == Null = False,
-- Null /= _ = False,
-- _ /= Null = False.
-- @
-- IMPORTANT NOTE:
-- Of course this means that anywhere in your code where you have something like this: 
-- @
-- x == Null or x /= Null, 
-- @
-- will always return False and thus it is futile to do this comparison. 
-- You have to use the is 'isNull' function instead.
--
instance Eq RDataType where

    RInt i1 == RInt i2 = i1 == i2
    -- RInt i == _ = False
    RText t1 == RText t2 = t1 == t2
    -- RText t1 == _ = False
    RDate t1 s1 == RDate t2 s2 = toRTimestamp (unpack s1) (unpack t1) == toRTimestamp (unpack s2) (unpack t2)   -- (t1 == t1) && (s1 == s2)
    -- RDate t1 s1 == _ = False
    RTime t1 == RTime t2 = t1 == t2
    -- RTime t1 == _ = False
    RDouble d1 == RDouble d2 = d1 == d2
    -- RDouble d1 == _ = False
    -- Watch out: NULL logic (anything compared to NULL returns false)
    Null == Null = False
    _ == Null = False
    Null == _ = False
    -- anything else is just False
    _ == _ = False

    Null /= Null = False
    _ /= Null = False
    Null /= _ = False
    x /= y = not (x == y) 

I.e. it's a DSL trying to emulate SQL-like processing in Haskell. Not able to do weird thing SQL does with NULl equality would not be nice. (a lot more inconvenient then for postgresql-simple, where I actually have no idea what Eq Null instance is useful for).

However, when removing the /= none of the tests of DBFunctor failed.


paul-rouse commented 3 years ago

It would be good to get away from the blatantly unlawful instance in mysql-simple! This predates my time as maintainer, so I can only assume the idea was along the lines of the comment in DBFunctor mentioned above. However, it is a poor reflection of modern SQL semantics, and could be very confusing. I would probably just remove the Eq instance on Null, but it is, unfortunately, exposed to users of the library. I doubt if anybody actually uses it, and I've checked that persistent-mysql doesn't, but it's possible that something somewhere would break (I haven't yet tried compiling all of stackage against this change).

Exactly the same code appears in postgresql-simple, so I assume the same applies there.

phadej commented 3 years ago

However, it is a poor reflection of modern SQL semantics

The poorness there is try to model 3VL with two-valued Bool. But I think 3VL way of presenting the NULL comparisons is not particularly new.

https://en.wikipedia.org/wiki/Null_(SQL)#Comparisons_with_NULL_and_the_three-valued_logic_(3VL)

AndreasPK commented 3 years ago

In terms of performance I would expect for real world code:

So from a performance PoV, if we were to design it today there would be a reason to have it be a single method. But it's unclear if it's worth breaking packages now for the small benefit it provides.

Am 27/10/2021 um 19:45 schrieb Bodigrim:

To facilitate internal CLC discussions, I’d like to gather some additional input:

  1. If we were to design Haskell from scratch today, would we prefer one or two members of |Eq|? Are there arguments to prefer two today?
  2. Does this change affect educational materials or text books? Could anyone bring specific examples?
  3. Are proponents of the change willing to facilitate ecosystem’s migration by raising PRs to affected libraries? Are maintainers willing to accept such PRs? CC @phadej https://github.com/phadej for |postgresql-simple|, @paul-rouse https://github.com/paul-rouse for |mysql-simple|, @treeowl https://github.com/treeowl for |containers|.
  4. What is an impact of this change on GHC and compilation times? CC @AndreasPK https://github.com/AndreasPK.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/haskell/core-libraries-committee/issues/3#issuecomment-953165344, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJ65ZWUO6ZWCLDJSM4IMD3UJBCGJANCNFSM5GZ2YRQQ. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

Bodigrim commented 3 years ago

CC @christiaanb for clash-prelude, @nkarag for DBFunctor.

Icelandjack commented 3 years ago

Using type class backends we could transparently evolve Eq to BackendEq with only one method

phadej commented 3 years ago

Using type class backends we could transparently evolve Eq to BackendEq with only one method

Something like Eq should be a simple as can be imaginable. Original motivation says:

Teaching: The Eq class is often the first, or one of the first, type classes that beginners will face.

gwils commented 3 years ago

I find it hard to form a particularly strong opinion on this either way. It seems like a fairly minor breakage for very minor wins. Some thoughts: I don't weight teaching very heavily in decision-making about Prelude. I think most educational material would be better served defining a custom prelude, with less polymorphic types, fewer and simpler classes, etc. I'm also perfectly happy to tell students useful lies, for example, I could just present Eq as having one method, and students writing their own instances would be able to do so happily without knowing (/=) is a class method, or even exists! I'm somewhat sympathetic to the argument that a single method makes law-breaking instances harder, but Eq is rather weak in its laws; in deference to the practically defunct Haskell Report, the Eq laws are explicitly merely "encouraged" (if you'll allow me to side-step Melanie's point that all laws in Haskell are effectively suggestions anyway). I would be more sympathetic to this change if we took a firmer stance on Eqs lawfulness. Inequality checks are available as an instruction for some types, aren't they? Wouldn't that mean some types will have worse performance with this change (an unnecessary function call to not vs. a single instruction). I confess I'm not very knowledgeable at that level of abstraction.

Right now I'm a very weak -1 on this.

int-index commented 3 years ago

As a Haskell user I probably wouldn’t notice this (I always derive Eq), but back when I was learning Haskell I thought it was pretty neat that the language doesn’t take a stance whether == or /= is the fundamental operation. It’s a nice way to introduce the notion of minimal complete definition (i.e. define a method of your choice and get the other one for free).

Really, why is your proposal to remove /= and keep == rather than to remove == and keep /=? What makes one more fundamental than the other? I used to find it quite elegant that we don’t have to make these arbitrary choices (nowadays I don’t care).

To summarise, I’m leaning slightly against this proposal, but wouldn’t lose my sleep over it.

ksqsf commented 3 years ago

Does the same reasoning apply to (<$)? It is currently defined in the Functor class. If we remove (/=) from Eq, should we remove (<$) from Functor too?

gbaz commented 3 years ago

Nope. As the docs to that say: " The default definition is fmap . const, but this may be overridden with a more efficient version." The argument for removing /= is that because negation of booleans is cheap, there is no more efficient version.

Bodigrim commented 3 years ago

When we add a type class member, which can be expressed via others, there is only one possible excuse to do so: one can provide a more efficient implementation than default. Since this is not the case for (/=), its presence in Eq just opens a window for incoherence without any benefit. Neither (==) nor (/=) is more fundamental than another, but having them both is asking for troubles.

Given that the migration path is straightforward and backward compatible and that maintainers of SQL libraries above are on board with cleaning infelicities in instance Eq Null, I'm +1 on this change.

phadej commented 3 years ago

In some settings /= is a primitive operation and == is derived from it.

This is speculation. @nomeata's test indicate that it's not an issue. (Compiling Cabal, see the mailing post https://mail.haskell.org/pipermail/libraries/2021-October/031484.html). @nomeata have you already run nofib too?

tomjaguarpaw commented 3 years ago

Many people with good judgement believe there is some merit to this proposal. I confess I cannot see what they see. I am very much against churn for the sake of minor improvement. The Haskell community already has a reputation of insufficiently valuing stability.

On the other hand, perhaps this change will have minimal impact. Indeed the only way I could countenance it is if there is no impact on maintainers. That seems to imply that someone must take responsibility for ensuring that the entirety of Hackage builds with this change before the change is actually applied. If that happens then I won't oppose the proposal (not that my voice carries any weight) but I still can't understand the motivation.

nomeata commented 3 years ago

@nomeata have you already run nofib too?

No, I did not. I shy away from performance measurements because it's so hard to get them meaningfully right, at least for runtime changes in sub percentage space.

phadej commented 3 years ago

@nomeata check nofib perf options. These are good as they don't count wall clock:

Usage: nofib-run [-c|--clean] [TEST] [-j|--threads N] [-w|--compiler HC] 
                 [--compiler-arg ARG] (-o|--output DIR) [--cachegrind] 
                 [--cachegrind-arg ARG] [--perf] [[--perf-arg ARG]] 
                 [-s|--speed ARG] [--rts-arg ARG] [-t|--times ARG] 
                 [--keep-going] [--head] [-V|-v|--verbosity ARG]
  nofib Benchmark Suite

Available options:
  -h,--help                Show this help text
...

  --perf                   Run the tests under `perf stat`
santiweight commented 3 years ago

I've seen a number of proposals on and off like this recently. I think the reason is likely that there's a an energy to "fixing" Haskell, which I think is a great symptom and energy in the community. However, I advocate against this change, not because the change is bad in isolation, but because accepting this proposal would be symptomatic of a poor and inconsiderate design process, and a lack of unifying philosophy in the CLC.

My understanding is that accepting such a proposal would mean that the CLC is saying "we will accept any improvement to the Haskell core libraries that has a minor cost and is an improvement on the past situation."

For example, we can take a similar proposal here from @isovector which aimed to export liftA2 and join from Prelude which is a seemingly innocent change, just like this proposal from @nomeata. That proposal seems to satisfy a similar place to this Eq proposal, namely it is small and not-too-breaking.

But let's assume we accept this change. Now I come along and look at Ord, and I say "huh", why include anything other than <=? I won't argue for or against the content of such a proposal... But note now that we have 3 breaking changes, each come up with "on a whim" (which is not bad), and 3 breaking changes to Prelude that seem:

  1. Unaware of each other
  2. Unclear whether they actually address the principle they attempt to uphold (for example this Eq proposal which attempts to prevent unlawful Eq does not do so for `Ord... Why is that?)
  3. Minor changes that alone are not a big deal but start to add up... I could probably find 10 such proposals if I tried...
  4. Potentially happening across different base releases

Note also that the change to Eq might cause a worse situation... What if the breakage caused by the change to Ord is actually too substantial (I personally once had a unlawful Ord instance) and we decide that that typeclass cannot be broken. Now we have one typeclass that is "right" and one typeclass that is "wrong". That situation is worse than today! Next year we'll have someone propose including /= in Eq ;)

The point being, we shouldn't accept that these proposals would each require more files to have CPP code in them without really auditing base and doing such a change in one fell swoop. Plus we should really consider developing a migration tool for Haskell (and non-text-based conditional compilation), because these problems will keep coming up and library maintainers will eventually be unable to use linters if we don't give them the attention they deserve.

In sum, such a change cannot and should not be an isolated change. The real proposal as I see it is "Remove default typeclass methods from typeclasses where their implementation can upset the class's laws". That is a very reasonable proposal, whose actual costs we can discuss fully by inspecting all unlawful classes and a subsequent migration strategy. That way, we can also talk about three or four "principles" being fixed in base in a major version instead of a hodge-podge of unrelated, but each seemingly reasonable, whack-a-moles updates!

isovector commented 3 years ago

@santiweight's comment is in line with my recent thoughts here. There are lots of really good changes we should make to base, all of which keep getting shot down for exactly the same reason of "too much churn." But that doesn't mean we shouldn't do them! It just means we should pick some particular time/release and put all of our big breaking changes into that basket.

ofc a better solution would be to unprivilege base so that it's unrelated to GHC versions, and then just use semver for it like we do with literally every other package.

phadej commented 3 years ago

In sum, such a change cannot and should be an isolated change. The real proposal as I see it is "Remove default typeclass methods from typeclasses where their implementation can upset the class's laws".

But *it is not**. The proposal says

(The Ord class is much better in that regard, for example (<=) can be implemented more efficiently than compare.)

And as other have pointed out, it's difficult to figure out where /= is more efficient then ==.

The examples brought by @effectfully are all with a disclaimer: "(I don't really use the Eq class there)".

phadej commented 3 years ago

and then just use semver for it like we do with literally every other package.

PVP is semantic versioning scheme. There are problems:

Add a Typeable constraint to fromStaticPtr in the class GHC.StaticPtr.IsStatic.

Otherwise I agree, base which is decoupled from GHC release cycle would be great, but the infrastructure is not there yet. It cannot be done any time soon.

It would greatly reduce the compatibility burden. it would be enough to make a version of base to be compatible with 3 latest GHCs, and not worry about that the downstream code can be written against three different bases.

But that all just wishes. If you don't like the churn, please push for solving the underlying problems, not blocking changes in the mean time.

santiweight commented 3 years ago

@phadej I don't I made my point totally clear in that regard, the point being that the removing <= or compare would each have their arguments, and the fact that the removal of one or the other could be a reasonable proposal means that we will always be catching up (addressing new proposals etc.) if we don't initially decide on what principle guides the design of such proposals, so we can easily say "this proposal fails on this principle" for example. As it stands, each time a proposal comes out we start from 0 and the discussion is repeating itself.

Ericson2314 commented 3 years ago

I would like some large scale coordination on how to make decoupled base a reality. Otherwise we get these massive threads with no easy answers and the CLC can only easily do things at the margins.

I think we can do it today with CPP, and then we will want to invest in things like better orphans and better backpack to make the interfaces cleaner.

santiweight commented 3 years ago

But that all just wishes. If you don't like the churn, please push for solving the underlying problems, not blocking changes in the mean time.

I am pushing for the underlying problems! I am currently trying to come up with a non-text-based pre-processor/conditional-compilation extension that would allow a migration tool to be written for changes to namespace etc. I posted in the slack channel yesterday in fact and (once some current personal life stuff is done) I will try to get some implementation over Christmas. But the point also stands that this proposal's acceptance is a worrying and a _realworld slippery slope (slippery slopes are valid arguments if they happen), and I am personally not trying to "block changes" but instead to protect users, which is, after-all the whole point of the core libraries. The core principle here is not "make states unrepresentable, but make users happy and productive - the second principle implies the first only some of the time, for the reasons outlined above and in the proposal

phadej commented 3 years ago

@santiweight

I don't I made my point totally clear in that regard, the point being that the removing <= or compare would each have their arguments,

No they won't. compare and <= has more efficient implementations. And GHC derives them:

{-# OPTIONS -ddump-deriv #-}
module MyMaybe where

data MyMaybe a  =  MyNothing | MyJust a
  deriving ( Eq, Ord )
==================== Derived instances ====================
Derived class instances:
  instance GHC.Classes.Eq a =>
           GHC.Classes.Eq (MyMaybe.MyMaybe a) where
    (GHC.Classes.==) (MyMaybe.MyNothing) (MyMaybe.MyNothing)
      = GHC.Types.True
    (GHC.Classes.==) (MyMaybe.MyJust a1_a2cD) (MyMaybe.MyJust b1_a2cE)
      = ((a1_a2cD GHC.Classes.== b1_a2cE))
    (GHC.Classes.==) _ _ = GHC.Types.False

  instance GHC.Classes.Ord a =>
           GHC.Classes.Ord (MyMaybe.MyMaybe a) where
    GHC.Classes.compare a_a2cF b_a2cG
      = case a_a2cF of
          MyMaybe.MyNothing
            -> case b_a2cG of
                 MyMaybe.MyNothing -> GHC.Types.EQ
                 _ -> GHC.Types.LT
          MyMaybe.MyJust a1_a2cH
            -> case b_a2cG of
                 MyMaybe.MyJust b1_a2cI -> (a1_a2cH `GHC.Classes.compare` b1_a2cI)
                 _ -> GHC.Types.GT
    (GHC.Classes.<) a_a2cJ b_a2cK
      = case a_a2cJ of
          MyMaybe.MyNothing
            -> case b_a2cK of
                 MyMaybe.MyNothing -> GHC.Types.False
                 _ -> GHC.Types.True
          MyMaybe.MyJust a1_a2cL
            -> case b_a2cK of
                 MyMaybe.MyJust b1_a2cM -> (a1_a2cL GHC.Classes.< b1_a2cM)
                 _ -> GHC.Types.False
    (GHC.Classes.<=) a_a2cN b_a2cO
      = GHC.Classes.not ((GHC.Classes.<) b_a2cO a_a2cN)
    (GHC.Classes.>) a_a2cP b_a2cQ = (GHC.Classes.<) b_a2cQ a_a2cP
    (GHC.Classes.>=) a_a2cR b_a2cS
      = GHC.Classes.not ((GHC.Classes.<) a_a2cR b_a2cS)

Note: there is no /= definition. Default is used. There is both compare and <

phadej commented 3 years ago

is plain wrong as in certain rare cases there's no sensible way of expressing /= via == at all, while it works perfectly well the other way around.

Please write down your example, so it's not a speculative "in some rare cases I almost run in private codebase". Provide a minimal example.

I don't understand your DSL-ish like example

If I was not to compile the expression right after turning a constructor representing == into a constructor representing /= and instead used an intermediate representation,

How you get there, == has to return Bool, not some AST node type?

effectfully commented 3 years ago

How you get there, == has to return Bool, not some AST node type?

Right, I got myself confused. Thank you for catching that. I dismiss my nonsensical points. I'll delete the noise.

santiweight commented 3 years ago

@phadej I appreciate the feedback. I don't want to deny the fact that clearly Ord is different than Eq as you presented. Thank you for explaining that to me

I don't want us to accidentally hijack the thread, but I can equally go ahead and bring up the point for other typeclasses. How about Enum - let's kill succ pred etc. How about Monad of no return - that is the same issue as this current proposal just that the cost is higher. Is the argument purely one of efficiency? Why then is even : Integral a => a -> Bool not defined in the type class, which has a faster implementation for binary representation than for other representations? Why is (*>) :: f a -> f b -> f b in Applicative instead of defined outside of the typeclass? I don't know that anyone has ever defined their own version of *>...

Anyway, the fact that we are having this conversation, and I am not being linked to some documentation saying "This is when unlawfullable typeclasses are allowed" is the evidence I need to suggest that we have an underlying problem that we will all continually waste time on if we can't cache all the points and criticisms in one place.

tomjaguarpaw commented 3 years ago

I don't know that anyone has ever defined their own version of *>...

This is becoming very tangential now, but they are treading on dangerous ground if they don't. See forever contains a space leak.

santiweight commented 3 years ago

@tomjaguarpaw I definitely agree, let's summarise my aspect of the feedback (besides thinking that the migration is more costly than I see acknowledged), as this:

Anyway, the fact that we are having this conversation, and I am not being linked to some documentation saying "This is when unlawfullable typeclasses are allowed" [is the real issue here]

I think the CLC could come up some such principles and guidelines so that this discussion can focus on a more narrow part of the massive design-space that is base.

phadej commented 3 years ago

about Enum - let's kill succ pred etc.

Enum is bad class, but

Prelude> succ (2 ^ 128 :: Integer)
340282366920938463463374607431768211457
Prelude> toEnum (fromEnum (2 ^ 128 :: Integer) + 1) :: Integer
1

How about Monad of no return - that is the same issue as this current proposal just that the cost is higher.

Almost, except return has to be pure. (And mappend has to be <>).

Why then is even : Integral a => a -> Bool not defined in the type class, which has a faster implementation for binary representation than for other representations?

It could. But there is Bits class though.

I don't know that anyone has ever defined their own version of *>...

https://hackage.haskell.org/package/tagged-1.8.6.1/docs/src/Data.Tagged.html#line-250

instance Applicative (Tagged s) where
    pure = Tagged
    {-# INLINE pure #-}
    Tagged f <*> Tagged x = Tagged (f x)
    {-# INLINE (<*>) #-}
    _ *> n = n
    {-# INLINE (*>) #-}
treeowl commented 3 years ago

@santiweight, Data.Sequence has custom implementations for both *> and <* that produce asymptotically more compact result structures than the default definitions.

As for Enum: it's a disaster of a class, and I for one would welcome a complete replacement.

tomjaguarpaw commented 3 years ago

@santiweight Ah, I think I understand. You are saying that the CLC should elaborate some general principles around the design of type classes in base. Then we should systematically move towards fulfilling those principles. This particular proposal would become merely an artefact of that process. I am inclined to agree. Rather than having a one-off discussion about whether removing (/=) pulls its weight, where analogies to (<=) are buried, let's hash out the general principles in one go. Sounds appealing to me!

santiweight commented 3 years ago

Thanks guys. Again the feedback is very appreciated and I promise very well-received. I'll just reiterate the point that I am pointing out that this "small" proposal should come with fixing all of these issues under a global base audit instead of the whack-a-mole.

I promise for example, that when Enum gets its own proposal, you will all (and some other newbie like myself) waste your time explaining the subtleties of each typeclass and the design decisions that are pretty obviously non-trivial. If people want to accept this proposal, I hesitate for the other reasons I described above regarding compatibility and frequency of changes.

But this conversation we just had will be had again if we don't address the larger issue of not having a centralised idea behind what the actual fuck base is meant to be! And what the hell amount of breakage is permitted and for how much gain! It's not clear to me at all!

@tomjaguarpaw That is exactly my point.

treeowl commented 3 years ago

Control.Monad.ST.Lazy also has a custom *> for performance reasons, as I recall—lots of things there can't inline because of delicate concurrency issues.

tomjaguarpaw commented 3 years ago

(Off-topic: @treeowl It's absolutely essential for almost every Applicative to have a specialised *> implementation, in order to avoid space leaks.)

santiweight commented 3 years ago

I've proposed a separate discussion so that we can remove this high-level discussion out of this precise and well-defined change. Sorry for the hijack @nomeata!

nomeata commented 3 years ago

The point being, we shouldn't accept that these proposals would each require more files to have CPP

This proposal will not require CPP: the changes this would force libraries to do are fully backwards compatible with previous versiond of base. And in fact, library authors could (should?) do them already now - just don't instantiate (/=) (and If you import it explicitly, do it on its own, not as part of Eq((/=))).

Thanks for asking for the underlying principle! That's a good point. I think it might be “don't have methods with default implementation if there are no good reasons to ever instantiate them explicitly, to reduce unnecessary cognitive load, i.e. don't even bother users of the type class with making that decision”

tomjaguarpaw commented 3 years ago

I've proposed a separate discussion so that we can remove this high-level discussion out of this precise and well-defined change

... and it can be found at https://github.com/haskell/core-libraries-committee/issues/12

Bodigrim commented 3 years ago

As a point of order, I would like to urge participants to discuss the proposal at hand, not other subjects, however appealing they are.

NorfairKing commented 3 years ago

:+1: if you can explain how maintainers will be helped to deal with the breakage. :-1: if it's just considered "their job".

phadej commented 3 years ago

I have run nofib perf tests using current master and Joachim's single-method-eq branch

master is vanilla master master2, single-eq, and single-eq2

are run with following patch

diff --git a/real/fulsom/Interval.hs b/real/fulsom/Interval.hs
index e0fa79c..94b32b2 100644
--- a/real/fulsom/Interval.hs
+++ b/real/fulsom/Interval.hs
@@ -33,7 +33,7 @@ a # b = a :#: b

 instance (Ord a, Eq a) => Eq (Interval a) where
   a == b        = a >= b && a <= b  -- Not correct - but it will do.
-  a /= b        = a >  b || a <  b
+--  a /= b        = a >  b || a <  b

 instance (Ord a) => Ord (Interval a) where
diff --git a/real/hidden/Numbers.hs b/real/hidden/Numbers.hs
index 9059a00..42ffa01 100644
--- a/real/hidden/Numbers.hs
+++ b/real/hidden/Numbers.hs
@@ -6,7 +6,7 @@ data Number = Tolerant Float
 -}
 instance Eq Number where
    Tolerant a == Tolerant b    = abs(a-b) < eps
-   Tolerant a /= Tolerant b    = abs(a-b) > eps
+-- Tolerant a /= Tolerant b    = abs(a-b) > eps
 instance Ord Number where
    Tolerant a <= Tolerant b    = a-eps < b
    Tolerant a < Tolerant b     = a < b-eps

Highlights:

|                               ||  |     master/ | std. err. | master2/ (rel) | std. err. | single-eq/ (rel) | std. err. | single-eq2/ (rel) | std. err. |
+===============================++==+=============+===========+================+===========+==================+===========+===================+===========+
|        imaginary/digits-of-e1 ||  |     4.898e9 |      0.0% |         +0.11% |      0.0% |          +12.93% |      0.0% |           +12.85% |      0.0% |
|                  real/eff/VSD ||  |     6.791e7 |      0.0% |        +24.00% |      0.0% |          +28.52% |      0.0% |           +24.00% |      0.0% |
|               smp/callback001 ||  |     7.371e9 |      0.0% |       +171.92% |      0.0% |          -77.81% |      0.0% |           -78.11% |      0.0% |
|               spectral/puzzle ||  |     6.600e9 |      0.0% |         -0.26% |      0.0% |          -13.37% |      0.0% |           -13.70% |      0.0% |
+===============================++==+=============+===========+================+===========+==================+===========+===================+===========+
|                     geom mean ||  |             |    +0.73% |         -0.50% |    -0.70% |                  |           |                   |           |

I haven't looked at these programs, whether there's something obvious why there is such increase or decrease in instruction count.

This Normal nofib run also suggests that whether Eq is a single method class or not is not performance critical decision.

Full report:

``` # perf instructions +-------------------------------++--+-------------+-----------+----------------+-----------+------------------+-----------+-------------------+-----------+ | || | master/ | std. err. | master2/ (rel) | std. err. | single-eq/ (rel) | std. err. | single-eq2/ (rel) | std. err. | +===============================++==+=============+===========+================+===========+==================+===========+===================+===========+ | imaginary/bernouilli || | 7.174e9 | 0.0% | +0.31% | 0.0% | +0.17% | 0.0% | +0.38% | 0.0% | | imaginary/digits-of-e1 || | 4.898e9 | 0.0% | +0.11% | 0.0% | +12.93% | 0.0% | +12.85% | 0.0% | | imaginary/digits-of-e2 || | 5.683e9 | 0.0% | -0.02% | 0.0% | +0.68% | 0.0% | +0.28% | 0.0% | | imaginary/exp3_8 || | 8.782e9 | 0.0% | -0.01% | 0.0% | +0.04% | 0.0% | -0.11% | 0.0% | | imaginary/gen_regexps || | 8.644e9 | 0.0% | +0.05% | 0.0% | -0.35% | 0.0% | -0.02% | 0.0% | | imaginary/integrate || | 6.050e9 | 0.0% | -0.31% | 0.0% | +7.08% | 0.0% | +7.03% | 0.0% | | imaginary/kahan || | 6.697e9 | 0.0% | +0.15% | 0.0% | +0.36% | 0.0% | +0.23% | 0.0% | | imaginary/paraffins || | 8.110e9 | 0.0% | -0.44% | 0.0% | -0.27% | 0.0% | -0.09% | 0.0% | | imaginary/primes || | 8.238e9 | 0.0% | +0.28% | 0.0% | +0.41% | 0.0% | +0.19% | 0.0% | | imaginary/queens || | 1.010e10 | 0.0% | -0.12% | 0.0% | -0.25% | 0.0% | -0.09% | 0.0% | | imaginary/rfib || | 5.300e9 | 0.0% | +0.03% | 0.0% | -0.02% | 0.0% | +0.02% | 0.0% | | imaginary/tak || | 5.859e9 | 0.0% | +0.10% | 0.0% | +0.20% | 0.0% | +0.18% | 0.0% | | imaginary/wheel-sieve1 || | 6.609e9 | 0.0% | -0.22% | 0.0% | +0.50% | 0.0% | -0.30% | 0.0% | | imaginary/wheel-sieve2 || | 6.321e9 | 0.0% | -0.60% | 0.0% | -1.03% | 0.0% | -0.23% | 0.0% | | imaginary/x2n1 || | 6.895e9 | 0.0% | -0.22% | 0.0% | -0.49% | 0.0% | +0.23% | 0.0% | | parallel/blackscholes || | 7.941e9 | 0.0% | -0.20% | 0.0% | -0.27% | 0.0% | -0.06% | 0.0% | | parallel/coins || | 1.196e10 | 0.0% | +0.47% | 0.0% | +1.49% | 0.0% | +0.68% | 0.0% | | parallel/gray || | 6.167e9 | 0.0% | -0.76% | 0.0% | -1.07% | 0.0% | -1.09% | 0.0% | | parallel/mandel || | 3.031e10 | 0.0% | +0.20% | 0.0% | +8.62% | 0.0% | +8.67% | 0.0% | | parallel/matmult || | 1.129e10 | 0.0% | -0.46% | 0.0% | -0.50% | 0.0% | -0.95% | 0.0% | | parallel/minimax || | 3.861e10 | 0.0% | -0.04% | 0.0% | -0.02% | 0.0% | -0.07% | 0.0% | | parallel/nbody || | 7.892e8 | 0.0% | +1.76% | 0.0% | +0.38% | 0.0% | +0.19% | 0.0% | | parallel/parfib || | 2.232e10 | 0.0% | -0.01% | 0.0% | -0.10% | 0.0% | -0.01% | 0.0% | | parallel/partree || | 7.399e9 | 0.0% | -0.78% | 0.0% | -0.95% | 0.0% | -0.57% | 0.0% | | parallel/prsa || | 1.765e10 | 0.0% | +0.08% | 0.0% | +3.16% | 0.0% | +3.21% | 0.0% | | parallel/queens || | 1.037e10 | 0.0% | +0.06% | 0.0% | +0.22% | 0.0% | +0.02% | 0.0% | | parallel/ray || | 1.298e10 | 0.0% | +0.17% | 0.0% | +0.02% | 0.0% | +0.01% | 0.0% | | parallel/sumeuler || | 3.511e9 | 0.0% | -0.64% | 0.0% | -0.69% | 0.0% | -0.51% | 0.0% | | parallel/transclos || | 1.992e10 | 0.0% | -0.38% | 0.0% | +0.37% | 0.0% | +0.19% | 0.0% | | real/anna || | 5.384e9 | 0.0% | -0.76% | 0.0% | -0.19% | 0.0% | -0.55% | 0.0% | | real/ben-raytrace || | 1.188e11 | 0.0% | +0.13% | 0.0% | +0.13% | 0.0% | +0.11% | 0.0% | | real/bspt || | 9.249e9 | 0.0% | -0.29% | 0.0% | -0.51% | 0.0% | -0.37% | 0.0% | | real/cacheprof || | 6.802e9 | 0.0% | -0.22% | 0.0% | -0.37% | 0.0% | -0.83% | 0.0% | | real/compress || | 5.822e9 | 0.0% | -0.29% | 0.0% | +0.03% | 0.0% | -0.06% | 0.0% | | real/compress2 || | 4.564e9 | 0.0% | +0.34% | 0.0% | +0.19% | 0.0% | +0.38% | 0.0% | | real/eff/CS || | 7.644e8 | 0.0% | -1.44% | 0.0% | -0.29% | 0.0% | -0.08% | 0.0% | | real/eff/CSD || | 3.775e9 | 0.0% | -0.99% | 0.0% | -0.84% | 0.0% | -0.93% | 0.0% | | real/eff/FS || | 2.809e9 | 0.0% | -0.45% | 0.0% | -0.73% | 0.0% | +0.22% | 0.0% | | real/eff/S || | 4.799e9 | 0.0% | -0.60% | 0.0% | +0.93% | 0.0% | -1.54% | 0.0% | | real/eff/VS || | 2.306e9 | 0.0% | -1.01% | 0.0% | +0.23% | 0.0% | -1.36% | 0.0% | | real/eff/VSD || | 6.791e7 | 0.0% | +24.00% | 0.0% | +28.52% | 0.0% | +24.00% | 0.0% | | real/eff/VSM || | 9.673e8 | 0.0% | +0.52% | 0.0% | -0.29% | 0.0% | +0.23% | 0.0% | | real/fem || | 6.549e9 | 0.0% | -0.84% | 0.0% | +1.28% | 0.0% | +1.27% | 0.0% | | real/fluid || | 3.939e9 | 0.0% | +0.09% | 0.0% | +0.44% | 0.0% | +0.11% | 0.0% | | real/fulsom || | 2.514e9 | 0.0% | -0.42% | 0.0% | -0.37% | 0.0% | -0.84% | 0.0% | | real/gamteb || | 9.663e9 | 0.0% | -0.45% | 0.0% | +4.33% | 0.0% | +4.29% | 0.0% | | real/gg || | 7.179e9 | 0.0% | -0.06% | 0.0% | -0.38% | 0.0% | -1.36% | 0.0% | | real/grep || | 7.578e9 | 0.0% | -0.19% | 0.0% | -0.02% | 0.0% | -0.37% | 0.0% | | real/hidden || | 7.205e9 | 0.0% | +0.08% | 0.0% | +0.33% | 0.0% | +0.18% | 0.0% | | real/hpg || | 4.948e9 | 0.0% | -0.57% | 0.0% | -0.10% | 0.0% | -0.79% | 0.0% | | real/infer || | 8.212e9 | 0.0% | +0.72% | 0.0% | +3.02% | 0.0% | +1.27% | 0.0% | | real/lift || | 4.430e9 | 0.0% | -0.27% | 0.0% | +0.08% | 0.0% | +0.00% | 0.0% | | real/linear || | 8.463e9 | 0.0% | -0.35% | 0.0% | -0.48% | 0.0% | -0.27% | 0.0% | | real/maillist || | 3.933e9 | 0.0% | +2.76% | 0.0% | -0.14% | 0.0% | +8.16% | 0.0% | | real/mkhprog || | 3.035e7 | 0.0% | -3.95% | 0.0% | +3.30% | 0.0% | -3.37% | 0.0% | | real/parser || | 5.201e9 | 0.0% | -0.62% | 0.0% | -1.04% | 0.0% | -0.98% | 0.0% | | real/pic || | 3.259e9 | 0.0% | +0.03% | 0.0% | +0.76% | 0.0% | +0.22% | 0.0% | | real/prolog || | 5.697e9 | 0.0% | -0.21% | 0.0% | +0.35% | 0.0% | +0.45% | 0.0% | | real/reptile || | 4.984e9 | 0.0% | -0.95% | 0.0% | -1.03% | 0.0% | -0.54% | 0.0% | | real/rsa || | 7.033e9 | 0.0% | -0.34% | 0.0% | +2.87% | 0.0% | +3.03% | 0.0% | | real/scs || | 6.103e9 | 0.0% | -0.42% | 0.0% | +1.59% | 0.0% | +1.46% | 0.0% | | real/smallpt || | 9.707e9 | 0.0% | +0.08% | 0.0% | +0.07% | 0.0% | +0.14% | 0.0% | | real/symalg || | 8.238e9 | 0.0% | -0.10% | 0.0% | -0.08% | 0.0% | +0.03% | 0.0% | | real/veritas || | 4.852e9 | 0.0% | -0.27% | 0.0% | -0.14% | 0.0% | -0.02% | 0.0% | | shootout/binary-trees || | 1.044e10 | 0.0% | +0.16% | 0.0% | +0.09% | 0.0% | +0.25% | 0.0% | | shootout/fannkuch-redux || | 1.905e10 | 0.0% | +0.11% | 0.0% | -0.10% | 0.0% | +0.06% | 0.0% | | shootout/fasta || | 4.270e9 | 0.0% | -0.18% | 0.0% | +0.23% | 0.0% | +0.58% | 0.0% | | shootout/k-nucleotide || | 4.272e9 | 0.0% | -1.49% | 0.0% | +0.19% | 0.0% | -0.93% | 0.0% | | shootout/n-body || | 3.712e9 | 0.0% | -0.03% | 0.0% | +0.07% | 0.0% | -0.11% | 0.0% | | shootout/pidigits || | 8.102e9 | 0.0% | +0.17% | 0.0% | -0.31% | 0.0% | -0.13% | 0.0% | | shootout/reverse-complement || | 4302843.000 | 0.0% | +0.94% | 0.0% | -1.45% | 0.0% | -2.32% | 0.0% | | shootout/spectral-norm || | 4.236e9 | 0.0% | +0.34% | 0.0% | +1.05% | 0.0% | +0.28% | 0.0% | | smp/callback001 || | 7.371e9 | 0.0% | +171.92% | 0.0% | -77.81% | 0.0% | -78.11% | 0.0% | | smp/callback002 || | 2.641e9 | 0.0% | -0.74% | 0.0% | -0.86% | 0.0% | -0.66% | 0.0% | | smp/chan || | 7.849e9 | 0.0% | -2.74% | 0.0% | -0.13% | 0.0% | -1.64% | 0.0% | | smp/sieve || | 2.346e9 | 0.0% | +1.25% | 0.0% | +2.21% | 0.0% | +0.26% | 0.0% | | smp/threads001 || | 5.220e9 | 0.0% | -0.32% | 0.0% | +0.24% | 0.0% | -0.19% | 0.0% | | smp/threads003 || | 2.903e9 | 0.0% | +0.99% | 0.0% | +2.82% | 0.0% | +1.18% | 0.0% | | smp/threads006 || | 1.106e9 | 0.0% | -1.24% | 0.0% | +1.04% | 0.0% | -1.59% | 0.0% | | smp/threads007 || | 4.372e9 | 0.0% | +1.01% | 0.0% | +2.24% | 0.0% | +2.18% | 0.0% | | spectral/ansi || | 5.976e9 | 0.0% | -0.17% | 0.0% | -0.28% | 0.0% | -0.42% | 0.0% | | spectral/atom || | 5.979e9 | 0.0% | -0.51% | 0.0% | -1.40% | 0.0% | -2.10% | 0.0% | | spectral/awards || | 8.409e9 | 0.0% | +0.23% | 0.0% | +3.54% | 0.0% | +3.60% | 0.0% | | spectral/banner || | 8.550e9 | 0.0% | -0.15% | 0.0% | +0.00% | 0.0% | -0.10% | 0.0% | | spectral/boyer || | 7.664e9 | 0.0% | +0.20% | 0.0% | -0.04% | 0.0% | -0.11% | 0.0% | | spectral/boyer2 || | 9.231e9 | 0.0% | +0.11% | 0.0% | +0.07% | 0.0% | +0.13% | 0.0% | | spectral/calendar || | 7.483e9 | 0.0% | -0.08% | 0.0% | +0.00% | 0.0% | +0.34% | 0.0% | | spectral/cichelli || | 8.724e9 | 0.0% | -0.29% | 0.0% | -0.41% | 0.0% | -0.44% | 0.0% | | spectral/circsim || | 6.953e9 | 0.0% | -0.60% | 0.0% | -0.17% | 0.0% | -0.58% | 0.0% | | spectral/clausify || | 6.223e9 | 0.0% | -0.06% | 0.0% | -0.32% | 0.0% | -0.18% | 0.0% | | spectral/constraints || | 8.435e9 | 0.0% | -1.18% | 0.0% | -1.28% | 0.0% | -1.12% | 0.0% | | spectral/cryptarithm1 || | 7.827e9 | 0.0% | -0.47% | 0.0% | -0.36% | 0.0% | -0.67% | 0.0% | | spectral/cryptarithm2 || | 6.766e9 | 0.0% | -0.06% | 0.0% | -0.07% | 0.0% | +0.01% | 0.0% | | spectral/cse || | 6.373e9 | 0.0% | -0.09% | 0.0% | -0.25% | 0.0% | -0.04% | 0.0% | | spectral/dom-lt || | 7.741e9 | 0.0% | +0.00% | 0.0% | +0.73% | 0.0% | +0.40% | 0.0% | | spectral/eliza || | 8.256e9 | 0.0% | +0.18% | 0.0% | +1.53% | 0.0% | +1.42% | 0.0% | | spectral/exact-reals || | 5.803e9 | 0.0% | -0.09% | 0.0% | -2.60% | 0.0% | -2.98% | 0.0% | | spectral/expert || | 5.080e9 | 0.0% | +0.13% | 0.0% | -0.11% | 0.0% | +0.28% | 0.0% | | spectral/fft2 || | 8.721e9 | 0.0% | +0.19% | 0.0% | +0.15% | 0.0% | +0.78% | 0.0% | | spectral/fibheaps || | 8.177e9 | 0.0% | -0.19% | 0.0% | -0.03% | 0.0% | -0.35% | 0.0% | | spectral/fish || | 7.197e9 | 0.0% | +0.21% | 0.0% | +0.31% | 0.0% | +0.33% | 0.0% | | spectral/gcd || | 6.130e9 | 0.0% | +0.19% | 0.0% | +4.24% | 0.0% | +4.09% | 0.0% | | spectral/hartel/comp_lab_zift || | 8.763e9 | 0.0% | -0.70% | 0.0% | +0.53% | 0.0% | -0.46% | 0.0% | | spectral/hartel/event || | 8.990e9 | 0.0% | -0.31% | 0.0% | -0.57% | 0.0% | -0.14% | 0.0% | | spectral/hartel/fft || | 3.273e9 | 0.0% | -0.30% | 0.0% | -1.13% | 0.0% | -1.01% | 0.0% | | spectral/hartel/genfft || | 1.039e10 | 0.0% | -0.05% | 0.0% | +0.37% | 0.0% | -0.30% | 0.0% | | spectral/hartel/ida || | 5.411e9 | 0.0% | -0.09% | 0.0% | +0.21% | 0.0% | +0.02% | 0.0% | | spectral/hartel/listcompr || | 6.129e9 | 0.0% | +0.14% | 0.0% | +0.37% | 0.0% | +0.33% | 0.0% | | spectral/hartel/listcopy || | 6.774e9 | 0.0% | -0.23% | 0.0% | -0.36% | 0.0% | -0.09% | 0.0% | | spectral/hartel/nucleic2 || | 4.396e9 | 0.0% | +0.33% | 0.0% | +0.89% | 0.0% | +0.39% | 0.0% | | spectral/hartel/parstof || | 5.544e9 | 0.0% | -0.14% | 0.0% | -0.33% | 0.0% | +0.01% | 0.0% | | spectral/hartel/sched || | 5.799e9 | 0.0% | +0.20% | 0.0% | +0.14% | 0.0% | +0.05% | 0.0% | | spectral/hartel/solid || | 5.859e9 | 0.0% | +0.08% | 0.0% | -0.25% | 0.0% | -0.01% | 0.0% | | spectral/hartel/transform || | 5.743e9 | 0.0% | -0.63% | 0.0% | -0.47% | 0.0% | -0.55% | 0.0% | | spectral/hartel/typecheck || | 5.792e9 | 0.0% | -0.01% | 0.0% | +0.13% | 0.0% | +0.11% | 0.0% | | spectral/hartel/wang || | 8.732e9 | 0.0% | -0.62% | 0.0% | +0.76% | 0.0% | -0.13% | 0.0% | | spectral/hartel/wave4main || | 6.057e9 | 0.0% | -0.23% | 0.0% | -0.14% | 0.0% | -0.02% | 0.0% | | spectral/integer || | 6.965e9 | 0.0% | -0.01% | 0.0% | +1.57% | 0.0% | +1.31% | 0.0% | | spectral/knights || | 8.300e9 | 0.0% | -0.17% | 0.0% | -0.16% | 0.0% | -0.52% | 0.0% | | spectral/lambda || | 8.087e9 | 0.0% | -0.36% | 0.0% | +0.80% | 0.0% | -0.29% | 0.0% | | spectral/last-piece || | 1.889e9 | 0.0% | -0.85% | 0.0% | -0.72% | 0.0% | -1.27% | 0.0% | | spectral/lcss || | 8.744e9 | 0.0% | +0.30% | 0.0% | +0.46% | 0.0% | +0.16% | 0.0% | | spectral/life || | 9.137e9 | 0.0% | -0.15% | 0.0% | -0.10% | 0.0% | +0.02% | 0.0% | | spectral/mandel || | 6.444e9 | 0.0% | -0.10% | 0.0% | +8.99% | 0.0% | +9.37% | 0.0% | | spectral/mandel2 || | 3.935e9 | 0.0% | -0.26% | 0.0% | +0.45% | 0.0% | +0.80% | 0.0% | | spectral/mate || | 7.576e9 | 0.0% | +0.26% | 0.0% | +0.05% | 0.0% | +0.39% | 0.0% | | spectral/minimax || | 8.625e9 | 0.0% | +0.15% | 0.0% | +0.38% | 0.0% | +0.49% | 0.0% | | spectral/multiplier || | 5.977e9 | 0.0% | +0.06% | 0.0% | +0.22% | 0.0% | +0.05% | 0.0% | | spectral/para || | 6.609e9 | 0.0% | -0.29% | 0.0% | -0.35% | 0.0% | -0.48% | 0.0% | | spectral/power || | 6.333e9 | 0.0% | -0.16% | 0.0% | +6.44% | 0.0% | +6.46% | 0.0% | | spectral/pretty || | 4452368.000 | 0.0% | +2.04% | 0.0% | +0.89% | 0.0% | -0.65% | 0.0% | | spectral/primetest || | 7.359e9 | 0.0% | +0.15% | 0.0% | +0.92% | 0.0% | +0.89% | 0.0% | | spectral/puzzle || | 6.600e9 | 0.0% | -0.26% | 0.0% | -13.37% | 0.0% | -13.70% | 0.0% | | spectral/rewrite || | 5.556e9 | 0.0% | -0.07% | 0.0% | -0.41% | 0.0% | -0.38% | 0.0% | | spectral/scc || | 4382747.000 | 0.0% | -0.51% | 0.0% | -3.04% | 0.0% | -1.84% | 0.0% | | spectral/simple || | 1.009e10 | 0.0% | -0.20% | 0.0% | +0.94% | 0.0% | +1.01% | 0.0% | | spectral/sorting || | 9.068e9 | 0.0% | -0.60% | 0.0% | +0.06% | 0.0% | -0.34% | 0.0% | | spectral/sphere || | 5.342e9 | 0.0% | +0.52% | 0.0% | +0.38% | 0.0% | +0.57% | 0.0% | | spectral/treejoin || | 9.865e9 | 0.0% | -0.04% | 0.0% | +0.13% | 0.0% | -0.65% | 0.0% | +===============================++==+=============+===========+================+===========+==================+===========+===================+===========+ | geom mean || | | +0.73% | -0.50% | -0.70% | | | | | +-------------------------------++--+-------------+-----------+----------------+-----------+------------------+-----------+-------------------+-----------+ ```
nomeata commented 3 years ago

Thanks for running that! Yeah, the joy of benchmarking… The code of smp/callback001 is unchanged between master andmaster2`, right? That seems to indicate that something is seriously wrong with the result :-(

phadej commented 3 years ago

@nomeata

The code of smp/callback001 is unchanged between master and master2

Well, the RTS and base are changed, but the benchmark itself no.

The benchmark begins with:

{-# LANGUAGE ForeignFunctionInterface #-}
-- This benchmark is also ffi014 in the test suite.

-- This program behaves unpredictably with the non-threaded RTS,
-- because depending on when the context switches happen it might end
-- up building a deep stack of callbacks.  When this happens, the run
-- queue gets full of threads that have finished but cannot exit
-- because they do not belong to the topmost call to schedule(), and
-- the scheduler repeatedly traverses the run queue full of these
-- zombie threads.

so that is maybe what happened.

christiaanb commented 3 years ago

So I originally thought the combination of

  1. Not being able inline \x y -> not (x == y) in the subject of the case-expression
  2. (In)Equality for primitive types would

Would result into a performance degradation, because the not can neither be absorbed into the equality, nor can it be absorbed into the case-expression (by switching the alternatives around). i.e. the case-expression corresponding to not would remain.

So I created the following to test my hypothesis:

{-# OPTIONS_GHC -O -ddump-simpl -ddump-to-file #-}
import Criterion
import Criterion.Main
import Data.Word

filterX :: (a -> Bool) -> [a] -> [a]
filterX _pred []    = []
filterX pred (x:xs)
  | pred x         = x : filterX pred xs
  | otherwise      = filterX pred xs
{-# NOINLINE filterX #-}

eq1 :: [Word16] -> Int
eq1 (x:xs) = length (filterX (/= x) xs)
{-# NOINLINE eq1 #-}

eq2 :: [Word16] -> Int
eq2 (x:xs) = length (filterX (not . (== x)) xs)
{-# NOINLINE eq2 #-}

eq1Bench :: Benchmark
eq1Bench = env setup $ \m ->
  bench "(/= x)" (nf eq1 m)
  where
    setup = return [0 .. 4000]

eq2Bench :: Benchmark
eq2Bench = env setup $ \m ->
  bench "(not . (== x))" (nf eq2 m)
  where
    setup = return [0 .. 4000]

main :: IO ()
main =
  defaultMain
    [ eq1Bench
    , eq2Bench
    ]

And sure enough if we look at the core we see the case-expression corresponding to not isn't optimized away in the worker for eq2 The case expression corresponding to the not will however be inlined into the equality operator for the lifted Word16 arguments, and we see that the branches of the case expressions in the worker for eq2 are switched around:

Main.$weq1
  = \ (w_s4Z9 :: [Word16]) ->
      case w_s4Z9 of {
        [] -> case lvl1_r51e of wild1_00 { };
        : x_a1HD xs_a1HE ->
          GHC.List.$wlenAcc
            @ Word16
            (filterX_r1jq
               @ Word16
               (\ (ds_d36N :: Word16) -> GHC.Word.neWord16 ds_d36N x_a1HD)
               xs_a1HE)
            0#
      }

Main.$weq2
  = \ (w_s4Ze :: [Word16]) ->
      case w_s4Ze of {
        [] -> case lvl3_r51g of wild1_00 { };
        : x_a2CH xs_a2CI ->
          GHC.List.$wlenAcc
            @ Word16
            (filterX_r1jq
               @ Word16
               (\ (x1_a3er :: Word16) ->
                  case x1_a3er of { GHC.Word.W16# x2_a3i6 ->
                  case x_a2CH of { GHC.Word.W16# y_a3i9 ->
                  case GHC.Prim.eqWord# x2_a3i6 y_a3i9 of {
                    __DEFAULT -> GHC.Types.True;
                    1# -> GHC.Types.False
                  }
                  }
                  })
               xs_a2CI)
            0#
      }

However, actually running the benchmark gave the following surprising result:

benchmarking (/= x)
time                 29.01 μs   (28.95 μs .. 29.07 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 29.10 μs   (29.07 μs .. 29.15 μs)
std dev              131.3 ns   (96.10 ns .. 191.5 ns)

benchmarking (not . (== x))
time                 26.85 μs   (26.81 μs .. 26.90 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 26.85 μs   (26.83 μs .. 26.88 μs)
std dev              76.05 ns   (47.50 ns .. 117.4 ns)

The version with the to eqWord# is faster than the single-call to the neWord16. Perhaps reading the CMM can give additional insights as to why the neWord16 version is slower, but I personally can't read CMM well. I'm also surprised neWord16 isn't inlined, even though it's saturated and there being a {-# INLINE [1] neWord16 #-} pragma.