Closed MangoIV closed 1 year ago
Why not ($) = id
?
I thought it would be nicer not to rely on the properties of id
, e.g. if we wanted to change ($)
to be multiplicity polymorphic, we would first have to change id
to be. Also, same number of chars, so, we neither win nor lose anything.
Why not
($) = id
?
I remember discussion why we don't have ($) :: a -> a; ($) = \x -> x
. That would solve plenty of problems (e.g. it becomes multiplicity polymorphic as well, i.e. maps ordinary functions to ordinary functions and linear functions to linear functions). https://github.com/ghc-proposals/ghc-proposals/pull/274#issuecomment-529894345 - And I actually don't buy an argument that having ($)
with "restricted" type helps type-inference/errors that much.
And I actually don't buy an argument that having ($) with "restricted" type helps type-inference/errors that much.
it might be worth considering that there might be beginners who do not immediately understand how ($) :: a -> a
can be specialized to the current type of ($)
. Given that the implementer of the current ($)
neither seemed to have realized this, that doesn't seem to be completely absurd.
it might be worth considering that there might be beginners who do not immediately understand how ($) :: a -> a can be specialized to the current type of
Beginner argument can be used to argue against you as well: ($)
type should stay at (be reverted to) forall (a b :: Type). (a -> b) -> (a -> b)
type. If you need fancier representation, multiplicity, etc generalization, define your own.
E.g. in haddocks it is currently displayed in its full glory
Terrible for beginners, and you want to make it even worse!
E.g. in haddocks it is currently displayed in its full glory
Right, I’d really consider this a problem in the UI of haddock, tbh, given that :t doesn’t yield the most general type without flags, but I get your point.
To be honest, the haddocks for ($)
are already not great even outside of the displayed type signature. It's flatly wrong and confusing to say that ($)
is redundant.
@dixonary please ping me when you open a merge request to improve these docs, I'll review it :)
It's flatly wrong and confusing to say that
($)
is redundant
Right, one may as well say any function is redundant, since you can replace it with its body.
I very much like ($) :: a -> a
. One thing to look out for is how this change will affect arity analysis. Today, (f $)
is analysed as having an arity of at least 1, even if nothing is known about f
. If ($) = id
, then (f $) = f
. That's sure to be an improvement in many cases, but might it be worse in others?
@dixonary documentation ticket opened at https://gitlab.haskell.org/ghc/ghc/-/issues/22963
I very much like
($) :: a -> a
. One thing to look out for is how this change will affect arity analysis. Today,(f $)
is analysed as having an arity of at least 1, even if nothing is known aboutf
. If($) = id
, then(f $) = f
. That's sure to be an improvement in many cases, but might it be worse in others?
Would that break something like ($ 5) <$> [(5 +), (5 -), (5 *)]
? Or does that just get reduced to (\x -> ($) x 5)
?
@Vlix It wouldn't break anything (aside from potential type inference issues). The question is how it could affect performance. It won't change performance in your example.
aside from potential type inference issues
Seems like a huge footnote.
Wrt the actual proposal, can we get the input from @goldfirere since he's been involved in that?
I also find there's lack of motivation. The discourse thread links to a theoretical code example.
However, that's not really sufficient motivation, IMO. Do users use existing workarounds (such as re-defining ($)
) right now due to this issue? What is the actual value, regardless of "we could have been more general"?
Even if it doesn't break anything and has zero performance impact... it will change the type signature of one of the most used functions in Haskell... again. This shouldn't be something we regularly iterate over.
I run into this issue in a non+theoretical example recently.
The definition of ResultType
in ghc
was changed to be unlifted type (using UnliftedNewtypes
and UnboxedSums
), the pattern synonyms made it look almost like the previous definition. However I had to change
-fromParseResult dflags $ parseFile (toFilePath modpath) dflags contents'
+fromParseResult dflags (parseFile (toFilePath modpath) dflags contents')
Yes, there's an easy work-around, but I did wonder why the previous definition just not work out of the box.
This shouldn't be something we regularly iterate over.
That's why change to ($) :: a -> a
is tempting!
I also find there's lack of motivation. The discourse thread links to a theoretical code example.
@hasufell its not purely theoretic (although I guess I should make a better example), I encountered this issue when I tried to pass a Proxy# instead of a Proxy and use ($)
for that. So the motivation is not theoretical but really practical. The reason why I even thought about it is because of something I need not kindchecking.
Wrt changing it to a -> a
, I just have to reiterate, it's tempting but I think it wouldn't solve as many issues as @phadej is saying at the cost of being really confusing to newcomers. E.g. it would now allow to pass multiplicity polymorphic functions but it would itself not be multiplicity polymorphic. I'd say my proposal is a lot less intrusive than that. I understand that it's annoying that we'd have to change the implementation of ($)
but I think the signature as it is now is a bug, it's inconsistent and it's unnecessary and it can cause unintuitive issues.
Also wrt @phadej's point, consider that we, after all, even have monadic versions of some functions that only need Applicative, I think this falls in the same range - although the simplification would be possible, it might cause confusion so we keep it.
I yesterday built GHC 9.4.4 with this change, I've yet to try it out on the clc-hackage
though, as it doesn't support ghc 9.4.4 yet (https://github.com/Bodigrim/clc-stackage/issues/6).
even have monadic versions of some functions that only need Applicative
This is not as simple as that.
liftM2
is defined using >>=
, which is different than using <$>
and <*>
; for some Monad
s that makes a difference.mapM
is a method of Traversable
, as there can be more efficient implementation (e.g. Vector
)OTOH, e.g. forever
was generalized to use Applicative
instead of Monad
, as having >>
as something else than *>
is probably not a good idea.
($) :: a -> a
serves the purpose of $ has low, right-associative binding precedence, so it sometimes allows parentheses to be omitted as well.
Instead of thinking of f $ ...
like "f applied to ...", you may think it as id f (...)
(which is f (...)
) but in infix form to omit parentheses.
As this was requested by @hasufell, here is the actual motivation for this and the reasoning summarized. I will attach this to the original post.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE TypeFamilies #-}
{-# OPTIONS_GHC -Wall #-}
module Dollar where
import Data.Kind (Type)
import GHC.Exts (Proxy#, TYPE, proxy#)
import Prelude hiding (($))
-- allows @a@ and @b@ to be representation polymorphic
-- doesn't allow the arrow from @a@ to @b@ to be multiplicity
-- polymorphic
($)
:: forall
repa
repb
(a :: TYPE repa)
(b :: TYPE repb)
. (a -> b)
-> a
-> b
($) f = f
{-# INLINE ($) #-}
infixr 0 $
-- does allow for a function @a -> b@ to be multiplicity and
-- representation polymorphic, but is itself not multiplicity
-- polymorphic until we have something like @id :: forall m a. a %m -> a@
-- this consideration would go away if we'd implement it like
-- @\x -> x@ (same argument for why to use @($) f = f@ instead of
-- @($) = id@)
dollar :: forall (a :: Type). a -> a
dollar = id
{-# INLINE dollar #-}
infixr 0 `dollar`
-- for a motivation on why this is useful and how I encountered
-- this issue with @($)@, consider the following (more contrived,
-- but less complicated than the original) example
type family TF a
class C a where
ca :: Proxy# a -> TF a
-- this would err when using the @($)@ provided by Prelude, although
-- it should literally act as @`id`@
ca' :: forall {proxy} a. C a => proxy a -> TF a
ca' _ = ca $ proxy# @a
https://paste.sr.ht/~mangoiv/6fef7284c59202b63de7f26c962ad605b39f7b7f
evil idea to solve @phadej issues - how about additionally proposing to give id
an infixr 0
?
@MangoIV some people would prefer an application operator that's infixl 0
.
I usually find free-flowing discussions very fruitful, but I know that it can be intimidating / overwhelming for proposers. As a point of order, @MangoIV, you are not obliged to answer every suggestion / generalization / counterproposal.
I yesterday built GHC 9.4.4 with this change, I've yet to try it out on the
clc-hackage
though, as it doesn't support ghc 9.4.4 yet (https://github.com/Bodigrim/clc-stackage/issues/6).
The latter is not going to change by magic. In the meantime you can backport your changes to ghc-9.2
branch and give it a try.
However, that's not really sufficient motivation, IMO. Do users use existing workarounds (such as re-defining
($)
) right now due to this issue? What is the actual value, regardless of "we could have been more general"?
I work a lot with low-level primops and it is very annoying indeed that one cannot use ($)
with Int#
, Word#
, etc. I can vouch that the pain is real.
@Bodigrim , how much pain would this change alleviate? Wouldn't (.)
still fail to work with those things?
I work a lot with low-level primops and it is very annoying indeed that one cannot use ($) with Int#, Word#, etc. I can vouch that the pain is real.
What do you usually do then? Use parentheses or locally define an operator?
What do you usually do then? Use parentheses or locally define an operator?
I was too stupid to realise that I can redefine ($)
locally in a representation-polymorphic way, so I used parentheses.
I often have to code around this issue also. Sadly most of the Haskell 'base' library doesn't work well with this stuff, so my code is often very custom.
I don't think you'll see much code that can use this in the wild until you fix the problems because it makes it very expensive to write in Haskell and end up as a custom parallel micro-ecosystem in my experience.
@MangoIV how would you like to proceed? Please raise an MR with your preferred version of ($)
and complete impact assessment to learn if any packages are broken by increased polymorphism.
Hi, I'd like to make an impact assessment, but should I build that against ghc 927 or am I blocked until clc stackage has been upgraded to 944 or 961 respectively? If so, I guess I'd volunteer helping to fix it.
You can use GHC 9.2.7, I would not expect a material difference for this proposal.
I guess this is the "success" screen:
[1 of 1] Compiling Lib ( src/Lib.hs, /home/mangoiv/Devel/clc-stackage/dist-newstyle/build/x86_64-linux/ghc-9.2.7/clc-stackage-0.1.0.0/build/Lib.o, /home/mangoiv/Devel/clc-stackage/dist-newstyle/build/x86_64-linux/ghc-9.2.7/clc-stackage-0.1.0.0/build/Lib.dyn_o )
gcc: fatal error: cannot execute ‘/nix/store/qm03mh5x2j4ls0y80y8h3zsyj4hb52w2-gcc-11.3.0/libexec/gcc/x86_64-unknown-linux-gnu/11.3.0/collect2’: execv: Argument list too long
compilation terminated.
`cc' failed in phase `Linker'. (Exit code: 1)
How do I proceed?
Yes, that's a "success" :)
Could you please raise a GHC MR with your changes?
I often have to code around this issue also. Sadly most of the Haskell 'base' library doesn't work well with this stuff, so my code is often very custom.
I don't think you'll see much code that can use this in the wild until you fix the problems because it makes it very expensive to write in Haskell and end up as a custom parallel micro-ecosystem in my experience.
Seconded. I've felt this pain very much.
I have opened an MR as requested
Dear CLC members. We have all prerequisites for the vote: the proposal, the motivational example, the impact assessment (all green), and the MR. There are also multiple testimonials in this thread from potential users.
Let's keep the floor open until the end of the week and use it as an opportunity to polish the proposal if there are any more questions / suggestions.
This is making ($)
more strict: seq (($) undefined) ()
will now be _|_
instead of ()
. This can potentially break existing code but I don't know if it actually does.
@monoidal thank you for the note; I built the clc stackage against this change and it didn't report any failure so at least none of the libraries in there rely on this. Can you elaborate why this makes it more strict? I don't think I understand it, tbh, is it generally more strict or is it due to how undefined
gets typed in your example (it's itself representation polymorphic afaiu)?
If you have
f :: Int -> Int
f = f
then seq (newdollar f) () = seq f () = _|_
while seq (olddollar f) = seq (\x -> f x) () = ()
. Eta expansion changes strictness. This is not related to representation polymorphism of undefined
.
Granted, it makes little sense to write ($) f
or (f $)
since you can just write f
. But this is a proposal changing strictness of one of the most used functions, that has been there for more than 20 years. I am not saying we can't do that, but the bar should be high. The fact that nobody has pointed out this fact for a month makes me uneasy.
Also note that the proposal affects VTA, although I'm less concerned about this.
hmm, could we perhaps introduce this change for ($!)
then, if the change in strictness definitely affects the runtime semantics of the programs?
($!)
forces its second argument, which means it cannot be made representation polymorphic.
Ah curse. thanks!
Eta expansion changes strictness. This is not related to representation polymorphism of undefined.
Right. Now that you say it it's obvious and I feel dumb. Thank you for noticing. This changes it quite a bit. I wonder now why nothing breaks. Are all test suites run when building the clc stackage as prescribed?
@monoidal perhaps you know; if this were to break nothing, would it improve performance?
TIL: I was today-years-old when I learned that you can't define levity-polymorphic variables in Haskell 😮💨
Another potential breaking change if someone does something weird with inferring the type of ($)
during code generation, and this code does no longer work because the AST of ($)
has changed.
Still, I think the benefits of making ($)
levity-polymorphic outweigh the drawbacks of supporting one weird use-case nobody probably uses (and unless someone writes a good example of why you may want to write ($) undefined
, I'm not convinced it's worth it to support backwards compatibility in this edge case).
So as long as this is documented, I'm +1 on this proposal.
The ($)
operator always had a tragic story when it looks like a syntactic sugar that helps removing extra ()
, in practice it requires special handling in multiple several places (ApplicativeDo
, HLS, maybe Haddock, now this levity-polymorphism). I appreciate any proposal that helps dropping f $ x
instead of f (x)
and make it as invisible to the end user as possible.
On a separate note, the linked GHC User Guide page make it sound like it's impossible to define ($)
fully levity-polymorphic, however, as this proposal shows, it is indeed possible with a minor change of semantics. So I would encourage to update the GHC User Guide regardless.
I'd still like a comment from @goldfirere
I'd say it's a pretty major concern because it may break programs silently. You could introduce a bug to someone's program after an upgrade and they get no automated warning for that before deploying.
I'm fairly negative on this proposal because I think interested parties should try it in a putative unlifted-base
package first, but I'm having trouble thinking of a realistic example where seq (($) f) _ == _|_
is a problem. Does anyone have one?
This is making
($)
more strict:seq (($) undefined) ()
will now be_|_
instead of()
. This can potentially break existing code but I don't know if it actually does.
Thanks for pointing this out. It's technically true, but it's just that eta-expansion and eta-reduction change laziness:
> undefined `seq` ()
*** Exception: Prelude.undefined
> (\x -> undefined x) `seq` ()
()
If anyone is at fault here, it is seq
, not $
.
As a funny manifestation of the similar phenomenon, the following program's output depends on -O
:
import GHC.Exts (build)
main = print $ foldr undefined () (build seq) `seq` ()
$ ghc -v0 Lazy.hs && ./Lazy
()
$ ghc -v0 -O Lazy.hs && ./Lazy
Lazy: Prelude.undefined
I'm fairly negative on this proposal because I think interested parties should try it in a putative
unlifted-base
package first
While recent GHCs extended the variety of unlifted data with {-# LANGUAGE UnliftedDatatypes, UnliftedNewtypes #-}
, primitive unlifted types such as Int#
are an integral part of base
since inception.
Thanks for pointing this out. It's technically true, but it's just that eta-expansion and eta-reduction change laziness
Sure, but that still seems like a huge caveat for one of the most used functions in Haskell.
I do not feel comfortable changing this behavior.
Can someone demonstrate at least relatively realistic example when this "huge caveat" matters? Do you ever seq
functions? Or supply literal undefined
as the first argument of $
?
OTOH, e.g. forever was generalized to use Applicative instead of Monad, as having >> as something else than *> is probably not a good idea.
This did result in a space leak in transformers
due to (*>)
having a default definition of liftA2 (flip const)
which would hold on to the value of the first computation until the second had completed - so sometimes trivial improvements like that can have surprising negative effects.
I am +1 on this proposal, including making ($) :: a -> a
. Types are a good starting point for documentation, but if the docs for $
are confusing, then we should improve them independently of this - and, as has already been pointed out, it is confusing.
I cannot imagine a scenario where the laziness of $
would matter. $
is almost more of a syntactic construct than a regular function - it's sole purpose is to avoid writing parentheses. If your code somehow relies on f $ g $ x
having different laziness behavior than f (g x)
then I want to hear about it!
Motivation
At the moment the type of
($)
is:($) :: forall r a (b :: TYPE r). (a -> b) -> a -> b
which means that
a
is not representation polymorphic. It is indeed not possible to make it fully representation polymorphic with the current implementation, which is:f $ x = f x
due to us not knowing what representationx
would be, see this note in the ghc user's guide, however, in the case of($)
this is an implementation detail, i.e. it could be implemented as($) f = f
which would make it unnecessary to pass a representation polymorphic argument.For some more motivation on why this is necessary, please refer to this paste.
Proposal
Like this, we could rewrite the
($)
to:(Indeed, if we could not write
($)
like this, we would even fail to prove (lifted)id
in Haskell, which would be really bad)Alternatives
As pointed out by participants to the conversation, there are multiple alternatives in different directions
Changes to the proposal
($) :: forall a. a -> a
, this will make the function passed to($)
possibly multiplicity polymorphic, this type might however be hard to understand for beginnersAlternatives without the proposal
unlifted-base
which doesn't require the change to actualbase
($)
locally: for this to be convenient, this will require an alternativePrelude
, e.g. with cabalmixins
and some conveniences will fall away e.g. the special rule that makes($)
impredicative everywhere($)
but parens: inconvenient