Closed s5bug closed 3 years ago
After playing around with typeclasses a lot more, I understand what I got wrong:
I'm used to Scala, where typeclasses are data, and "typeclass inheritance" is modeled by actual Java extends
.
In Amulet, the =>
doesn't mean extends
, but rather means "is required for": that is, instead of a => b
meaning b
provides a
, like how I expected coming from Scala, it means b
requires a
. So, it doesn't make sense to "override" a previous operator: the existence of that operator is already assumed.
I like the former definition more than the latter personally: it allows me to specify monad
with just pure
and (>>=)
, rather than having to go through the whole typeclass hierarchy all the way through monoidal
and invariant
. I.e., I just have a single instance monad option
instead of 9 instances building up to monad
. Is it worth a proposal?
I can't speak for amulet, but this is typically intended behavior for typeclasses, because all monads by definition are applicatives, all applicatives by definition are functors, etc..
For what it's worth, if you really dislike this way of doing things, you can always just use a record of functions; that is what typeclasses boil down to, anyways. You then lose the nice automatic instance resolution based on types and checks for orphan instances, but now you only have to implement the functions for that single record, and instances for overlapping types are no longer a problem (e.g. you could have two records that specify the semigroup operator and monoid identity for int, one for ( * ) and one for ( + )). Whether or not this is worth it depends on what you're doing, but I think in the broad majority of cases typeclasses are much nicer to work with.
but this is typically intended behavior for typeclasses, because all monads by definition are applicatives, all applicatives by definition are functors, etc..
The Scala way of doing things doesn't sacrifice this: behavior would stay the same, it would just be easier to define big instances. You do however lose a sort of coherence unless you require instances in a special way.
So, I dunno. I would definitely like a solution where I can override "later" operators with previous ones, especially in something like
class invariant 'f begin
val imap : 'f 'a -> ('a -> 'b) -> ('b -> 'a) -> 'f 'b
end
class invariant 'f => functor 'f begin
val map : 'f 'a -> ('a -> 'b) -> 'f 'b
let imap fa f g = map fa f
end
I think I understand now-- defining an instance for functor
in that example would also implicitly define the invariant
instance through the definition of imap
?
This raises a question: What happens if you define instances both for invariant and for functor? Any solution I can think of here working across module boundaries seems to involve some "spooky action at a distance" where the presence or lack of instance for a type in modules defining these classes could change the code that is ultimately used, unless defining instances for both is made an error (which sounds pretty awful; if someone defines a functor instance for a type that doesn't have an invariant instance in the module invariant was defined in, the maintainer of invariant has to point people at the functor module for that functionality, and if they want to add it themselves then they break users of the functor module's code.) This problem is compounded if you think about multiple classes that might extend invariant and then do their own definitions of imap
in their class definition; now you have orphan instances with some makeup on them to trick you into thinking they are not orphan instances.
Suppose we restrict this to only working for classes defined in the same file, so that all the module interaction headache is gone. There's still the issue of cases where perhaps you can define a class lower in the heirarchy using a higher up one but it may not be efficient to do so; for instance defining the functor instance for list in terms of >>=
and join
. In these cases you would also need to support syntax for overriding the automatic definition with a custom one (at which point you are only saving one to three lines versus having a separate instance.)
So the question boils down to: how often is this very restricted ability actually useful versus the amount of complexity it introduces to the implementation of the compiler, explanation to new users, and simplicity of documentation generation (because instances now no longer 1:1 correspond to instance sections)?
This raises a question: What happens if you define instances both for invariant and for functor?
In Scala, this would be fine at the definition site, and fine for a functor 'f =>
, but would cause problems ("implicit resolution conflict") as soon as you tried to resolve invariant 'f =>
due to there being two cases. If I recall this is an "open research problem".
In Scala this could error in the same way that current typeclass conflicts error, even with the problems you mentioned above: specific imports can be excluded, and since typeclass instances are data in Scala, you can exclude them. I'm not sure that the same applies to Amulet, so I don't have any ideas at the moment.
There's still the issue of cases where perhaps you can define a class lower in the heirarchy using a higher up one but it may not be efficient to do so;
In Scala, the user is still able to overwrite these things with a more performant implementation, i.e.:
class functor 'f begin
val map : 'f 'a -> ('a -> 'b) -> 'f 'b
end
class functor 'f => monad 'f begin
val pure : 'a -> 'f 'a
val flatmap : 'f 'a -> ('a -> 'f 'b) -> 'f 'b
val join : 'f ('f 'a) -> 'f 'a
let map fa f = flatmap fa (fun a -> pure (f a))
let join ffa = flatmap ffa (fun fa -> fa)
let flatmap fa f = join (map fa f)
end
(* for prototyping, you save tons of lines of code *)
instance monad option begin
let pure = Some
let flatmap fa f =
match fa with
| Some x -> f x
| None -> None
end
(* but once you need more efficiency, you can go back later *)
instance monad option begin
let pure = Some
let flatmap fa f =
match fa with
| Some x -> f x
| None -> None
let join fa =
match fa with
| Some (Some x) -> x
| y -> y
let map fa f =
match fa with
| Some x -> Some (f x)
| None -> None
end
And even in the "more efficient" version, I've saved code by not having to define everything in the typeclass hierarchy (and I've saved even more code if I go back later to define applicative
, invariant
, monoidal
, etc).
So the question boils down to: how often is this very restricted ability actually useful versus the amount of complexity it introduces to the implementation of the compiler, explanation to new users, and simplicity of documentation generation (because instances now no longer 1:1 correspond to instance sections)?
Yeah, through discussing this with various other people I've learned that this way of doing things is very not ML-like. I can't say anything about complexity or documentation generation (I haven't seen Amulet's documentation generation, and I'm not a Haskeller or a PLT person (yet)), but I do agree that the magnitude of a change like this would cause problems.
How viable would the "previous operator overriding" be, even if the typeclass model doesn't change? I.e., perhaps something like
instance semigroup int begin
forward ( |+| ) = ( + )
end
instance monoid int begin
forward empty = zero
end
instance semiring int begin
let zero = 0
external val ( + ) = "%int.add"
external val ( * ) = "%int.mul"
end
instance invariant option begin
forward imap fa f g = map fa f
end
I'm not very good at making good syntax or identifiers, and I'm worried about forward
conflicting with things, so there's probably some better way, but I'd love to brainstorm if it's viable.
One Haskell extension I recently stumbled upon is default superclass instances/intrinsic superclasses, which would give you a syntax a little like:
class monoid 'a => semiring 'a begin
val zero : 'a
val ( + ) : 'a -> 'a -> 'a
val ( * ) : 'a -> 'a -> 'a
deriving instance semigroup 'a begin
let a |+| b = a + b
end
end
instance monoid int
let zero = 0
(* Also ( + ), ( * ) *)
deriving semigroup
It feels a reasonable solution - it's still sound (you enforce the same instance
constraints as already), but is still shorter than deriving each instance separately.
First step though is obviously fixing the undefined
error so it actually reports something useful.
This gives
Is it not common in MLs to "override" a previous operator? Sorry for my lack of knowledge on the nomenclature.