amuletml / amulet

An ML-like functional programming language
https://amulet.works/
BSD 3-Clause "New" or "Revised" License
328 stars 16 forks source link

`undefined` error when trying to "fill in" a "previous" operator #291

Closed s5bug closed 3 years ago

s5bug commented 3 years ago
class semigroup 'a begin
  val ( |+| ) : 'a -> 'a -> 'a
end

class semigroup 'a => monoid 'a begin
  val empty : 'a
end

class monoid 'a => semiring 'a begin
  val zero : 'a
  val ( + ) : 'a -> 'a -> 'a
  val ( * ) : 'a -> 'a -> 'a

  let a |+| b = a + b
end

This gives

> amc compile .\main.ml -o main.lua
amc.exe: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries\base\GHC\Err.hs:80:14 in base:GHC.Err
  undefined, called at src\Types\Infer\Class.hs:199:41 in amuletml-1.0.0.0-3a680BLq9tuHyCMl6Obe0N:Types.Infer.Class

Is it not common in MLs to "override" a previous operator? Sorry for my lack of knowledge on the nomenclature.

s5bug commented 3 years ago

After playing around with typeclasses a lot more, I understand what I got wrong:

I'm used to Scala, where typeclasses are data, and "typeclass inheritance" is modeled by actual Java extends.

In Amulet, the => doesn't mean extends, but rather means "is required for": that is, instead of a => b meaning b provides a, like how I expected coming from Scala, it means b requires a. So, it doesn't make sense to "override" a previous operator: the existence of that operator is already assumed.

I like the former definition more than the latter personally: it allows me to specify monad with just pure and (>>=), rather than having to go through the whole typeclass hierarchy all the way through monoidal and invariant. I.e., I just have a single instance monad option instead of 9 instances building up to monad. Is it worth a proposal?

davidgarland commented 3 years ago

I can't speak for amulet, but this is typically intended behavior for typeclasses, because all monads by definition are applicatives, all applicatives by definition are functors, etc..

For what it's worth, if you really dislike this way of doing things, you can always just use a record of functions; that is what typeclasses boil down to, anyways. You then lose the nice automatic instance resolution based on types and checks for orphan instances, but now you only have to implement the functions for that single record, and instances for overlapping types are no longer a problem (e.g. you could have two records that specify the semigroup operator and monoid identity for int, one for ( * ) and one for ( + )). Whether or not this is worth it depends on what you're doing, but I think in the broad majority of cases typeclasses are much nicer to work with.

s5bug commented 3 years ago

but this is typically intended behavior for typeclasses, because all monads by definition are applicatives, all applicatives by definition are functors, etc..

The Scala way of doing things doesn't sacrifice this: behavior would stay the same, it would just be easier to define big instances. You do however lose a sort of coherence unless you require instances in a special way.

So, I dunno. I would definitely like a solution where I can override "later" operators with previous ones, especially in something like

class invariant 'f begin
  val imap : 'f 'a -> ('a -> 'b) -> ('b -> 'a) -> 'f 'b
end

class invariant 'f => functor 'f begin
  val map : 'f 'a -> ('a -> 'b) -> 'f 'b

  let imap fa f g = map fa f
end
davidgarland commented 3 years ago

I think I understand now-- defining an instance for functor in that example would also implicitly define the invariant instance through the definition of imap?

This raises a question: What happens if you define instances both for invariant and for functor? Any solution I can think of here working across module boundaries seems to involve some "spooky action at a distance" where the presence or lack of instance for a type in modules defining these classes could change the code that is ultimately used, unless defining instances for both is made an error (which sounds pretty awful; if someone defines a functor instance for a type that doesn't have an invariant instance in the module invariant was defined in, the maintainer of invariant has to point people at the functor module for that functionality, and if they want to add it themselves then they break users of the functor module's code.) This problem is compounded if you think about multiple classes that might extend invariant and then do their own definitions of imap in their class definition; now you have orphan instances with some makeup on them to trick you into thinking they are not orphan instances.

Suppose we restrict this to only working for classes defined in the same file, so that all the module interaction headache is gone. There's still the issue of cases where perhaps you can define a class lower in the heirarchy using a higher up one but it may not be efficient to do so; for instance defining the functor instance for list in terms of >>= and join. In these cases you would also need to support syntax for overriding the automatic definition with a custom one (at which point you are only saving one to three lines versus having a separate instance.)

So the question boils down to: how often is this very restricted ability actually useful versus the amount of complexity it introduces to the implementation of the compiler, explanation to new users, and simplicity of documentation generation (because instances now no longer 1:1 correspond to instance sections)?

s5bug commented 3 years ago

This raises a question: What happens if you define instances both for invariant and for functor?

In Scala, this would be fine at the definition site, and fine for a functor 'f =>, but would cause problems ("implicit resolution conflict") as soon as you tried to resolve invariant 'f => due to there being two cases. If I recall this is an "open research problem".

In Scala this could error in the same way that current typeclass conflicts error, even with the problems you mentioned above: specific imports can be excluded, and since typeclass instances are data in Scala, you can exclude them. I'm not sure that the same applies to Amulet, so I don't have any ideas at the moment.

There's still the issue of cases where perhaps you can define a class lower in the heirarchy using a higher up one but it may not be efficient to do so;

In Scala, the user is still able to overwrite these things with a more performant implementation, i.e.:

class functor 'f begin
  val map : 'f 'a -> ('a -> 'b) -> 'f 'b
end

class functor 'f => monad 'f begin
  val pure : 'a -> 'f 'a
  val flatmap : 'f 'a -> ('a -> 'f 'b) -> 'f 'b

  val join : 'f ('f 'a) -> 'f 'a

  let map fa f = flatmap fa (fun a -> pure (f a))
  let join ffa = flatmap ffa (fun fa -> fa)
  let flatmap fa f = join (map fa f)
end

(* for prototyping, you save tons of lines of code *)
instance monad option begin
  let pure = Some
  let flatmap fa f =
    match fa with
    | Some x -> f x
    | None -> None
end

(* but once you need more efficiency, you can go back later *)
instance monad option begin
  let pure = Some
  let flatmap fa f =
    match fa with
    | Some x -> f x
    | None -> None
  let join fa =
    match fa with
    | Some (Some x) -> x
    | y -> y
  let map fa f =
    match fa with
    | Some x -> Some (f x)
    | None -> None
end

And even in the "more efficient" version, I've saved code by not having to define everything in the typeclass hierarchy (and I've saved even more code if I go back later to define applicative, invariant, monoidal, etc).

So the question boils down to: how often is this very restricted ability actually useful versus the amount of complexity it introduces to the implementation of the compiler, explanation to new users, and simplicity of documentation generation (because instances now no longer 1:1 correspond to instance sections)?

Yeah, through discussing this with various other people I've learned that this way of doing things is very not ML-like. I can't say anything about complexity or documentation generation (I haven't seen Amulet's documentation generation, and I'm not a Haskeller or a PLT person (yet)), but I do agree that the magnitude of a change like this would cause problems.


How viable would the "previous operator overriding" be, even if the typeclass model doesn't change? I.e., perhaps something like

instance semigroup int begin
  forward ( |+| ) = ( + )
end

instance monoid int begin
  forward empty = zero
end

instance semiring int begin
  let zero = 0
  external val ( + ) = "%int.add"
  external val ( * ) = "%int.mul"
end

instance invariant option begin
  forward imap fa f g = map fa f
end

I'm not very good at making good syntax or identifiers, and I'm worried about forward conflicting with things, so there's probably some better way, but I'd love to brainstorm if it's viable.

SquidDev commented 3 years ago

One Haskell extension I recently stumbled upon is default superclass instances/intrinsic superclasses, which would give you a syntax a little like:

class monoid 'a => semiring 'a begin
  val zero : 'a
  val ( + ) : 'a -> 'a -> 'a
  val ( * ) : 'a -> 'a -> 'a

  deriving instance semigroup 'a begin
    let a |+| b = a + b
  end
end

instance monoid int
  let zero = 0
  (* Also ( + ), ( * ) *)
  deriving semigroup

It feels a reasonable solution - it's still sound (you enforce the same instance constraints as already), but is still shorter than deriving each instance separately.

First step though is obviously fixing the undefined error so it actually reports something useful.