eliminate instance boiler plate: seamless migration to type classes

cvogt commented 8 years ago

At the moment in Scala one has to choose between trait-based interfaces with less boilerplate or type-class-based interfaces that scale to more use cases. Real-world code bases either use only one concept and suffer the consequences, or they use both as needed and suffer from less regularity, more controversy and significant mechanical refactoring whenever a decision is reversed for a particular case. Simulacrum reduces the boilerplate required for type classes but does not eliminate it entirely.

This is a proposal for a small addition to Simulacrum that would remove the remaining boilerplate and make migration between trait-based and type-class-based interfaces much more seamless. It is very much in the spirit of Scala to integrate these OO / FP concepts, and it has very concrete practical benefits. It would basically put Simulacrum type classes on an equal level with trait-based interfaces with regard to syntactic overhead.

So we know that type classes are more flexible than OO interface implementation. This is especially true with generics. A Scala class can't implement Equals[Int] and Equals[String] at the same time. Implicit classes can help here, but they can't be inferred if the generic types depend on method arguments, not only on the left hand sides. Type-classes solve this problem. (Also see https://github.com/cvogt/traits-vs-type-classes/blob/master/traits-vs-type-classes.scala)
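To make the generics limitation concrete, here is a minimal plain-Scala sketch (no Simulacrum). The names `MultiInstanceDemo`, `EqualsTo`, `Celsius` and `same` are invented for illustration, and the two-parameter `Equals[L, R]` is an assumption made so that the right-hand type can vary; it is not the `Equals[T]` used in the rest of this issue.

```scala
object MultiInstanceDemo {
  // OO encoding: the right-hand type is a type parameter of the interface.
  trait EqualsTo[R] { def ===(right: R): Boolean }

  // Does not compile: a class cannot inherit EqualsTo with two different type arguments.
  // case class Celsius(degrees: Int) extends EqualsTo[Int] with EqualsTo[String]

  // Type-class encoding: instances live outside the data type, so several can coexist.
  trait Equals[L, R] { def equal(left: L, right: R): Boolean }

  case class Celsius(degrees: Int)
  object Celsius {
    implicit val celsiusEqualsInt: Equals[Celsius, Int] =
      new Equals[Celsius, Int] {
        def equal(left: Celsius, right: Int): Boolean = left.degrees == right
      }
    implicit val celsiusEqualsString: Equals[Celsius, String] =
      new Equals[Celsius, String] {
        def equal(left: Celsius, right: String): Boolean = left.degrees.toString == right
      }
  }

  // The instance is selected from the types of both arguments.
  def same[L, R](left: L, right: R)(implicit ev: Equals[L, R]): Boolean =
    ev.equal(left, right)
}
```

With this in scope, `same(Celsius(20), 20)` and `same(Celsius(20), "20")` resolve to different instances for the same data type.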

But even with Simulacrum the required boiler plate can be significant. This is particularly annoying when the use case at hand does not require the additional flexibility. Here is an example:

  @typeclass
  sealed trait Equals[T]{
    def ===(left: T, right: T): Boolean
  }
  case class Person( name: String )
  object Person{
    implicit object personEquals extends Equals[Person]{
      def ===(left: Person, right: Person): Boolean = left.name == right.name
    }
  }

requires more boiler plate than

  sealed trait Equals[R]{
    def ===(right: R): Boolean
  }
  case class Person( name: String ) extends Equals[Person]{
    def ===(other: Person) = this.name == other.name
  }

This adds up when one has to do it everywhere. So here is my proposal to get the best of both worlds. We could extend Simulacrum with the following annotation macro that allows "implementing" type classes directly from data types.

  class implements[T] extends annotation.StaticAnnotation{ ... }

The usage would be as such:

  @typeclass
  sealed trait Equals[T]{
    def ===(left: T, right: T): Boolean
  }
  @implements[Equals[Person]]
  case class Person( name: String ){
    def ===(other: Person) = this.name == other.name
  }

This would automatically generate the following companion which simply forwards to the method on the class.

  object Person{
    implicit object EqualsInstance extends Equals[Person]{
      def ===(left: Person, right: Person): Boolean = left === right
    }
  }
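As a rough check that the forwarding instance behaves as intended, the following sketch writes the generated companion by hand in plain Scala, without the `@typeclass` or the proposed `@implements` macros. `ForwardingDemo` and `allEqual` are hypothetical names added for illustration.

```scala
object ForwardingDemo {
  // Stand-in for the @typeclass trait, without the Simulacrum macro.
  trait Equals[T] { def ===(left: T, right: T): Boolean }

  // What the author writes: the method lives on the data type.
  case class Person(name: String) {
    def ===(other: Person): Boolean = this.name == other.name
  }

  // What the proposed annotation would generate, written out by hand here.
  object Person {
    implicit object EqualsInstance extends Equals[Person] {
      // Resolves to the one-argument method on Person, so there is no recursion.
      def ===(left: Person, right: Person): Boolean = left === right
    }
  }

  // Generic code only depends on the type class, not on Person's own method.
  def allEqual[T](xs: List[T])(implicit ev: Equals[T]): Boolean =
    xs.zip(xs.drop(1)).forall { case (a, b) => ev.===(a, b) }
}
```

Callers such as `allEqual(List(Person("a"), Person("a")))` pick up the instance from the companion without any import.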

@typeclass is the equivalent to OO class. @implements is the equivalent to extends.

This would cut down on boilerplate and would make migration between the two concepts much more seamless. It's conceivable to additionally generate an actual interface trait, allowing even easier migration.

yilinwei commented 8 years ago

I would actually go further and have the extends InstancesX malarkey generated as well, using the type constraints on the class.

Of course the major problem at the moment is that it's just not nice navigating the type tree, and it's really unfriendly to the IDE. Also, see #50 and the fact that not everyone agrees on the right way to encode a typeclass.

I would argue that the steps for this would roughly be:

1. Simplify the current logic so it's easier to work with the type tree
2. Separate out the collecting of the metadata from the encoding logic (see nasty nested type method creation etc...)
3. Promote simulacrum to a compiler plugin
4. Add new features

I'd also like to have an in depth look at scala.meta before doing any of this because the current logic is very much tied down to the old Symbol API.

cvogt commented 8 years ago

Sounds good. Just wanted to record this proposal in a good place and hopefully some day we'll get to it. Would give it a shot myself, if I wasn't tied up with cbt already :).

(cc @radsaggi, @clhodapp, @pedrofurla you guys may also wanna take a look at this. Maybe something fun for later in the year).

yilinwei commented 8 years ago

Absolutely - I'm currently tied up in a few pull requests/enhancements for some other projects, but the reason I've stalled on #6 is that I want to submit a WIP of something similar, preferably before cats can go 1.0.0 snapshot.

cvogt commented 8 years ago

@xeno-by is meta ready for this sort of stuff?

xeno-by commented 8 years ago

How do you guys use the Symbol API? Do you do c.typecheck?

yilinwei commented 8 years ago

@xeno-by There are type annotations and a few c.eval here and there. I believe most could be replaced with quasiquotes.

Is there any way to get type hierarchies using scala.meta?

xeno-by commented 8 years ago

@yilinwei Not at the moment. scala.meta 1.x only supports syntactic APIs, so no name resolution, typechecking, evaluation, etc.

I've taken a brief look at the codebase and can see that apart from syntactic APIs, you're using: 1) c.eval to evaluate arguments of annotations, 2) c.typecheck to check whether annotations like noop are really simulacrum's noop.

For potential migration to meta-based annotations (this can happen when I fix https://github.com/scalameta/paradise/issues/1, https://github.com/scalameta/paradise/issues/2 and https://github.com/scalameta/paradise/issues/3), 1) can be worked around by manual inspection of arguments and looking for literals and 2) can be temporarily replaced by comparison of names, not symbols.
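The two workarounds can be illustrated on a toy syntax tree. This is only a sketch: `Tree`, `Lit`, `Name` and `Annot` below are invented stand-ins, not the scala.meta or scala.reflect API, and the helper names are hypothetical.

```scala
object AnnotationArgsDemo {
  // Toy syntax tree, NOT the scala.meta or scala.reflect API.
  sealed trait Tree
  final case class Lit(value: Any) extends Tree
  final case class Name(value: String) extends Tree
  final case class Annot(name: String, args: List[Tree]) extends Tree

  // 1) Instead of evaluating an argument, accept it only if it is a Boolean literal.
  def booleanArg(annot: Annot, index: Int): Option[Boolean] =
    annot.args.lift(index).collect { case Lit(b: Boolean) => b }

  // 2) Instead of resolving the symbol, compare the annotation's name.
  // An unrelated annotation that happens to be called `noop` would match as well.
  def isNoop(annot: Annot): Boolean =
    annot.name == "noop" || annot.name == "simulacrum.noop"
}
```

The trade-off is visible in both helpers: a non-literal argument is simply not understood, and name comparison cannot tell simulacrum's `noop` from another annotation with the same name.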

xeno-by commented 8 years ago

@yilinwei I also see that you'd like to turn simulacrum into a compiler plugin. What kind of desired functionality is currently missing with macro annotations?

yilinwei commented 8 years ago

@xeno-by the issue with whitebox macros is more that the presentation compiler (and by extension IDEs) handles them so poorly.

Better IDE support would be really useful both for adoption and promoting the usability of some of these language extensions.

xeno-by commented 8 years ago

@yilinwei I see. Improving IDE support is one of the key goals of scala.meta. We'll do our best.

Btw what kind of problems do you run into with the presentation compiler?

yilinwei commented 8 years ago

@xeno-by The presentation compiler happens before any whitebox macro expansion, which basically means any methods, classes generated by simulacrum aren't picked up by IDEs.

As a compiler plugin we'd hopefully be able to promote most of it to the parser stage of the compilation, or at least enough of it such that the IDEs wouldn't have too much trouble.

xeno-by commented 8 years ago

@yilinwei Well, actually it doesn't. Expansion of macro annotations happens in the phase called namer, and namer along with typer constitute Scala's presentation compiler.

I think there's a different issue in play here. Could you provide more details about the troubles you're running into?

yilinwei commented 8 years ago

@xeno-by Interesting, I had assumed that this wasn't the case based on the behavior I was seeing and on the discussion I had with @fommil, though I possibly misinterpreted.

The problem is simply that in all the IDEs that I've used with a macro type annotation, they don't recognize the classes/methods produced, and the presentation compiler underlines in red any methods which I attempt to call.

For example, see here.

fommil commented 8 years ago

Yeah, I don't think this works @xeno-by :disappointed:

Something I'd really like to get ready for my scalaworld talk with Rory is a good test harness that allows plugin / macro authors to write tests for how their libraries work in the presentation compiler.

xeno-by commented 8 years ago

It would be interesting to sit together at Scala World and take a look at how exactly macros malfunction and maybe even come up with a plan how to address that.

fommil commented 8 years ago

Definitely, I'm hoping to get some people in the typelevel compiler excitement interested. Step 1 is a good test harness, as per imaginary-friend's README

fommil commented 8 years ago

I'm not going to have as much free time to work on github projects anymore, and with the impending crash I might not even be working with scala in 12 months :sob:

ceedubs commented 8 years ago

I have mixed feelings on this. On the one hand, it's nice to remove some boilerplate. On the other hand, sometimes type class method names are necessarily abstract (like combine on a semigroup) while the implementing method on a concrete structure can be more descriptive for that type (such as concat for a list). This change would incentivize using the less descriptive name on the data structure (or at least providing it as an alias, leaving people wondering what the difference in the two methods is).

Also how would this be used to implement something like def empty: A on a trait Monoid[A] type class? This isn't a method that requires an instance of the structure. In Java/OO terms it's more of a static method than an instance method.

cvogt commented 8 years ago

@ceedubs how does a type class instance implement the method of a type class trait using a more descriptive name? Can you give an example?

Hehe, I guess you answered your own question about that second one there. If "it's more of a static method", how about we look for it in the companion?
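A minimal sketch of that idea, written by hand in plain Scala: the instance-style `combine` lives on the class, the "static" `empty` lives on the companion, and the instance forwards to both. `Total`, `MonoidInstance` and `combineAll` are hypothetical names, and nothing here is generated by an existing macro.

```scala
object CompanionEmptyDemo {
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  // Instance-style method on the class.
  case class Total(value: Int) {
    def combine(other: Total): Total = Total(value + other.value)
  }

  object Total {
    // "Static" member on the companion.
    val empty: Total = Total(0)

    // Hand-written stand-in for what a generator might emit under this idea.
    implicit object MonoidInstance extends Monoid[Total] {
      def empty: Total = Total.empty
      def combine(x: Total, y: Total): Total = x combine y
    }
  }

  def combineAll[A](xs: List[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.empty)(m.combine)
}
```

Under this assumption the generator would need one lookup rule for members that take a receiver and another for members that do not.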

ceedubs commented 8 years ago

> how does a type class instance implement the method of a type class trait using a more descriptive name? Can you give an example?

I said that data structures can often use more descriptive names. For example, consider the following Semigroup type class:

trait Semigroup[A] {
  def combine(a1: A, a2: A): A
}

Here, combine is a pretty generic and unhelpful name, but it kind of has to be due to how abstract the concept of a semigroup is.

Now say we write our own BigInt and a Semigroup instance for it.

abstract class BigInt {
  def +(other: BigInt): BigInt
}

object BigInt {
  implicit val semigroupBigInt: Semigroup[BigInt] = new Semigroup[BigInt] {
    def combine(a1: BigInt, a2: BigInt): BigInt = a1 + a2
  }
}

Should we have instead called the + method combine so that the type class instance would automatically pick it up? IMO + is a more descriptive name than combine for a BigInt (especially since valid semigroups exist for multiplication, etc). We could have put both a + and a combine method in BigInt, but I think that would be confusing -- there are lots of ways to combine BigInts.

I guess people are always free to not use this annotation, so maybe my point doesn't matter. I guess it's just that one of the great features of type classes is polymorphism without coupling the data structure and the abstraction, so I'm wary of something that entices people to more strongly couple them for the sake of brevity.

separate question

Consider the case where you were writing your own Option[A]. You would end up with an === method that would require an Equals[A] to delegate through to: def ===(other: Option[A])(implicit eqA: Equals[A]). How would this be handled? I can see a couple possibilities.

1. Ignore it, because it doesn't match the type signature closely enough (not very helpful)
2. Create something like implicit def equalsOption[A:Equals]: Equals[Option[A]] = ???.
   - This is more helpful but also more complicated.
   - It gets really complicated once you have a type class hierarchy. For example what if PartialOrder extends Equals and Order extends PartialOrder? I have varying levels of Option[A] support based on which of these is available. If 3 separate defs are generated in the companion object for Equals, PartialOrder, and Order, then this will result in ambiguous implicit values. A wonky hierarchy of abstract classes has to be used to prioritize the implicit values and avoid ambiguity.
   - This gets even more complicated when your type class hierarchy isn't strictly linear. For example, if your structure implements both Monad and Traverse you now have an ambiguous Functor instance, since both Monad and Traverse extend Functor (but don't have a parent-child relationship with each other).
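The non-linear case can be sketched as follows. The three traits are heavily simplified stand-ins (the real Monad and Traverse have different members), and `Box`, `boxInstances` and `increment` are hypothetical names. The sketch shows one common way out, a single combined instance, rather than anything a generator is known to do.

```scala
object DiamondDemo {
  // Heavily simplified stand-ins for the real type classes.
  trait Functor[F[_]] { def map[A, B](fa: F[A])(f: A => B): F[B] }
  trait Monad[F[_]] extends Functor[F] { def pure[A](a: A): F[A] }
  trait Traverse[F[_]] extends Functor[F] { def toList[A](fa: F[A]): List[A] }

  case class Box[A](value: A)

  object Box {
    // Two separate instances would both qualify as Functor[Box]:
    //   implicit object boxMonad extends Monad[Box] { ... }
    //   implicit object boxTraverse extends Traverse[Box] { ... }
    //   implicitly[Functor[Box]]  // error: ambiguous implicit values

    // One combined instance leaves a single candidate.
    implicit object boxInstances extends Monad[Box] with Traverse[Box] {
      def map[A, B](fa: Box[A])(f: A => B): Box[B] = Box(f(fa.value))
      def pure[A](a: A): Box[A] = Box(a)
      def toList[A](fa: Box[A]): List[A] = List(fa.value)
    }
  }

  // Only needs a Functor, which is where the ambiguity would surface.
  def increment[F[_]](fa: F[Int])(implicit functor: Functor[F]): F[Int] =
    functor.map(fa)(_ + 1)
}
```

A generator that emits one implicit per implemented type class would have to merge or prioritize them in some comparable way.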

ceedubs commented 8 years ago

@cvogt sorry, I kind of failed to say that I do appreciate the idea of trying to make type classes easier to work with in scala, and I'm glad that you brought up this idea. I just think that there are subtle complexities here that need to be considered.

maxaf commented 8 years ago

It's probably worth mentioning that I've recently done some work to reduce type class boilerplate in Scala.

clhodapp commented 8 years ago

@ceedubs What about just letting you e.g. tag combine in BigInt as being part of the implementation of the typeclass, rather than a proper method of the class:

@implements[Semigroup[BigInt]]
class BigInt  {
  def +(other: BigInt): BigInt = ???
  @implementation[Semigroup[BigInt]]
  def combine(other: BigInt): BigInt = this + other
}

The idea is that this would generate:

class BigInt  {
  def +(other: BigInt): BigInt = ???
}
object BigInt {
  implicit object SemigroupBigInt extends Semigroup[BigInt] {
    def combine(self: BigInt, other: BigInt): BigInt = self + other
  }
}

I know having too many annotations sucks...

clhodapp commented 8 years ago

One major fact to keep in mind is that typeclass instances for type-parameterized classes are very often dependent on instances for their type arguments. For example... the most-reasonable Equals instance for List[T] depends on the Equals instance for T. I think any system for making it easier to define typeclass instances should have a direct way to express these sorts of constraints/dependencies.
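Written by hand, such a dependent instance looks roughly like the sketch below. It uses plain Scala without Simulacrum; `DependentInstanceDemo`, `intEquals`, `listEquals` and `same` are names chosen for illustration.

```scala
object DependentInstanceDemo {
  trait Equals[T] { def ===(left: T, right: T): Boolean }

  object Equals {
    implicit val intEquals: Equals[Int] = new Equals[Int] {
      def ===(left: Int, right: Int): Boolean = left == right
    }

    // Equals[List[T]] exists only when an Equals[T] is available.
    implicit def listEquals[T](implicit elem: Equals[T]): Equals[List[T]] =
      new Equals[List[T]] {
        def ===(left: List[T], right: List[T]): Boolean =
          left.length == right.length &&
            left.zip(right).forall { case (a, b) => elem.===(a, b) }
      }
  }

  def same[T](left: T, right: T)(implicit ev: Equals[T]): Boolean =
    ev.===(left, right)
}
```

The implicit parameter on `listEquals` is the constraint in question: any annotation-based shorthand would need a way to state it.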

typelevel / simulacrum

eliminate instance boiler plate: seamless migration to type classes #62