lampepfl / dotty-feature-requests

Historical feature requests. Please create new feature requests at https://github.com/lampepfl/dotty/discussions/new?category=feature-requests
31 stars 2 forks source link

Allow Typeclasses to Declare Themselves Coherent #4

Open odersky opened 7 years ago

odersky commented 7 years ago

Motivation

Unlike Haskell, Scala allows multiple implicit instances of a type. This makes implicits cover a lot more use cases. Besides type classes, implicits can also express

When used for implementing typeclasses, implicits support different implementations of a type class in different modules. For instance, Boolean could implement Monoid with true and & in one part of a program and with false and | in another.

By contrast, Haskell's type classes have a coherence requirement which states that a type can implement a type class only in one way anywhere in a program. Coherence restricts what one can do with type classes. It feels a bit anti-modular since it establishes a condition for "a whole program" (in Scala or Java one would not even know what that is!). But it also solves two problems of Scala's unrestricted use of implicits.

The first problem is that one can accidentally mix two different implicit implementations where this is non-sensical. The classic example is a priority queue which requires an Ordered element type. If one declares priority queue like this:

class PriorityQueue[T: Ordered] { 
   ...
   def union(other: PriorityQueue[T]): PriorityQueue[T]
}

there's no way to guarantee that the two queues in a union operation agree on the ordering of type T. This can lead to nonsensical results. I believe this is a bit of a red herring however, as it is perfectly possible to design data structures that do not suffer from that problem by using some nesting (or functorizing, to borrow some ML terminology). For priority queues, Chris Okasaki's scads is a nice solution, which generalizes readily. In essence, all that's needed is to move the type parameter T one level further out:

class PriorityQueues[T: Ordered] {
  class Queue {
    ...
    def union(other: Queue): Queue
  }
}

So far, so good. But there's also a second problem which is in a sense the dual of the first. The first problem was "how do we prevent some programs from compiling when implementations are not coherent"? (i.e. we have multiple implementations of the same type class). The second problem is "how do we get some programs to compile when implementations are coherent"? In Scala, implicit resolution may lead to ambiguities, which are compile-time errors. But in a coherent world, ambiguities do not matter: picking any one of multiple applicable alternatives would give the same observable result. So if we know that all implementations of some type are coherent, we could suppress all ambiguity errors related to implicit searches of that type.

This is a real issue in practice, which mars attempts to implement in Scala typeclass hierarchies that are standard in Haskell. #2029 is a good starting point to explore the issue. Here's a somewhat more figurative explanation of the problem: Say you want to model capabilities representing driving permits as implicit values. There's a basic capability "has drivers license", call it DL. There are two more powerful capabilities "has truck driver's license" (TL) and "has cab license" (CL). These would be modelled as subtypes of the "drivers license" type. Now, take a function that requires the capabilities to drive a truck and to drive a cab, i.e. it takes two implicit parameters of type TL and CL. The body of the function needs at some point the capability of driving a car (DL). In the real world, this would be no problem since either of the function's capabilities TL and CL imply the capability DL. But in current Scala, you'd get an ambiguity, precisely because both implicit parameters of types TL and CL match the required type DL. In actual Scala code, this is expressed as follows:

trait DL
trait TL extends DL
trait CL extends DL

object Driving {
  def drive(implicit dl: DL) = ()
  def f(implicit tl: TL, cl: CL) = drive
}

Compiling this results in an ambiguity error:

-- Error: coherent.scala -------------------------------------------------------
33 |  def f(implicit tl: TL, cl: CL) = drive
   |                                        ^
   |ambiguous implicits: both value tl and value cl match type DL of parameter dl of method drive in object Driving

2029 and @adelbertc's note contain discussions of possible workarounds. One workaround uses aggregation instead of subtyping. The other tries to bundle competing implicit parameters in a single implicit in order to avoid ambiguities. Neither approach feels both simple and scalable in a modular fashion.

Proposal

The proposal is to introduce a new marker trait scala.typeclass.Coherent.

package scala.typeclass
trait Coherent

If a type T extends directly or indirectly the Coherent trait, implicit searches for instances of type T will never issue ambiguity errors. Instead, if there are several implicit values of type T (where no value is more specific than any other) an arbitrary value among them will be chosen.

For instance, the Driving example above can be made to compile by changing the definition of DL to

trait CanDrive extends scala.typeclass.Coherent

Soundness

The rule for dealing with coherence makes some implicit selections unspecified. A global coherence condition is needed to ensure that such underspecifications cannot lead to undetermined behavior of a program as its whole. Essentially, we need to ensure that if there is more than one implicit value implementing a coherent type T, all such values are indistinguishable from each other. (In)distinguishability depends on the operations performed by the program on the implicit values. For instance, if the program compares implicit values using reference equality (eq), it can always distinguish them. One good rule to follow then is to avoid calling eq, as well as all other operations defined in AnyRef on values of coherent types. Furthermore, one needs to also ensure that all members of a coherent type are implemented in the same way in all implicit instances of that type.

It's conceivable that sufficient coherence conditions can be established and checked by a tool on a project-wide basis. This is left for future exploration, however. The present proposal defers to the programmer to check that coherence conditions are met.

Mixing coherent and normal abstractions

Sometimes one has conflicting requirements whether a type should be coherent or not. Coherency avoids ambiguity errors but it also prohibits possibly useful alternative implementations of a type. Fortunately, there's a way to have your cake and eat it too: One can use two types - a non-coherent supertype together with a coherent subtype.

For instance one could have a standard Monoid trait that admits multiple implementations:

package kernel
trait Monoid[T] {
   def empty: T
   def combine(x: T, y: T): Monoid[T]
}

One can then declare a coherent subtrait:

object Canonical {
   trait Monoid[T] extends kernel.Monoid[T] with Coherent
}

One could also several coherency domains by using a class instead of an object for Canonical and instantiating that class for each domain.

Alternatives

@djspiewak has proposed a different approach to coherence based on parameterizing implicit definitions with coherency domains. #2029 contains a discussion of this approach in relation to the present proposal.

Implementation Status

The proposal is implemented in #2046.

rkuhn commented 7 years ago

This looks like a good solution for capabilities. It would be good, though, to document the factorization approach prominently such that nobody is tempted to make Ordering coherent, otherwise we’ll soon need ReverseInt or all sorts of reverseX functions on ordered collections.

alexandru commented 7 years ago

I don't believe this would be a good thing. If a type-class has a coherence requirement, I would expect the compiler to throw an error or at least a warning instead of picking arbitrarily.

We already have big problems with implicit resolution where pieces of code change behaviour depending on the order of imports. Dealing with @Coherent type-class instances would make this problem worse, especially because it's going to be an easy way to fix compiler ambiguity errors.

You folks talked of Monoid and Ordering that shouldn't come with a Coherent flag, but Applicative and Monad can have the same problem as well.

An Applicative in the context of effect triggering data types, like IO or Task, can decide how map2 and ap behaves - does it do parallelism or not? This is why in Monix (and I believe in Scalaz as well) one has to do a special import in order to get parallel execution by means of Applicative, because parallelism is expensive and isn't described by Applicative's laws - all Applicative specifies by map2 and ap is data independence, not how things get evaluated:

import monix.eval.Task.nondeterminism

// Now it does parallel execution
Applicative[Task].map2(task1, task2)(_ + _)

But if Applicative ends up being Coherent, well, this will no longer work, because the compiler will be totally silent about conflicts. To make it worse, the implementation of mapBoth that powers this parallel Applicative is actually pretty complicated, since it has to deal with synchronisation in a multi-threaded environment. I can see the need for users to override how this operation behaves for their own purposes, which I thought is one of the perks of going with type-class based polymorphism in Scala.

One proposed alternative, mentioned by @adelbertc, doesn't get enough attention: not using subtype relationships, but use composition instead. I first saw this idea in Scato by @aloiscochard, which is the design that's being pushed in Scalaz 8. I have also used a variation of it in the type-class hierarchy described in monix-types, which can be seen in action in a stream-like type, a work in progress making use of higher-kinded polymorphism (see implementation).

Of course, there are legitimate concerns against this approach. One of them is that you'd have to explicitly import the relationships in scope:

import monix.types._
import monix.types.syntax._

def eval[F[_], A](a: => A)(implicit F: Monad[F]): F[A] = {
  import F.{applicative => A}
  A.pure(()).flatMap(_ => A.pure(a))
}

def filteredEvalSomething[F[_],A](a: => A, f: A => Boolean)
  (implicit MF: MonadFilter[F], ME: MonadError[F,Throwable]) = {

  import MF.{applicative => A, monad => M}
  A.pure(())
    .flatMap(_ => try A.pure(a) catch { case NonFatal(ex) => ME.raiseError(ex) })  
    .filter(f)
}

It's annoying for sure, but then working with higher-kinded polymorphism is annoying too. In my opinion this proposal would make MTL-style code easier but at the expense of use-cases that Scala already handles well.

djspiewak commented 7 years ago

I don't believe this would be a good thing. If a type-class has a coherence requirement, I would expect the compiler to throw an error or at least a warning instead of picking arbitrarily.

If it does this, then MTL is impossible again. Motivating example:

def foo[F[_]](implicit MS: MonadState[F, String], MO: MonadOption[F]) =
  Monad[F] pure 42

There's a clear ambiguity here. The goal is to get rid of the compile error, not add more of them.

An Applicative in the context of effect triggering data types, like IO or Task, can decide how map2 and ap behaves - does it do parallelism or not? This is why in Monix (and I believe in Scalaz as well) one has to do a special import in order to get parallel execution by means of Applicative, because parallelism is expensive and isn't described by Applicative's laws - all Applicative specifies by map2 and ap is data independence, not how things get evaluated

Whether or not parallel Applicative is valid is a longer discussion with a lot of nuance, and I subjectively don't think this is a valid application of incoherence. But it's clearly an application of incoherence, and one which should be resolvable in the same way that Monoid is. I'm not sure I see why that encoding doesn't apply.

One proposed alternative, mentioned by @adelbertc, doesn't get enough attention: not using subtype relationships, but use composition instead.

There's a lot of problems with Scalaz 8's encoding. It solves many, many problems, but the encoding is quite bulky, the type errors are unclear, and as you said it comes with other use-site problems which are even more annoying. Also, fundamentally, it doesn't really solve the problem of incoherence in so much as it encodes a type level semi-lattice which may be used to resolve conflicts.

puffnfresh commented 7 years ago

It's not just MTL, this happens with simple stuff like:

trait Functor[F[_]]
trait Bind[F[_]] extends Functor[F]
trait Apply[F[_]] extends Functor[F]
trait Foldable[F[_]]
trait Traverse[F[_]] extends Foldable[F] with Functor[F]

case class Id[A](a: A)

object Id {
  implicit val IdTraverse: Traverse[Id] = new Traverse[Id] {}
  implicit val IdApply: Apply[Id] = new Apply[Id] {}
}

object Example {
  def example[F[_]](implicit T: Traverse[F], A: Apply[F]): Int = {
    println(implicitly[Functor[F]])
    0
  }

  example[Id]
}

Adding the Coherent typeclass to Functor allows the above to compile. That's very useful.

It doesn't solve all of the problems with coherency, it's still possible to pass dictionaries explicitly. It's still possible to make a higher priority implicit:

trait Default[A] { def value: A }
object Default {
  implicit val IntDefault: Default[Int] = new Default[Int] {
    val value = 0
  }
}

object Example {
  implicit val MyDefaultInt: Default[Int] = new Default[Int] {
    val value = 1
  }

  println(implicitly[Default[Int]].value) // prints 1
}

So I wouldn't call this the "Coherent typeclass" but instead the "ignore ambiguities annotation" - this is useful, I would use it :+1:

Using implicits for things like changing evaluation order causes problems with refactoring, which is the usual argument for coherency. The best way to handle that situation is to create a new type.

:shipit:

djspiewak commented 7 years ago

So I wouldn't call this the "Coherent typeclass" but instead the "ignore ambiguities annotation" - this is useful, I would use it 👍

Absent a more involved language revision to allow constructively coherent definitions (Miles has some ideas on this front), I think this is by far the best approach we could have. And I would very much use this as well.

Using implicits for things like changing evaluation order causes problems with refactoring, which is the usual argument for coherency. The best way to handle that situation is to create a new type.

I'm generally not in favor of newtyping just to get different implicit semantics, but I 100% agree that it's the most appropriate solution here. (edit: where by "here" I mean w.r.t. the Applicative example)

adelbertc commented 7 years ago

There's a lot of problems with Scalaz 8's encoding. It solves many, many problems, but the encoding is quite bulky, the type errors are unclear, and as you said it comes with other use-site problems which are even more annoying. Also, fundamentally, it doesn't really solve the problem of incoherence in so much as it encodes a type level semi-lattice which may be used to resolve conflicts.

It's certainly bulky, but only because it has to work with what is currently given in Scala. If you imagine a hypothetical language extension, similar to what we have here with Coherent, maybe we can come up with another proposal that achieves the same thing modulo the syntactic bulk. Something that gets you the said semi-lattice of type classes.

Though perhaps this is what Coherent solves.

nafg commented 7 years ago

I'm guessing the plan is that in the coherent case, where it will not error on ambiguous Implicits, instead there will be some well-defined (if objectively arbitrary) and specified set of rules used to select the implicit?

If yes, that might mitigate the "what if I want to override an implicit that a library shortsightedly marked coherent" concern...

adelbertc commented 7 years ago

One concern I have about this approach is it seems to be an ad-hoc mechanism nailed on top of the existing implicit resolver. Maybe such a concern is ill-founded, this is just my intuition speaking at this point.

twanvanderschoot commented 7 years ago

Though it seems an interesting and promising concept, relying on a programmer's discipline to check the conditions for coherence is not scalable; managing programmer team discipline over organisational change, scope an time is extremely difficult. Without embedding this check into the compiler, I strongly believe this will increase the problems with implicits even more!

I inherited a Scala code base of about 5 years old with several "historical" layers of maturity. The biggest issue is the abuse of implicits combined with an immature type hierarchy (too deep, too much generic types) , effectively making it a "go to". That is, when locally inspecting code, it is very difficult to find the actual conversion or implicit used: It has become the new global.

Nor the IntelliJ Scala plugin or the Scala Eclipse version are able to offer reliable support; they can't handle the depth of the implied type proofs and either halt or crash or misidentify the culprit. Only compiling all code can "prove" what happens.

I do see a use for implicits, but it is too easy to abuse them. Therefore any proposal extending their use not accompanied with compiler support enforcing the rule of application will invariably incur heaps of technical debt.

ritschwumm commented 7 years ago

i recently run into problems using cats where i imported the same syntax extension (via an implicit class) over two different inheritance paths and had a hard time figuring out why the compiler told me the extension method does not exist.

i think those two problems are related: the compiler does not know two things (a parent typeclass or an extension) are actually the same, codewise. it might make more sense to attack the problem at the point where the confusion occurs instead of where it originates from: in the functor/traversable/monad example it might make more sense to attach the information that they are inheriting "the same" functor to travesable and monad, instead telling the compiler that whereever the functor occurs it's probably the same one.

caente commented 7 years ago

perhaps this has been discussed before, and discarded, http://docs.scala-lang.org/sips/pending/trait-parameters.html, but what about using implicit arguments on the trait to achieve composition of typeclasses?

trait DL
trait TL(implicit dl : DL)
trait CL(implicit dl : DL)

object Driving {
  def drive(implicit dl: DL) = ()
  def f(implicit tl: TL, cl: CL) = drive // you'll have to pass dl "explicitly" as an implicit, if you don't the compiler will ask for it
}

so, instead of

trait Bind[F[_]] extends Functor[F]
trait Apply[F[_]] extends Functor[F]

we could do

trait Bind[F[_] : Functor]{
  // now F can use fmap
}

trait Apply[F[_] : Functor]{
  // now F can use fmap
}

but this means that at the call site we would need to do this:

def foo[F](implicit b:Bind[F], a:Apply[F], f:Functor[F])

but also means that there will be no ambiguity as what Functor implementation is being used

the only downside I see, besides that I don't know what would take to make this happen, is that we would need to provide implicits for all the typeclasses required, but in a way I think it adds to clarity

odersky commented 7 years ago

@twanvanderschoot

Though it seems an interesting and promising concept, relying on a programmer's discipline to check the conditions for coherence is not scalable; managing programmer team discipline over organisational change, scope an time is extremely difficult. Without embedding this check into the compiler, I strongly believe this will increase the problems with implicits even more!

I agree it is a worry. Technically the check cannot be put into the compiler, since a compiler has no knowledge what program is. Maybe we can put it into a build tool. But that's a long way off, for sure.

I inherited a Scala code base of about 5 years old with several "historical" layers of maturity. The biggest issue is the abuse of implicits combined with an immature type hierarchy (too deep, too much generic types) , effectively making it a "go to". That is, when locally inspecting code, it is very difficult to find the actual conversion or implicit used: It has become the new global.

It would be interesting to know more about this. Implicit parameters are the core of Scala. We cannot simply suppress them and the recent addition of implicit function types has made them even more powerful than before. Implicit conversions we could and probably should restrict further. So, to make progress in the field it would be really important to learn specifics of past mistakes. This would help us develop patterns or maybe even compiler-checked rules that avoid similar mistakes in the future.

Atry commented 7 years ago

Looks like shapeless's cachedImplicit

DavidGregory084 commented 7 years ago

It would be interesting to know more about this. Implicit parameters are the core of Scala. We cannot simply suppress them and the recent addition of implicit function types has made them even more powerful than before. Implicit conversions we could and probably should restrict further. So, to make progress in the field it would be really important to learn specifics of past mistakes. This would help us develop patterns or maybe even compiler-checked rules that avoid similar mistakes in the future.

@odersky @twanvanderschoot

I think you are right that implicits are at the core of the Scala language, but implicit resolution is definitely an area in which the compiler could provide more help - this does not necessarily have to be in the form of static checks.

Consider that the tool that scalac offers currently is the incredibly verbose output of -Xlog-implicits, which cannot be targeted to explain a specific compilation error and which was written primarily for those with a passing knowledge of the compiler internals (for obvious reasons).

Perhaps compiler plugins like tek/splain and Scala Clippy could be used as inspiration for more concise and user-friendly output in Dotty, or even a slightly more graphical output in keeping with the beautiful new compilation error messages?

However, since this is veering quite a long way off topic I think it would probably be better to discuss this further on https://contributors.scala-lang.org if there is some interest in doing so?

odersky commented 7 years ago

Yes, let's take discussion about implicit best practices to https://contributors.scala-lang.org. It's best to keep this issue focussed on coherence.

odersky commented 7 years ago

What I took away from the very good discussion here (and also on the Scala reddit) is that an unchecked Coherent is probably just too tempting to use as a quick fix for ambiguity problems. So it would likely be abused.

I now think that we should try to solve the question first how to check that usage of a typeclass is actually coherent. To be decidable by the regular compiler, the check has to be quite restrictive. For instance, we'd almost certainly need a rule that implicit definitions that make a type T an instance of the coherent type class C have to be defined in the same compilation unit as either T or C. But I think a restrictive verification of coherence is better than opening the floodgates and let every type class be coherent by declaration.

smarter commented 7 years ago

For instance, we'd almost certainly need a rule that implicit definitions that make a type T an instance of the coherent type class C have to be defined in the same compilation unit as either T or C.

I like this. It's very reminiscent of GHC orphan instance warning (See https://wiki.haskell.org/Orphan_instance, https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/separate_compilation.html#orphan-modules-and-instance-declarations)

alexandru commented 7 years ago

What I took away from the very good discussion here (and also on the Scala reddit) is that an unchecked Coherent is probably just too tempting to use as a quick fix for ambiguity problems. So it would likely be abused.

That's my primary concern expressed in my comment above.

For instance, we'd almost certainly need a rule that implicit definitions that make a type T an instance of the coherent type class C have to be defined in the same compilation unit as either T or C.

I also like this rule very much.

I can think of problems with such restrictions though, for example the reason for why I have a monix-types in my project, along with why @djspiewak's Shims project exists, is because in the community some of us like to provide compatibility for both Cats and Scalaz 😕

So what happens is that if you have defined a shims.Monad[T], it can be converted to either a scalaz.Monad[T] or to a cats.Monad[T]. And people have fine grained control over the dependencies, so you don't necessarily have to import Scalaz or Cats into your classpath, if you don't want it. But with such a restriction then I don't think such conversions work anymore. And dependencies on Cats and Scalaz can no longer be optional.

Of course, using shims is far from ideal, it sucks actually and this isn't a problem that Haskell has AFAIK. Just mentioning this for awareness.

djspiewak commented 7 years ago

@odersky A restriction to only allowing coherence when the instances are defined within the same compilation unit is going to make something like this almost impractically narrowly applicable. The whole point of typeclasses is that they're an open extension mechanism, but that benefit is completely lost if you're trying to achieve same-unit coherence at the same time. In practice, I'm relatively confident that even frameworks which highly value coherence (like Scalaz or Cats) would simply forego that potential benefit, since the costs would be too high.

I agree that a system which implements actual checks for coherence, rather than functioning as an assertion, would be strongly preferable. But it's important that such a system doesn't come at the expense of all of the other benefits of typeclasses.

Edit: I misread the proposal. See below.

smarter commented 7 years ago

@djspiewak I think "impractically narrowly applicable" is a bit of an exaggeration, as far as I know, it's considered good practice in the Haskell community to avoid orphan instances.

djspiewak commented 7 years ago

@smarter Rereading @odersky's proposal, I see that it suggests it might be in the compilation units for either T or C, which is less restrictive than it looked on my phone this morning. That would be a reasonable restriction in most cases, though it would prevent things like a checked-coherent Monad[Option], which seems unfortunate. It would at least allow most common cases.

smarter commented 7 years ago

though it would prevent things like a checked-coherent Monad[Option], which seems unfortunate

Why couldn't you have the Monad[Option] instance be defined where Monad is defined?

djspiewak commented 7 years ago

You could, but that would force users to bring that instance in, whether they want it or not. That might be a tradeoff that frameworks would make, but Scalaz decided explicitly against doing that back when Scalaz 7 was designed (Scalaz 6 used to bring in stdlib instances by default), and Cats has held to that design for largely the same reasons.

smarter commented 7 years ago

If it's in the same compilation unit but in a separate object, it wouldn't force users to bring the instance in, right?

djspiewak commented 7 years ago

@smarter Actually even better, I suppose it could be defined as a trait which is later mixed into objects elsewhere (e.g. scalaz.std.option). So ultimately, due to inheritance, it only compromises file organization. Which is reasonable.

Actually for that very reason, it sort of surprises me that same-compilation unit would be a sufficient restriction.

smarter commented 7 years ago

Good point, maybe besides same-compilation unit we also need to require the typeclass instance to be defined in a sealed/final class or an object?

djspiewak commented 7 years ago

Good point, maybe besides same-compilation unit we also need to require the typeclass instance to be defined in a sealed/final class or an object?

That would be annoying, but wouldn't compromise any of the things Scalaz/Cats want to do with their import organization. In fact, we might actually get to see more use of the package foo { ... } construct, as a way of reorganizing the namespacing of objects stored in the same compilation unit solely to achieve coherence.

adelbertc commented 7 years ago

I think a more convincing reason we don't define instances in the companion object of the type class specifically (defining it in the object for the data type is fine) is if we put the Monad instance for Option in object Monad it would not resolve if we only asked for say, Functor.

This generalizes to defining instances in the companion object of a type class and asking for the instance of a super class.

On Mar 7, 2017 10:32, "Guillaume Martres" notifications@github.com wrote:

Good point, maybe besides same-compilation unit we also need to require the typeclass instance to be defined in a sealed/final class or an object?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lampepfl/dotty/issues/2047#issuecomment-284813571, or mute the thread https://github.com/notifications/unsubscribe-auth/ABRW9FGRdOScDG4JGSMNyl1K-rFlwQSdks5rjaLOgaJpZM4MSG3t .

DarkDimius commented 7 years ago

@twanvanderschoot

relying on a programmer's discipline to check the conditions for coherence is not scalable

I agree. There is an alternative version of this proposal that moves the coherency decision from definition side of the class to the use-site. This has an advantage of not polluting the entire code-base if programmers aren't disciplined enough, as it will only harm a single method.

This can be achived by defining a magical class Coherent[T](u:T) extends AnyVal that typer will be aware of. in case implicit search looks for a Coherent[T] it means that any value would suffice. Being a value class, it will be erazed to underlying implicit type. Eg, in the example below if you want to build a red-black tree and you don't care what kind of ordering you get, you can write like this

def makeRBTree[T](implicit a: Coherent[Ordering[T]])

I think this alternative provides a safer mechanism to be used in big code-bases. It will be slightly harder to implement due to more magic involved inside compiler

djspiewak commented 7 years ago

@DarkDimius I had an idea very similar to this a few months ago. It's actually implementable on current scalac, but it does of course require some significant use of dark magic to manipulate the compiler's internal processes via reflection in a macro, but it is possible. I don't really like it as much as the declaration-site proposals though. And particularly if it's possible to have a checked coherence with just same-file restrictions.

aloiscochard commented 7 years ago

@smarter this is done like that in scalaz8 (as the problem outlined by @adelbertc is worked around with the encoding).

From what I can recall, the reason it was remove from 6 to 7 is because folks where having terrible compilation time during implicits resolution (due to the poor quality of the algorithms being involved).

I heard the situation has much improved since then (in the compiler), which motivated also the trade-off applied in 8. Plus, you don't have an insane import strategy to remember when working with the std-lib.

TwanAurum commented 7 years ago

@odersky I've to admit that I don't know enough of the Scala compilation process to insist that the coherence check should be enforced by the Scala compiler. So, I stand corrected. I believe what I meant to say was that there should be an automated and enforced step in the build process performing such a check. Automated to, hopefully, achieve sufficient coverage of the code base's types and enforced to deny less principled team members the freedom adding code mucking up the consistent state of affairs.

I'm happy to share with you our experience with "abuse" of implicits. We can't share our code base, so our team is preparing some "sanitised" (strange wording in this context ;-) examples of abuse we encountered in the code base. Please understand that "abuse" is a normative judgement. So, we also need to present our rules and their rationale so you can fully understand why we judge something as an abuse. I will present that summarised case study, as per your suggestion on https://contributors.scala-lang.org

odersky commented 7 years ago

@TwanAurum

our team is preparing some "sanitised" (strange wording in this context ;-) examples of abuse we encountered in the code base.

This could be extremely valuable. Thank you for doing this!

vladap commented 7 years ago

This silent selection by compiler is very dangerous. I think it expects too much from developers. And if developer makes a mistake in the process it will manifest as a runtime puzzler.

This is one example http://scalapuzzlers.com/#pzzlr-054 It seems unrelated but I can't help myself and I see quite a similarity. While I know that you talk about something else here, in a sense I could say that compiler thinks that implicit val and implicit def are coherent and that it means that implicit val can be always selected.

adelbertc commented 7 years ago

I would like to bring up @djspiewak 's coherence domains proposal again. I think my issues with something like a Coherent marker trait is it is too bolted on - there is an existing implicit resolver, and then we've tacked on this magical trait that suddenly changes how the implicit resolver works.

From a language cleanliness perspective I think it'd be better to have an implicit resolution mechanism that naturally enables type class coherence. Coherence domains seems to be a step in the right direction and would be interesting to explore more in that direction.

Blaisorblade commented 7 years ago

@adelbertc Thanks for the link! That looks a very interesting step — I've added a comment to @djspiewak's proposal: it evaluates itself over the Set problem, but not over modularity questions. It would be good to consider those as well.

Logistically: IIUC gist have no automatic notifications for comments.

odersky commented 7 years ago

@adelbertc

I would like to bring up @djspiewak 's coherence domains proposal again. I think my issues with something like a Coherent marker trait is it is too bolted on - there is an existing implicit resolver, and then we've tacked on this magical trait that suddenly changes how the implicit resolver works.

I am confused. Do you suggest there should be several resolvers? If that's the case it would be about as diametrically opposed to the design philosophy of Scala as one could make it.

I think the characterization as "bolted on", or "magical trait that suddenly changes how the implicit resolver works" is not really fair. The implicit resolver always works the same way. One of its possible outcomes is an ambiguity. If we can ascertain by other means that the ambiguity does not matter, we can ignore it, otherwise it's an error. That's as far as this proposal goes.

adelbertc commented 7 years ago

@odersky

I am confused. Do you suggest there should be several resolvers?

Definitely not proposing that :-) I am saying I would prefer if whatever approach we ended up with did not special-case the implicit resolver as it feels like the Coherent marker trait is doing, but instead had the implicit resolver naturally enable type class coherence.

I confess I am not a PL designer nor have I even written a toy PL, but a look over Daniel's coherence domain proposal certainly seems interesting. Implicits could behave as before, but the additional scoping of the implicits could lend itself to be useful both in the case of coherence and in generic implicit use. That being said I would be interested in any concerns you have over his approach, whether or not it applies to where you envision Dotty going.

odersky commented 7 years ago

@adelbertc I guess my concern is to keep extensions simple and minimal. In the end, it comes down to knowing when we can safely pick an arbitrary result from multiple results returned by the resolver.

Coherence domains solve the problem that the same underlying type might be used coherently in one context and incoherently in another. I have shown that simple path-dependent types achieve the same distinction, so it's unclear whether the added syntax and mechanism is needed.

But both coherence domains and the present proposal will need a way to make sure that implicit implementations of a type are indeed coherent. That's where I expect the real difficulty to lie.

odersky commented 7 years ago

I have tried to come up with a semi-formal requirement what coherence means, which would be checkable by a compiler. Here's my current strawman attempt to check coherence of definitions.

If class C is coherent:

require (C does not have type parameters) 
     || (C has a single non-variant type parameter)

require (C does not have value parameters)

if C does not have type parameters:
    for M <- member of C
    do  require (M is final)

if C has a single type parameter:
    for M <- member of C
    do  require (C is abstract) || (C is final)

def instance(C: Class, B: Class): Type = 
  "the type argument of B when inherited from C"
  C.typeRef.baseTypeWithArgs(B).argType(0)

def norm(T: Type) = 
   "T where every occurrence of a type variable or abstract type X is replaced by
    a fresh, instantiatable type variable Y. If the bounds of X are S..U 
    then the bounds of Y are norm(S)..norm(U)."

def disjoint(T: Type, U: Type) = 
   !(norm(T) <: norm(U) || norm(U) <: norm(T))

for B <- type parameterized coherent base classes of C
    T = instance(C, B)
    if T is a class type
do  require (C is defined in same compilation unit as B or T)

for CM <- concrete methods defined in C
    AM <- abstract methods overridden by CM
    B = AM.owner
    if B is coherent
    if instance(C, B) is a class type
do
    val Competing = 
        for D <- subclasses of B in compilation units of B and C
            CM1 <- concrete methods implementing AM in D
        yield (CM1, D)

    for (CM1, D1) <- Competing
        (CM2, D2) <- Competing
    do  require (CM1 == CM2) || disjoint(instance(D1, B), instance(D2, B))

Note: When I say "class" in the above I always mean "class or trait". It would be good to have a common name that would cover both.

The restrictions above should ensure the following:

This is not enough to ensure that choosing between implicit values of a coherent type does not make any difference for the program's behavior as a whole, though. To obtain this stronger property, we'd need to enforce additional conditions which are essentially related to parametricity. Namely, the following operations are prohibited for implicit values of coherent type:

The problem is that this condition needs to be applied to runtime values, but checking it at runtime is impractical. If we want to ensure this condition in a static type system we'd have to impose in addition the following widening restrictions:

This is quite drastic. It means for instance that we will no longer be able to apply a generic function like identity to values of coherent types. The restrictions do correspond quite closely to the situation in Haskell, though, where type classes are not types, and type class dictionaries are not first class values.

One possible solution could be to make failure of the definition and parametricity checks errors, whereas widening violations would just be warnings.

djspiewak commented 7 years ago

I have tried to come up with a semi-formal requirement what coherence means, which would be checkable by a compiler. Here's my current strawman attempt to check coherence of definitions.

Digesting your pseudocode. Will have more thoughts later.

Note: When I say "class" in the above I always mean "class or trait". It would be good to have a common name that would cover both.

In practice, I always consider the word "class" to refer to "class or trait", especially where typeclasses are concerned, since there is no practical functionality enabled by class which is not enabled by trait in that context. Especially in Dotty.

To obtain this stronger property, we'd need to enforce additional conditions which are essentially related to parametricity. Namely, the following operations are prohibited for implicit values of coherent type: [snip]

YES! I had forgotten about the parametric effects of allowing coherent typeclasses to be treated as values, but you're absolutely right this causes problems. I have a somewhat drastic proposal which may resolve this issue: Coherent <: Any is false. In other words, Coherent would represent a hierarchy of values which is incompatible with normal values, and would not implement the same operations or be usable in normal contexts.

At first glance, this seems terrifyingly weird. After all, it would introduce a new top type, effectively a "parallel universe of values". But I think in a lot of ways, it actually resolves a lot of the concerns quite nicely, without requiring complicated one-off runtime checks. It would obviously be something that people could circumvent using reflection, bytecode manipulation, and other terrible things, but I'm rather uninterested in trying to make those things safe when people choose to do them.

smarter commented 7 years ago

After all, it would introduce a new top type, effectively a "parallel universe of values"

Note that this is also something being explored in a completely different context for phantom classes: https://github.com/lampepfl/dotty/pull/1408 Though here, rather than a parallel universe, couldn't you define Any <: ParametricAny (or some less ugly naming scheme) where no methods are defined on ParametricAny? Surely this kind of thing must have been proposed and debated before.

djspiewak commented 7 years ago

Though here, rather than a parallel universe, couldn't you define Any <: ParametricAny (or some less ugly naming scheme) where no methods are defined on ParametricAny? Surely this kind of thing must have been proposed and debated before.

I definitely prefer that to Coherent being a brand new, parallel Top. But I think my preference for that is mostly subjective, since I don't see a ton of practical value in it.

Blaisorblade commented 7 years ago

Though here, rather than a parallel universe, couldn't you define Any <: ParametricAny (or some less ugly naming scheme) where no methods are defined on ParametricAny? Surely this kind of thing must have been proposed and debated before.

I definitely prefer that to Coherent being a brand new, parallel Top. But I think my preference for that is mostly subjective, since I don't see a ton of practical value in it.

A parallel top would be a big negative. If you have existing parametric code in the wild, such as Scalaz, Cats, and many Typelevel libraries, it would be a pity to have to copy-and-paste it just to use it on ParametricAny. You'd have to duplicate each line.

Regarding ParametricAny (under different guises): Paul Phillips has proposed it, both Typelevel and Scalaz have using it as a "honor convention" (see Scalazzi, also discussed here: https://github.com/typelevel/scala/issues/40), it would make some additional optimizations sound (see https://github.com/typelevel/scala/issues/39).

It also has other benefits for typeclasses, it would enable (in perspective) type classes having working type members (that is, an encoding of Haskell associated types), which in turn requires "applicative functors" (in the sense used in OCaml). In particular, try ensuring that two occurrences of implicitly[FooTypeClass[T]].memberType in the same context are type-compatible, especially when implicit search calls some implicit method implicit def fooTypeClass[T]: FooTypeClass[T]. This requires applicative functors. But you'd also want the two implicit instances to be indistinguishable, even though the two invocations of fooTypeClass will give different typeclass instances.

val a1 : implicitly[FooTypeClass[T]].memberType = ....
val a2 : implicitly[FooTypeClass[T]].memberType = a1

I understand we're not discussing applicative functors. I'm just saying, having ParametricAny would be good also because it enables other potentially sensible extensions.

djspiewak commented 7 years ago

Regarding ParametricAny (under different guises): Paul Phillips has proposed it, both Typelevel and Scalaz have using it as a "honor convention" (see Scalazzi, also discussed here: typelevel/scala#40), it would make some additional optimizations sound (see typelevel/scala#39).

Just to throw some better names on the idea:

Top >: {Any >: {AnyRef >: Null, AnyVal} >: Nothing, Coherent} >: Bottom

Or in ASCII Art form:

       Top
      /   \
     /     \
    /       \
Coherent    Any
   |       /   \
   |      /     \
   |  AnyVal   AnyRef
   |    |        |
   *    *        *
   |    |        |
   |    |       Null
   |     \      /
   |      \    /
   |     Nothing
    \      /
     \    /
      \  /
     Bottom

We might need another type in there along side Coherent, but you get the idea.

djspiewak commented 7 years ago

You need s/>:/<:/g I guess?

I don't think so? AnyRef <: Any <: Top is what I'm proposing, I just wrote it left-to-right.

Blaisorblade commented 7 years ago

@djspiewak I need more coffee and can't even tell — sorry for the noise, I just removed it.

odersky commented 7 years ago

I like the idea of ParametricAny/Top a lot. One tricky issue is how to do this and maintain backwards compatibility. In particular, what should the type parameter [T] expand to? Most logical would be [T <: Top], but we can't simply do that because then all code that pattern matches on values of type T or that calls a method from Any would not longer compile. On the other hand, if we assume [T] expands to [T <: Any] as usual, Coherent values still could not be passed to (say) Predef.identity unless that one was changed to

def identity[T <: Top](x: T): T

So a lot of signatures would have to be uglified to make them more useful.

Maybe a very smart rewrite tool could help us out here. Could we selectively rewrite [T] to [T <: Any] if the class or method which defines the type parameter acts on T in a non-parametric way? The tricky bit here are calls to other methods, which will require a fixed point iteration. But it should be doable in principle. /cc @olafurpg .

odersky commented 7 years ago

Also, if the rewrite tool idea pans out, we should reconsider naming. For familiarity and symmetry with Nothing it would be good if we kept Any as the top type. That means we need a subtype of Any for Any with methods - maybe AnyObj? This would lead to the following diagram

       Any
        |
      AnyObj
     /     \
AnyVal      AnyRef
odersky commented 7 years ago

One other thought on migration. Maybe dotty could automatically cast Any to AnyObj but issue a warning every time it does this. Then everyone could refactor to eliminate the warnings at their own pace.