parapluu / encore

The Encore compiler.
BSD 3-Clause "New" or "Revised" License
43 stars 26 forks source link

RFC: Kappa and classes #662

Closed EliasC closed 7 years ago

EliasC commented 7 years ago

The Problem

Kappa guarantees data-race freedom. A type that can be aliased, and through which no access may cause a data-race is known as a safe type. Currently primitives, active objects, read capabilities, and e.g. tuples storing only such types are considered safe. Notably, a passive class type is never considered safe, which turns out to be a problem.

For example, in Kappa, read-only capabilities require that all accessible fields of the underlying object are immutable val-fields storing types that are safe. I ran into the following problem when writing something like this:

read trait T
  require val f : String

In the current implementation, this trait is not well-formed, as String is not a safe type, even though I know that Strings are in fact immutable. The quick and dirty solution is of course to add String to the list of safe types, but this would prevent users from adding immutable classes of their own (e.g. @kaeluka's Either class), so we should try to find a general solution.

Why is Kappa sound?

In the formalized version of Kappa, classes are nothing more than an alias for a capability. In particular, classes may not extend their own interface. This means that the String class could be implemented as

passive class String : read StringT
  data : embed char* end

where StringT contains all the methods available for a String.

This solution actually works in the current implementation, but it is annoying as you need to write new String : StringT, and use StringT everywhere you wanted a string (alternatively you could call the trait String and the class StringC). What is worse, having aliases of the class type would still allow you to mutate data without synchronization, meaning that data-races are possible (in Kappa, this is not an issue as fields may only be accessed via this).

Our solution needs to satisfy two properties:

  1. It needs a way to mark class types as safe.
  2. It needs to ensure that class types cannot be used to circumvent the type system and cause data-races, for example by allowing concurrent mutation of a field required by a read trait.

Exploration: Modes for Classes

The reason why trait types are safe while classes are not is because traits always have a mode that ensures their safety (ignoring unsafe for now). An obvious extension to this would be to let classes have modes as well:

read class String
  val data : embed char* end
  ...

This could be thought of as syntactic sugar for the trait schema outlined in Solution 0, with the additional restrictions that the modes bring (e.g. a read class can only have safe val-fields), and without the need for explicit casts between different types. It also allows simple refactoring when transforming a class into a trait:

read trait Stringy
  require val data : embed char* end
  ...

A nice thing about this solution is that it generalizes the passive/active distinction (making the passive keyword obsolete):

For any mode, it is safe to extend the interface of the class, as well as to allow external field accesses (except for active).

An unmoded class could either default to some mode (e.g. active or local), or (more satisfyingly) have its mode entirely defined by its included modes. In the latter case, care must be taken so that the class type cannot be used to cause data-races through aliases (as discussed in the the previous section).

One solution for this is to disallow unmoded classes to extend their interface freely, and to disallow external field accesses through this type (with the motivation that there is no safety mechanism specified for the class). This means that unmoded classes follow the restrictions from Kappa, except that methods from included traits may still be safely overrided. Also, new fields and methods can still be added, as long as each attribute is assigned to one of the included traits:

linear trait T
  def foo() : void
    ...

-- OK, T is extended with bar
class C : T(bar())
  def bar() : void
    ...

-- OK, bar is protected by the linearity of the class
linear class C' : T
  def bar() : void
    ...

-- Not OK, bar is not guarded by a mode
class C'' : T
  def bar() : void
    ...

Including traits

An open question is what kind of restrictions a moded class places on its included traits. Intuitively it seems reasonable that a class with mode m is required to have only traits with mode m. It would also mean that included unmoded traits are automatically given mode m.

Another option is to see m as the mode protecting the methods provided by the class, allowing traits of any mode. In this case, only a safely moded class whose included traits are all safe is considered safe. However, there are several problems with this approach:

Having special cases as the ones above makes refactoring between different usage scenarios more difficult.

Proposed solution

After exploring the design space in the previous sections, I have the following concrete suggestion:

The following changes are made to the current Encore semantics:

Additionally the following restrictions apply (analog to Kappa):

An issue that still needs solving is that of parametric polymorphism, but I leave this for a later discussion.

EliasC commented 7 years ago

In Amsterdam, me and @kikofernandez discussed a solution to this problem that consisted of marking classes as sealed. A sealed class can not extend its interface and its fields may not be read from the outside, meaning its interface is entirely defined by its included traits. This means that if the included traits are safe, so is the entire class. This influenced the design above, as unmoded classes are implicitly sealed: they may not extend their interface (modulo overriding) and may not be accessed directly from outside. The reason the keyword was dropped from the discussion is because the suggested design of moded classes also solves the problems for non-sealed classes (that now must be moded).

EliasC commented 7 years ago

With this design, sketching a class like the following would not work:

class Counter
  value : int
  def inc() : void
    this.value = this.value + 1;

Without a mode or an included trait, a class is not allowed to define fields or methods. I see two options here:

The reason I don't think we should give unmoded classes with included traits an implicit mode is that this would break the restriction that a class with mode m only includes traits of mode m. We could try to give the class an implicit mode based on the mode of the included trait, but this seems like this would be fragile and likely to break code when the included traits are changed.

kikofernandez commented 7 years ago

Hi,

I think that we should get rid of classes and introduce your moded classes (classes with a prefixed mode). If I understand correctly, your example Counter with linear mode:

linear Counter
    value: int
    def inc(): void
        this.value = this.value + 1

could be desugared into:

-- desugared code
linear trait TCounter {
    require value: int

    def inc(): void
        this.value = this.value + 1
}

Counter : linear TCounter

To me, this (linear Counter) is nicer than having to write the desugared code. In case of having multiple traits, the desugaring doesn't change. It's a just shortcut! Of course, this is how a developer can think of it, I am sure that what I call desugaring is not as easy and won't be implemented as such.

With the removal of classes and introduction of moded classes, we remove the need for a distinction between classes with modes, classes without modes, classes with modes but no traits and classes without modes but with traits. I believe it's a cleaner design.

Regarding the restrictions:

Additionally the following restrictions apply (analog to Kappa):

A read class may only have val-fields of safe type. The constructor is the only method that can write these fields.

I am not sure how performance wise it would behave, but this is as much as a restriction as is immutability to functional programming. Some people would call it, a feature!

EliasC commented 7 years ago

With the removal of classes and introduction of moded classes, we remove the need for a distinction between classes with modes, classes without modes, classes with modes but no traits and classes without modes but with traits. I believe it's a cleaner design.

This assumes that all included traits will always have the same mode, which is not the case in idiomatic Kappa:

read trait Get
  require val cnt : int
  def get() : int
    this.cnt

trait Inc
  require cnt : int
  def get() : void
    this.cnt = this.cnt + 1

class ThreadLocalCounter : read Get + local Inc
  cnt : int

class LinearCounter : read Get + linear Inc
  cnt : int

Therefore I think that we still need classes. The moded classes could be seen as syntactic sugar similar to your example, but with implicit getters and setters for the fields (which may be accessed since there is a mode).