Whiley / RFCs

Request for Comment (RFC) proposals for substantial changes to the Whiley language.
3 stars 2 forks source link

Understanding Strict Subtyping #69

Open DavePearce opened 4 years ago

DavePearce commented 4 years ago

(see also https://github.com/Whiley/WhileyCompiler/issues/1003)

There are some perhaps unexpected issues arising from the current definition of strict subtyping:

Types vs Aliases

Currently, a somewhat arbitrary distinction is made between aliases and types. The former are types without invariants. The latter retain nominal information. This brings out some strange effects. For example, this doesn't compile:

type metres is (int x)
type centimetres is (int x)
type Length is metres|centimetres

function toLength(metres m) -> Length:
    return m

The key problem here is that both metres and centimetres are treated as aliases and, hence, are indistinguishable from their underlying type. This means the above program is interpreted as:

function toLength(int m) -> int|int:
    return m

We can actually fix this at the moment by redefining the two types as:

type metres is (int x) where true
type centimetres is (int x) where true

NOTE: Recursive types also have some problems here since there is no true "underlying" representation of them at this time.

Subtyping Depth

There are currently some clear challenges around how strict subtyping works. This is the most canonical example:

type nat is (int x) where x >= 0
type pos is (nat x) where x > 0

function f(pos p) -> nat|pos:
   return p

This fails because p can flow into both nat and pos --- meaning we have an ambiguous coercion. In fact, we can simplify this further:

function f(pos p) -> int|pos:
   return p

Again, this seems a bit dumb.

Runtime Type Tests

These are not currently working very well. For example, the following fails:

function f(nat|pos p) -> bool:
    if p is pos:
        return true
    else:
        return false

This seems a bit ridiculous!

DavePearce commented 4 years ago

A Coherent Model?

A key question is whether we can determine a coherent model to use as a foundation

The obvious starting points for this model are:

  1. (No Aliases). We drop entirely the concept of an alias. Then, for any nominal type N we have N <: N and N <: U where U is an underlying type of N (e.g. N is (U x) where ...). This resolves entirely the first two issues above. The golden rule is that we never expand a nominal on the left-hand side.
  2. (Simple Type Tests). Every type test must be on an expression of union type and the selector must match exactly one of the cases. This resolves the third point above.
  3. (Structural Casts). To enable flexibility in movement between types structural cases are still permitted. For example, casting nat into pos, etc.

The following programs are clearly broken by this:

type List_1 is null|{int data, List_1 next}
type List_2 is null|{int data, List_2 next}

method main(List_1 l1):
    List_2 l2 = l1