HigherOrderCO / Bend

A massively parallel, high-level programming language
https://higherorderco.com
Apache License 2.0
16.72k stars 412 forks source link

Add progressive typing to Bend #615

Open developedby opened 4 days ago

developedby commented 4 days ago

Is your feature request related to a problem? Please describe. Bend already assumes by the semantics of ADTs and matches that users are writing programs within the contraints of some kind of polymorphic lambda calculus type system.

This is not enforced anywhere since bend is currently untyped. This allows very easily writing very clever programs that escape a simple type system, but in most cases just leads to writing incorrect programs. It'd be better if there was some way of enforcing it.

Having it be untyped like it currently is makes Bend basically unusable for anything not super tiny.

Describe the solution you'd like Add optional hindler-milner typing to Bend. Types are created by adt defs.

Needs to have support for axiomatic type statements to allow unscoped variables, superpositions and native hvm definitions to be used with typed code.

Untyped functions have type Any, meaning that they are trusted, like in typescript or mypy.

For Imp syntax, a syntax like typehints in python could work well

def and3(a: Bool, b: Bool, c: Bool) -> Bool:
  x1: Bool = and(a, b)
  return and(x1, c)

For Fun syntax, we can have separate function type declaration like in haskell for example.

# This is hard to parse, but I like how it looks similar to the other one
(and3 (a: Bool) (b: Bool) (c: Bool)) -> Bool 
(and3 a b c) =
  # Parens are not necessary here
  let (x1: Bool) = (and2 a b);
  (and2 x1 c)

# This is easier to parse, but i don't like how it looks
and3: Bool -> Bool -> Bool -> Bool
and3 a b c =
  # We could also have annotation terms instead
  # Parens are also not necessary here either
  let x1 = ((and2 a b) : Bool);
  (and2 x1 c)

We can compile to Kind to typecheck. That would need some trickery to deal with pairs, sups and unscopeds, but should be possible. Functions that have unscopeds need to be untyped or typed with an axiom.

This leaves a couple things still open: What syntax to use for type variables. I like these two options

def List/reverse<t>(list: List(t)) -> List(t):

def List/reverse(~t, list: List(t)) -> List(t):

What syntax to use for unchecked type statements. For example we could use :: instead of -> in Imp syntax for an unchecked type.

# Works, but a bit too subtle
def channel<a, b>() :: (a -> b, b -> a):
  return (lambda $a: $b, lambda $b: $a)

# Using '->' for type and a prefix marker for unchecked type
def !channel<a, b> -> (a -> b, b -> a):
def unchecked channel<a, b> -> (a -> b, b -> a):
...

# An attribute?
@unchecked
def channel<a, b>() -> (a -> b, b -> a):

# something else?

I can't think of a use for it in expressions (except as from the desugaring of functions) but we could use :: instead of : there as well if this is wanted. Initially I think we shouldn't because it incentivizes users destroying types when a type error appears.

What syntax to use for foralls. Functions with parametric types compile to lambdas with forall for types

def id<t>(x: t) -> t:
  return x

# becomes
id = (@x x : forall t t -> t)

Describe alternatives you've considered Leaving typing to another language. That would go against our current direction of making Bend the language people use.

Using a dependent type system. While personally I'd love to see that, if values cross into the types then our type checker necessarily needs to know how to evaluate all the hvm quirks. Basically it would restrict the type checker to HVM itself or an implementation of HVM in the type checker. That would be annoying and could potentially be catastrophic if the type checker has the same limitations as HVM.

Using another kind of polymorphic lambda calculus, like system F or system F_omega. I think for users of imperative languages hindley milner is easier to understand and write.

edusporto commented 4 days ago

For Fun syntax, we can have separate function type declaration like in haskell for example.

Kind2 used to have function type declarations within their definitions, so something like this:

(and3 (a: Bool) (b: Bool) (c: Bool)): Bool =
  ...

Considering gradual typing, I think this also works better than separating it since we could give types only to the variables we want:

(and3 (a: Bool) b c): Bool =
  ...

What syntax to use for type variables.

Regarding generics, Kind2 does it with the diamond operator following the name of the function being defined. For example:

concat <T> (xs: (List T)) (ys: (List T)) : (List T) =
  match xs {
    ++: (cons xs.head (concat xs.tail ys))
    []: ys
  }

It's also similar to other existing languages, so I think it's nice.

With Hindley-Milner, it would be possible to automatically "float" any undefined names to type quantifiers, but I personally think being explicit is better.

I agree with what Kate said on the Discord channel - passing ~t to a function as a normal parameter makes it look like types live in the same domain as values and would be too much like dependently-typed languages, so I prefer the other option

What syntax to use for foralls

I'll look into this more later, but if we use a HM style type checker, it could be good to do something similar to the existing literature.

developedby commented 3 days ago

Kind2 used to have function type declarations within their definitions, so something like this:

That doesn't allow for pattern matching. If we want to allow it, then it was

Foo (a: T1) (b: T2) : T3
Foo (CtrA1) b = ...
Foo (CtrA2) b = ...

It's a good option

developedby commented 3 days ago

For native hvm definitions we can give them a type with the imp syntax

hvm to_u24 -> forall t. t -> u24:
  ...
developedby commented 3 days ago

@edusporto @imaqtkatt @LunaAmora I'm also considering using :: instead of : for imp functions.

def fun <a, b> (x: b, y: b) :: a -> b:
  ...

This deviates from the python syntax but i think it'll get too hard to read functions that return functions

def fun <a, b> (x: b, y: b) -> a -> b:

# must add parens to be readable
def fun <a, b> (x: b, y: b) -> (a -> b) :
developedby commented 2 days ago

We could also use haskell style for fun syntax, with implicit type variables

fun :: b -> b -> (a -> b)
fun x y = ...
developedby commented 2 days ago

Ok here's what I settled on.

First version has types only on functions. Possibly will need to add annotation expressions later to mix typed and untyped code.

For imp syntax

# <var1, ..., varn> for type vars
# args can be given a type with `: Type`
# Type constructors are like normal function applications
# Since types and values never cross, `object` can share its type name with the constructor name.
# Can declare type variables in a type expression with `forall a, b: ...` or `∀a, b:`
# Eg. `def const(val: u24) -> forall t: t -> u24:`. Not sure of the usefulness but seems like we should have the option.
# Arrow types are infix `ArgType -> RetType`.
# Tuple types are written like tuples `(a, b)`
# Superpositions are not in the type system.
# Affinity is not in the type system.
# `Any` type will typecheck with anything. Default type of unchecked things
# Other builtin types are `u24`, `f24`, `i24` and `None` (the type of `*` as a value, a unit type).
def foo <a, b> (x: Ctr(a), y: a -> b, z, w: Any) -> OtherCtr -> b:
  ...
# Can mark a function as having a trusted unchecked type with `unchecked def`.
# This is needed for functions with superposition, unscopeds, other non-lambda things.
unchecked def channel<a, b>() -> (a -> b, b -> a):
  return (lambda $a: $b, lambda $b: $a)
# Untyped arguments and returns are given type `Any`
def foo(a, b):
  ...
# same as
def foo(a: Any, b: Any) -> Any:

Fun syntax

# Functions can be preceeded by a type signature
foo <a> <b> (x: a) (y: b) : a -> b
foo x y = ...

# Signatures can have parens like pattern matching rules
(reverse <t> (xs: (List t)) (acc: (List t))) : (List t)
(reverse (Cons x xs) acc) = (reverse xs (Cons x acc))
(reverse  Nil        acc) = acc

# If only one non-pattern matching rule, we can write the equation and the signature together like in Kind
(reverse <t> (xs: (List t)) : (List t) = (reverse.go xs [])
# Type ctrs use parens around like function calls
# forall is written `forall var body` or `∀var body` like lambdas are. No patterns allowed.
# Arrow type is still infix
# Everything else same as above
fold <a> <b> (xs: (List a)) (cons: a -> b -> b) (nil: b) : b

# Unchecked functions start their signature with `unchecked`
unchecked channel <a> <b> : (a -> b, b -> a) = (@$a b, @$b a)

Hvm definitions are always unchecked, but accept a type

# Uses imp syntax
hvm to_u24 -> forall a: (a -> u24):
  ...

I'm just not sure whether untyped functions should be type checked by default or not. Some possible situations

def foo(a, b):
  # `bar` is well typed and returns type `Type1` 
  x = bar(a, b)
  # Ok either way
  match x:
    case Type1/Ctr:
      ...
  # If `foo` is completely untyped should this be an error or not
  match x:
    case Type2/Ctr:
      ...

def foo2(a: Any, b: Any) -> Any :
  # Now that our function is explicitly typed with any type, should the behaviour change?
  match bar(a, b)
    case Type2/CtR:
      ...
developedby commented 2 days ago

For type definitions, I'll go with

type List <t>:
  Cons { head: t, ~tail: List(t) }
  Nil

type List <t> =
  (Cons (head: t) (~tail: (List t))) |
  Nil

object Pair <a, b> { fst: a, snd: b }

These are always checked, and generate the constructors

List/Cons <t> (head: t) (tail: (List t)) : (List t)
List/Nil <t> : (List t)
Pair <a> <b> (fst: a) (snd: b) : (Pair a b)
developedby commented 2 days ago

With implicit type variable in functions (eg. like haskell)

def foo (x: Ctr(a), y: a -> b, z, w: Any) -> OtherCtr -> b:

foo (x: a) (y: b) : a -> b
foo x y = ...

hvm to_u24 -> (a -> u24):

type List(t):
  Cons { head: t, ~tail: List(t) }
  Nil

# parens optional
type (List t) =
  (Cons (head: t) (~tail: (List t))) |
  Nil

object Pair(a, b) { fst: a, snd: b }