cda-group / arc

Programming Language for Continuous Deep Analytics
https://cda-group.github.io/arc/
44 stars 6 forks source link

RFC: Improved type system #311

Closed segeljakt closed 1 year ago

segeljakt commented 3 years ago

The elephant in the room when writing Arc-Scripts is the weak type system. Currently, the type system supports:

Problem

The type system is however missing backwards type inference and polymorphism, which makes it very restrictive. Without backwards type inference, there are cases where a type of a value must be inferred before it can be used. Without polymorphism every term has exactly one type. We are in particular missing:

User-defined nominal type constructors

Without user-defined nominal type constructors, we cannot for example define a custom Option<T> type. The only solution right now is to create a new Option type for every argument, e.g., Option_i32 instead of Option<i32>. This problem should however be fairly easy to solve. We just allow users to add type parameters to declarations and instantiate those type parameters with fresh type variables whenever the declarations are used.

Row-polymorphism

Without row-polymorphism, the following problem happens:

fun foo(x: {a: i32, b: i32}): i32 { x.a + x.b }

foo({a: 5, b: 5}); # OK
foo({a: 5, b: 5, c: 5}); # ERROR, expected {a: i32, b: i32}, found {a: i32, b: i32, c: i32}

To support row polymorphism it might be enough to change the unification algorithm. When unifying {a: 5, b: 5, c: 5} with {a: i32, b: i32}, we see that the former is a subtype of the latter, and the result is {a: 5, b: 5, c: 5}.

Overloading

Without overloading, functions cannot be overloaded to have different behaviour for different types. This for example means addition and subtraction can only work on one numeric type. Therefore the following is not possible:

val x: i32 = 3;
val y: i32 = x + x; // OK

val x: f32 = 3.0;
val y: f32 = x + x; // ERROR

Semantics need to be "syntax directed". For example, OCaml distinguishes integer addition + from floating point addition +.. Since our numbers can have many different variations (signed/unsigned, floating point/integers, and varying number of bits), syntax directed solutions are not going to work.

I think the solution is that we need to be able to constrain types to their capabilities. For instance, two values can be added together if their types are equal and also numeric. Constraints can be inferred from how the type is used. The question is whether we should allow user-defined constraints or stick to builtin constraints. Languages like Haskell/Rust/Scala allow user-defined type classes while domain-specific languages have builtin constraints.

Coercion

Not having coercion means explicit casts will be needed for type conversions. It is not possible to do the following:

val x: i32 = 5;
val y: i64 = 5;
val z = x + y;