ProvableHQ / leo

🦁 The Leo Programming Language. A Programming Language for Formally Verified, Zero-Knowledge Applications
https://leo-lang.org/
GNU General Public License v3.0
4.79k stars 643 forks source link

[Proposal] Sum types in Leo #1419

Open acoglio opened 2 years ago

acoglio commented 2 years ago

💥 Proposal

Leo has circuit types (or struct types, according to the proposal in https://github.com/AleoHQ/leo/issues/1418), which capture the set-theoretic notion of (cartesian) product -- indexed by names, vs. tuples which are indexed by numbers.

Languages normally also have types that capture the set-theoretic notion of sum (i.e. union or disjoint union), to put together values that share certain characteristics but may have different structure. A typical example is a type for geometric shapes that is the sum of types for circles, squares, triangles, etc.

Languages like Rust use enum types for this purpose. Languages like Java use inheritance for this purpose. Since Leo's type system is more similar to Rust than Java, and Leo does not seem to be on an object-oriented trajectory, it may make sense to extend Leo with enum types similar to Rust.

This may involve adding some form of pattern matching, or in any case constructs to build, query, and access enum values.

We will probably want enum types to have member functions (i.e. methods). See the proposal in https://github.com/AleoHQ/leo/issues/1417.

gluax commented 2 years ago

As an alternative, I'd also propose type unions. i.e.

type Option<T> = Some(T) | None;
egregius313 commented 2 years ago

The two main syntaxes for discriminated unions I commoinly see are:

  1. Similar to Rust enum

The languages which use this that I can think of are Rust and Swift. Kotlin's sealed class is somewhat similar.

  1. The tagged union style @gluax is describing

Which from what I can find is the more common syntax. Variations of it are used in OCaml, Haskell (data types), and many functional programming languages.

Notably, some languages unify this syntax with their aliasing syntax (eg. type numbers = int array is valid OCaml), though Haskell and its derivatives commonly separate these functions, with type and newtype being used for aliasing and wrapping, data for tagged/discriminated unions.

One notable use of this syntax which behaves differently though is TypeScript, in which sum/union types are not tagged unions, but instead structural types and aliases. (TypeScript also has enums, but those are more akin to C-style enums where each member has a set value).

type event = {name: "start", when: Date} | { name: "end", after: number }
type pet = cat | dog | fish

From what I can tell, it seems like the enum style seems more common in languages with objects/methods (even though Rust doesn't have objects per-se, it does commonly use functions/methods which are dispatched on the first argument which acts similar to a "receiver"), and type ... = ... | ... seems more common in functional languages which seldom support overloading without some other construct (eg type-classes or traits).

So when it comes to syntax, I would say the question is which style of sum type we want to follow?

Centril commented 2 years ago

As an alternative, I'd also propose type unions. i.e.

type Option<T> = Some(T) | None;

N.B. A type union would be e.g., u8 | String which is equivalent to u8 | u8 | String, so they don't allow you to express e.g., Option<Option<T>>.

Rust could have used data instead of enum but chose the latter for the sake of familiarity. I would suggest that we simply rip of Rust's syntax since we're going in that direction anyways. Atop of that, if we want structural records and structural sumtypes, we could have { r: u8, g: u8, b: u8 } and enum { A, B, C(u8) }. These do seem like useful things to add, but first I'd start with nominal types and add pattern matching.