Tagged Values

robinheghan commented 1 year ago

Before Gren 0.1 was released, tuples were removed from the language. This decision was based on "tuple fatigue" that I'd built up over time in both hobby- and work-related Elm projects. Tuples are not necessarily a bad language feature (I think they're perfect for the purpose of pattern matching over multiple values, or even returning multiple values from a function), but since they are so easy to create they tend to end up all over the place. Since tuples are, by nature, brittle data structures they in turn tend to make your program more brittle than necessary, which again complicates refactoring and prototyping.

Ultimately, while convenient, I considered tuples a feature that overall provide negative value to the language.

The reason I bring this up is because the same criticisms can be applied to custom types.

While custom types core feature, being able to represent a choice or a "one of these" situation, they have several shortcommings:

Custom types cannot be inferred. They must be declared prior to use. The compiler cannot parse a case-of expression and infer the type involved.
Data associated to a particular constructor has positional semantics, which is brittle in the same way that tuples are brittle. That is, adding or removing a value from a constructor breaks your program in every pattern match where said constructor is involved, even if the particular value is of little interest.
Custom types cannot be extended, like records can.

In addition, custom types is being used in order to define opaque types, which is a powerful feature that should be able to stand on its own.

Instead, I believe custom types should be replaced by what I'm going to call Tagged Values.

Tags are simple labels. At a glance, they look to be constructors, but they are structural as opposed to nominal.

It might be easier with an example:

type alias Maybe a = <Just a | Nothing>

This is the tagged value representation of the Maybe type. Note that this is a type alias. Tagged values are structural, not nominal.

This means that the following functions has the same type:

map1 : Maybe a -> (a -> b) -> Maybe b

map2 : <Just a | Nothing> -> (a -> b) -> <Just b | Nothing>

Tags can have exactly one associated value. No more, no less. Since it's sometimes useful to represent a tag without an associated value (enums, or the Nothing tag in the examples above) a value-less tag is syntax sugar for a tag associated with the empty record.

This is beneficial from a performance perspective, as it means that all tags can be represented using the same JavaScript type, meaning tags are monomorphic from the JS JIT's point of view. This is also beneficial from a maintanance standpoint, as you have to use records to associate multiple values to a tag, which allows you to add or (potentially) remove values without breaking the program.

Tags are also extensible. The following syntax is allowed:

type alias Tipe a = < a | Ctor1 | Ctor2 >

Tipe in the above example, can represent any union of tagged values, as long as they have a Ctor1 and Ctor2 tag. Such types can be represented in case-of expressions as long as there is a catch-all pattern.

Opaque types

In order to support opaque types, this issue would allow type aliases to be opaque.

module Opaque exposing
    ( Opaque
    , NonOpaque (..)
    )

type alias Opaque = Int

type alias NonOpaque = Int

As an extra benefit, there is a tiny bit less syntax for dealing with opaque types within a module, as you don't need to wrap them in a single-constructor custom type.

New syntax for defining aliases

With custom types removed, there is no need to differantiate between type and type alias.

So defining an alias can be shortened to

alias Opaque = Int

Or, as Roc does it:

Opaque : Int

avh4 commented 1 year ago

To comment on one small aspect of this: currently when using type aliases to records, compiler error messages can easily get unreadable due to the fact that information about the alias name is currently discarded and then not used in the error messages.

Even if that information were tracked internally, there might still might be certain contextx where there might not be an obvious way to determine how to display a certain type (for instance, if there is a type mismatch where one side of the mismatch is an alias, and the other side is a literal structural type).

If some version of this proposal is implemented, imo it's likely that this problem with overly verbose error messages probably will become even more prominent and might need to be addressed in some way.

dbj commented 1 year ago

In Elm, one of the advantages of a single-constructor custom type or a structural type like records is that they can be deconstructed in the parameter list of a function - a feature I use a lot - and one that provides clarity and reduces the verbosity of the code.

Are you planing on allowing parameter deconstruction of Tagged Values, and if so, what would that look like?

robinheghan commented 1 year ago

@dbj parameter deconstruction would pretty much work like today

foo (Ctor { one, two, three }) =
   ...

dbj commented 1 year ago

I was more thinking along the lines that since Tagged Values are structural, could they be treated like records? For example:

type alias Notes = {
    doe : Int
    , rae : Int
}

sing ({ doe, rae } as notes) = …

versus, say

type alias Notes = {
    doe : Int
    , rae : Int
}

type alias Staff = <TrebleClef Notes | BassClef Notes>

sing (<TrebleClef notes | BassClef notes> as staff) = …

robinheghan commented 1 year ago

Ahh.

There's nothing inherently wrong with your example (it would be safe even with custom types). So it can be done. I'm not sure it's a good idea, though 🤔

boxed commented 1 year ago

I'm wondering about the name. I was very confused while reading this. A "tagged value" here is the same as union type? Except as you say it's structurally matched, not nominally.

I agree a million percent with moving to named parameters of some kind from positional. Positional things are evil :P

As for the compiler error messages, I can also see the risk. But I would guess that the vast majority of the time there is exactly one matching alias name available in the compiler. So the compiler could do that lookup in a simple dict. In the cases where there are multiple, it could list all the matching names maybe 🤷

robinheghan commented 1 year ago

I'm wondering about the name. I was very confused while reading this. A "tagged value" here is the same as union type? Except as you say it's structurally matched, not nominally.

The tagged value simply refers to one of the options in the union, really. However, you cannot create unions of other things, so maybe the name needs work.

What I liked about the name is that it makes it clear that you're tagging values, not that the tags are data structures in and of themselves.

As for the compiler error messages...

I agree with @avh4 that this proposal might make error messages worse without some extra effort. I do think the extra effort is worth it, though.

gampleman commented 1 year ago

When you mean structurally typed, does that mean that <User { username : String } | Admin { username : String}> is redundant as both branches have the same structure? Or is the position in the list significant (in which case how do you unify <User { username : String } | Admin { username : String}> with <Admin { username : String} | User { username : String }>?

Or do you just mean that as long as the constructor name is the same and the data argument is the same type (and in that case how does that interact with extensible records?), it doesn't matter where the type came from?

robinheghan commented 1 year ago

Or do you just mean that as long as the constructor name is the same and the data argument is the same type (and in that case how does that interact with extensible records?), it doesn't matter where the type came from?

This.

<User { username : String } | Admin { username : String }> is the same as <Admin { username : String } | User { username : String }>. It doesn't matter where the type comes from, as long as the tags are the same (regardless of order) and that the type associated with each tag is the same.

and in that case how does that interact with extensible records?

You need to be more specific here. I'm not sure what you mean.

gampleman commented 1 year ago

Does <User {a | username : String }> unify with <User { username : String, password: BCrypt }>? I assume yes?

Also I guess some prior art: https://shuangrimu.com/posts/elm-extensible-unions.html

robinheghan commented 1 year ago

Without any further constraints, then yes.

robinheghan commented 1 year ago

Also I guess some prior art: https://shuangrimu.com/posts/elm-extensible-unions.html

Yes. This served as part of the inspiration =)

gren-lang / compiler

Tagged Values #218

Tagged Values

Opaque types

New syntax for defining aliases