savi-lang / savi

A fast language for programmers who are passionate about their craft.
BSD 3-Clause "New" or "Revised" License
154 stars 12 forks source link

Efficient (unboxed) tagged unions #388

Open mneumann opened 1 year ago

mneumann commented 1 year ago

Would be nice to write code like this:


:fun some_values
  :yields (U64 | None | F64)
  yield U64[1]
  yield None
  yield F64[1.3]

:fun caller
  @some_values -> (|value|
    case value <: (
    | U64 |  ...
    | None | ...
    | F64 | ...
    )
  )
jemc commented 1 year ago

I think something like this would need to ride on the back of the "fat pointers"-related changes that need to happen, as both relate to keeping type information as part of the use site of the runtime value (rather than in pointed-to memory).

In other words, I think what you're describing is mostly just a "fat value" (no pointer).

mneumann commented 1 year ago

Not sure if this is related to fat-pointers...This could also be just another built-in type :union that behaves like a Rust :enum. A Savi union type (MsgPack.Nil | MsgPack.Bool | ...) would be anonymous, so one would have to wrap it in a struct to make it a newtype.

A valid use-case is MessagePack.Token (https://github.com/mneumann/savi-MessagePack/blob/main/src/MessagePack.Token.savi) which tries to simulate a tagged union but requires lots of boiler-plate code in order to do so. In Rust the same would look like:

enum MessagePackToken {
  Nil,
  Bool(bool),
  UInt(u64),
  Int(i64),
  Float(f64),
  BinaryAhead {len: u32},
  StringAhead {len: u32},
  ArrayAhead {nelems: u32},
  MapAhead {nitems: u32},
  ExtAhead {type: u8, len: u32}
}

And the sizeof MessagePackToken would be the size of the tag + sizeof the largest "case". This, combined with structural pattern matching is very powerful, but is less OOP.

jemc commented 1 year ago

The approach I'm thinking of that piggybacks on "fat pointers", could be thought of as "fat values", but the result is the same as what you wrote here:

And the [size] would be the size of the tag + sizeof the largest "case"

The difference is that with the approach I'm suggesting we should be able to use anonymous type unions (like U64 | I64 | F64 | Pair(U64, U64) in your initial post) and have those get quietly converted into a tagged union / fat value under the hood, without boxing.

The advantage is that you need not declare all the cases ahead of time in a :union declaration - instead you can create arbitrary ad hoc unions using | that contain whatever types you need for your particular purpose.

mneumann commented 1 year ago

That sounds good! So If I want:

enum Expr {
  Add(u32, u32),
  Sub(u32, u32)
}

We would just write in Savi:

:struct Add
  :let a U32
  :let b U32
  :new (@a, @b)
// maybe one day this can be shortened to just:
:data Add(a U32, b U32)

:struct Sub
  :let a U32
  :let b U32
  :new (@a, @b)

:alias Expr = Add | Sub

And If I would want a newtype, I'd write instead of the alias:

:struct Expr
  :let inner (Add | Sub)
  :new(@inner)
jemc commented 1 year ago

Yes, that's the idea.