inko-lang / inko

A language for building concurrent software with confidence
http://inko-lang.org/
Mozilla Public License 2.0

Introduce support for user-defined, stack allocated value types #750

Open yorickpeterse opened 2 months ago

yorickpeterse commented 2 months ago

Description

Instances of classes are heap allocated by default, with the exception of Int, Float, Bool, and Nil. This is what allows Inko's borrowing mechanism to work, and crucially allows moving of values while borrows exist, removing the need for lifetime checking.

The downside is that heap allocating introduces an allocation cost, and extra indirection. In many instances this isn't warranted/desired, especially for small values.

Today one can sort of work around this by using class extern, which defines a class using the C ABI and without an object header. These types are then treated as value types and copied upon being moved or borrowed.

We need to extend this with a class inline declaration that defines a class to be a value type. Such types behave similarly to class extern in that they are copied upon moves and borrows, and don't have object headers. For methods this poses a bit of a challenge: they should take their receiver by reference but be able to return a new copy if necessary. This means that we have to allow passing a ref Whatever/mut Whatever to a Whatever and copy it upon doing so.
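As a rough illustration of the copy-on-move/borrow behaviour described above, here is a minimal sketch in Rust rather than Inko, since class inline doesn't exist yet. The names Point, moved_by, sum and demo are made up for this example. Rust needs an explicit dereference where the proposal would copy implicitly.

```rust
// Hypothetical stand-in for a `class inline` type: plain bits, no object header.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i64,
    y: i64,
}

impl Point {
    // The receiver is taken by reference, yet the method returns a fresh copy.
    fn moved_by(&self, dx: i64, dy: i64) -> Point {
        Point { x: self.x + dx, y: self.y + dy }
    }
}

// Takes a `Point` by value, like a parameter typed `Whatever`.
fn sum(point: Point) -> i64 {
    point.x + point.y
}

fn demo() -> (Point, Point, i64) {
    let a = Point { x: 1, y: 2 };
    let borrow = &a;
    // Passing a borrow where a value is expected: the bits are copied.
    // Rust requires the explicit `*`; the proposal would do this implicitly.
    let total = sum(*borrow);
    let b = borrow.moved_by(10, 20);
    // `a` is still usable, because nothing above consumed it.
    (a, b, total)
}
```

Calling demo() returns the untouched original, the shifted copy, and the sum computed from the copied argument.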

Dynamic dispatch and casting to traits would be disallowed just as with Int and Float.

The extra restriction is that they can't contain non-value types, as that would allow violation of the single ownership rule. This means they can only contain other class inline types, or types such as Int and Float. Channel and String using reference counting means you can't store these in a class inline, because they're not copied by just copying some bits.

A difference with class extern is that a class inline type makes no guarantee about the order or padding of fields, while class extern is meant for cases where you need a guaranteed order (aka be compatible with C code).

To allow use of these types in generics, we'd need to generate a distinct shape for the size and alignment of each type. This means different class inline types with the same layout use the same shape.
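To make the "shape per size and alignment" idea concrete, here is a small Rust sketch (not how the Inko compiler does it). Shape, shape_of and the three example structs are hypothetical. The key only looks at size and alignment, so two unrelated types with the same layout end up with the same shape.

```rust
use std::mem::{align_of, size_of};

// Hypothetical shape key: only size and alignment, not the type's identity.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Shape {
    size: usize,
    align: usize,
}

fn shape_of<T>() -> Shape {
    Shape { size: size_of::<T>(), align: align_of::<T>() }
}

// Two unrelated types with an identical layout.
#[allow(dead_code)]
struct Celsius {
    degrees: f64,
}

#[allow(dead_code)]
struct Meters {
    value: f64,
}

// A type with a different size.
#[allow(dead_code)]
struct Pair {
    a: u64,
    b: u64,
}

fn demo_shapes() -> (Shape, Shape, Shape) {
    (shape_of::<Celsius>(), shape_of::<Meters>(), shape_of::<Pair>())
}
```

Here Celsius and Meters would share one specialisation of generic code, while Pair needs its own.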

A challenge/downside of this setup is that the limitations of class inline mean that in general you can only store Int, Float, Bool and Nil in them. This in turn makes them rather useless.

A refinement would be to still subject these values to single ownership, thereby allowing you to store non-value types in them. This however makes borrowing difficult: we can't copy upon borrowing, we can't move (since that makes borrowing impossible), and the lack of a header means we can't do reference counting. Even if we include the object header, the value residing on the stack means borrows are invalidated when the value is moved. The only two ways to prevent that from happening are:

  1. Check the reference count upon moving, incurring a cost when moving, which I'd prefer not to have
  2. Have some form of compile-time borrow checking, but this likely complicates the type system a lot

In short, this will need some extra thought.

Related work

yorickpeterse commented 2 months ago

An alternative that I'm also strongly considering is to rely entirely on escape analysis similar to e.g. Go, and stack allocate anything we can determine doesn't escape a stack frame. You can then borrow it all you want and it will work.

The downside is that values stored in arrays would still need to be heap allocated, since moving the data in or out of the array could invalidate borrows.

yorickpeterse commented 2 months ago

Also important to consider: if we want to allow value types to be used in generics, and generate a shape over the size/alignment instead of the class, then we have to include an object header such that the generic code can perform dynamic dispatch. If we don't want an object header then we have to generate a shape for the exact class such that we can perform static dispatch in generic code.

yorickpeterse commented 2 months ago

Per https://www.reddit.com/r/swift/comments/m5zhlf/reference_type_inside_a_struct_automatic/, it seems that Swift allows RC types inside structs and increments them upon copying. We could do something similar to allow String and Channel in class inline types, but this doesn't make them that much more useful.
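For reference, the "increment upon copying" behaviour can be sketched in Rust, with Rc standing in for Inko's reference counted String. That is an assumption made purely for illustration, and Label and demo_counts are made-up names. Copying the struct is no longer a plain bit copy, because the derived clone has to bump the count of the field.

```rust
use std::rc::Rc;

// Hypothetical value type holding a reference counted field.
#[derive(Clone)]
struct Label {
    #[allow(dead_code)]
    id: i64,
    text: Rc<str>,
}

// Returns the reference count before, during and after a copy exists.
fn demo_counts() -> (usize, usize, usize) {
    let original = Label { id: 1, text: Rc::from("hello") };
    let before = Rc::strong_count(&original.text);

    // Copying the struct increments the count of the shared field.
    let copy = original.clone();
    let during = Rc::strong_count(&original.text);

    // Dropping the copy decrements it again.
    drop(copy);
    let after = Rc::strong_count(&original.text);

    (before, during, after)
}
```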

yorickpeterse commented 2 months ago

Another note: if we allow custom value types, we need to handle recursive value types and produce a compile-time error (i.e. a Foo storing a Bar which in turn stores Foo, all value types).
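A check like this could be a simple reachability search over the fields of value types. Below is a minimal sketch in Rust, assuming the compiler has a map from each value type to the value types of its fields. Fields, is_recursive and demo_types are hypothetical names, not existing compiler APIs.

```rust
use std::collections::HashMap;

// Maps each (hypothetical) value type name to the types of its fields.
type Fields<'a> = HashMap<&'a str, Vec<&'a str>>;

// Depth-first search that returns true if `name` is reachable from its own
// fields, i.e. the type would have to contain itself.
fn is_recursive(types: &Fields, name: &str) -> bool {
    let mut stack: Vec<&str> = types.get(name).cloned().unwrap_or_default();
    let mut seen: Vec<&str> = Vec::new();

    while let Some(current) = stack.pop() {
        if current == name {
            return true;
        }
        if seen.contains(&current) {
            continue;
        }
        seen.push(current);
        // Types without an entry (e.g. Int) have no fields to follow.
        if let Some(fields) = types.get(current) {
            stack.extend(fields.iter().copied());
        }
    }

    false
}

fn demo_types() -> Fields<'static> {
    let mut types = Fields::new();
    types.insert("Foo", vec!["Bar"]);
    types.insert("Bar", vec!["Foo", "Int"]);
    types.insert("Point", vec!["Int", "Int"]);
    types
}
```

With the types from demo_types(), both Foo and Bar would be rejected, while Point is fine.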

yorickpeterse commented 2 months ago

A use-case I'm currently already seeing for value types is thin wrappers around platform-specific types. For example, file descriptors are an Int32 on Unix, but on Windows a HANDLE is used which IIRC is a pointer. Currently some of our code (e.g. std.fs.file.ReadOnlyFile) uses Int32 directly.

We could do something like this to work around the need for heap allocating:

# For Unix:
class extern File {
  let @fd: Int32
}

# For Windows:
class extern File {
  let @fd: Pointer[UInt8]
}

However, in this case we're abusing C types as value types, rather than using first-class value types.

yorickpeterse commented 2 weeks ago

If/when we add this, I think we might also want to consider replacing class with type, as in type User { ... } instead of class User { ... }. The reason for this is that in most languages that make a distinction between heap and stack types, class is used for heap types and struct for stack types. I however don't like the distinction between class and struct since it's mostly arbitrary. Using type avoids it entirely, and also reinforces the notion that we don't have classes in the traditional sense.

yorickpeterse commented 5 days ago

I'm working on this as part of https://github.com/inko-lang/inko/pull/778