clockworklabs / SpacetimeDB

Multiplayer at the speed of light
https://spacetimedb.com
Other
4.41k stars 110 forks source link

Add `AlgebraicType::Transparent { ty: Box<AlgebraicType>, name: Box<str> }` #1287

Open Centril opened 6 months ago

Centril commented 6 months ago

What

The idea here is that we add a variant to AT (AlgebraicType) that looks like this:

struct TransparentType {
    ty: Box<AlgebraicType>,
    name: Box<str>,
}

pub enum AlgebraicType {
    ...
    Transparent(TransparentType),
}

This is a newtype around the ty, with a name assigned to it.

There won't however be a TransparentValue or such, as the type is representationally transparent, meaning that an AV compatible with ty is also compatible with the Transparent(...) wrapper around ty.

Why

The representational transparency aforementioned is the whole point. Currently, our Identity and Address types are typed in the database as:

AlgebraicType::product([("__identity_bytes", AlgebraicType::bytes())]) // Identity
AlgebraicType::product([("__address_bytes", AlgebraicType::bytes())]) // Address

But this also means that the AV representation is AV::Product(vec![inner_av]) which incurs branching and an unnecessary heap allocation. This also extends to indices, which need to materialize AVs in the general case (although we could do a one-field-product optimization for indices). Currently, this also affects normal scans which also materialize AVs.

With ::Transparent, we can get rid of this wrapping layer and have the AV representation just be inner_av. Coupled with #1097, this makes it very cheap to store an Identity.

Alternatives

Bikeshed

I chose Transparent to mirrior #[repr(transparent)] in Rust.

We might also name this NewType or some such.

kazimuth commented 6 months ago

Makes sense. It would be nice to have all single-field products optimized in this way, but I suspect it might make iteration code fairly hairy.

Another, compiler-y approach would be to have a compiler IR that supports isomorphisms (reversible encoders) iso: LogicalType -> OptimizedType, iso_op: OptimizedType -> LogicalType, iso ▷ iso_op = id, iso_op ▷ iso = id. You would write your code against the naive representation LogicalType, and automatically insert the conversions to/from OptimizedType at the boundaries of the code. Then you add compiler transforms to fuse the user operations on LogicalType with the encoding/decoding operations on OptimizedType. This would allow you to simplify the type system -- the user no longer has to distinguish between transparent and non-transparent tuples at the type level, that's merely an encoding detail. You do need a good compiler though, which is a large amount of work.

cloutiertyler commented 6 months ago

You know I've actually thought about this problem a lot because a single field product type is actually functionally identical to a single field sum type. I think it makes sense to have a special case for it if it improves performance, although maybe the name should be singleton since it's a 1-tuple. Sum and product types are only distinguished by the operators that combine the fields after all:

() # Unit
(foo: Foo) # Singleton
(foo: Foo * bar: Bar) # Product (ordered pair)
(foo: Foo + bar: Bar) # Sum
cloutiertyler commented 6 months ago

I would say however, that this is really not a priority

Centril commented 6 months ago

although maybe the name should be singleton since it's a 1-tuple.

The notion of a "singleton type" is already used for a type inhabited by a single value in Scala, Haskell, and type theory:

The notion of Transparent I'm proposing here is more of a newtype with a "coercion" for values, but Transparent I feel says more about the semantics.

mamcx commented 6 months ago

Looks fine. Another naming candidate: alias

cloutiertyler commented 5 months ago

Does this impact BSATN at all?

Centril commented 5 months ago

It does not affect the BSATN of AlgebraicValue at all, but it does add a non-breaking change to the BSATN of AlgebraicType itself.

gefjon commented 5 months ago

How do you codegen for a transparent type?