MathiasPius / erm

MIT License

Feature: Enums #2

Open MathiasPius opened 5 days ago

MathiasPius commented 5 days ago

Why

Rust's enums are fantastic to work with, especially for expressing state machines.

What

Implementing Component for such a type is impossible in most cases, because each branch has a different data model. Being able to derive Archetype directly on a structured enum would be an amazing feature to have.

Figuring out the best way of managing serialization and deserialization in a safe way is key.

How

There are a couple of different approaches:

Untagged

Rely on distinct branches, and reconstruct the enum from a series of "optional" sub-queries. This is similar to serde's untagged enum method, where you just attempt to deserialize each branch one after the other until one is successful. The disadvantage is that tag information is lost, so branches with identical contents cannot be told apart, which results in faulty deserialization:

#[derive(Archetype)]
enum MyState {
  Off(User),
  On(User)
}

Since we store neither Off nor On, deserializing this enum will always yield Off, because Off is the first branch tried and it matches any stored User.
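To make the failure mode concrete, here is a minimal sketch of the untagged strategy in plain Rust. It does not use erm or sqlx at all: `deserialize_untagged` is a hypothetical stand-in for "try each branch in declaration order", and the `User` fields are made up for illustration.

```rust
// Hypothetical stand-ins; none of these definitions come from erm itself.
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
}

#[derive(Debug, PartialEq)]
enum MyState {
    Off(User),
    On(User),
}

/// Simulates an untagged read: try each branch in declaration order and
/// return the first one whose contents can be loaded.
/// `user` is `Some` when a `User` component is stored for the entity.
fn deserialize_untagged(user: Option<User>) -> Option<MyState> {
    // The `Off` attempt only needs a `User`, so it succeeds whenever one exists.
    if let Some(user) = user.clone() {
        return Some(MyState::Off(user));
    }
    // The `On` attempt needs exactly the same data, so it can only be reached
    // when the `Off` attempt failed, in which case it fails too.
    if let Some(user) = user {
        return Some(MyState::On(user));
    }
    None
}
```

Whatever branch was originally stored, only the `User` survives the round trip, so the result is `Off` every time.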

Tagged

When deriving Archetype for an enum, simultaneously derive Component for it as well, and store the branch arm in a distinct table.

With the above example, we would then additionally get a component table MyState with a field tag, which always contains either On or Off.

During deserialization this enum component would then get deserialized just like any other and used to reconstruct the enum:

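fn deserialize(row: &mut OffsetRow) -> Result<Self, sqlx::Error> {
    let user = <User as Deserializable<DB>>::deserialize(row)?;

    // Read the stored tag and map it back to the matching branch.
    let tag = row.try_get::<String>()?;
    match tag.as_str() {
        "On" => Ok(MyState::On(user)),
        "Off" => Ok(MyState::Off(user)),
        // A tag that matches no branch means the row cannot be decoded.
        other => Err(sqlx::Error::Decode(
            format!("unknown MyState tag: {other}").into(),
        )),
    }
}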

The downside of this approach is that it sort of breaks the Component/Archetype model. Because information about the enum is stored in a table of its own, the enum is effectively a component, but this is not clear from the definition, and you will have to register it with Backend::register just like any other component.

If you have enums which consist of a mix of plain non-component Rust types like i64 and component/archetype types as above, it would be great to have that "Just Work":

#[derive(Archetype)]
enum MyState {
  On {
    user: User,
    counter: i64
  },
  Off {
    user: User,
    counter: i64
  },
}

Since we're already storing the tag, there's no reason we can't store the counter value alongside it.
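A rough sketch of what that could look like, in plain Rust without erm or sqlx. `MyStateRow`, `split` and `reconstruct` are hypothetical names for illustration only: the row stands in for the enum's own component table, holding the tag and the plain `counter` field, while `User` is stored separately as its own component.

```rust
// Hypothetical stand-ins; none of these definitions come from erm itself.
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
enum MyState {
    On { user: User, counter: i64 },
    Off { user: User, counter: i64 },
}

/// One row of the enum's own (assumed) component table:
/// the tag plus every plain, non-component field.
#[derive(Debug, Clone, PartialEq)]
struct MyStateRow {
    tag: String,
    counter: i64,
}

/// Splits a value into the tag row and the separately stored component.
fn split(state: MyState) -> (MyStateRow, User) {
    match state {
        MyState::On { user, counter } => (MyStateRow { tag: "On".to_string(), counter }, user),
        MyState::Off { user, counter } => (MyStateRow { tag: "Off".to_string(), counter }, user),
    }
}

/// Rebuilds the enum from the tag row and the loaded component.
fn reconstruct(row: MyStateRow, user: User) -> Result<MyState, String> {
    match row.tag.as_str() {
        "On" => Ok(MyState::On { user, counter: row.counter }),
        "Off" => Ok(MyState::Off { user, counter: row.counter }),
        other => Err(format!("unknown MyState tag: {other}")),
    }
}
```

This only works as-is when every branch has the same plain fields. Branches with differing plain fields would need nullable columns or some other layout, which this sketch does not attempt.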

MathiasPius commented 5 days ago

Hit a snag while implementing the Tagged solution: The derive macro has no way of knowing whether a field is a regular SQL-serializable type, or a Component in its own right.

In regular components, this is not an issue since components cannot themselves contain components.

One solution to this would be to use a marker like #[external] on fields which are themselves components or archetypes:

#[derive(Archetype)]
enum MyState {
  On {
    #[external]
    user: User,
    counter: i64
  },
  Off {
    #[external]
    user: User,
    counter: i64
  },
}

Another solution would be to split enum implementations into Component and Archetype versions, so that Component versions can only contain non-component fields while Archetype versions can only contain component fields.

MathiasPius commented 5 days ago

An enum which can only contain fields that are components in their own right is identical to a Marker trait:

(MyStateOn, User) and (MyStateOff, User) are together equivalent to (MyState, User), where MyState is one of Off or On.
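The equivalence can be sketched in plain Rust as a lossless conversion in both directions. Everything here is hypothetical illustration rather than erm API: `MarkerForm` just models "the entity matches exactly one of the two marker archetypes".

```rust
// Hypothetical stand-ins; none of these definitions come from erm itself.
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
}

// Zero-sized marker components, one per branch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MyStateOn;
#[derive(Debug, Clone, Copy, PartialEq)]
struct MyStateOff;

// A single field-less enum component carrying the same information.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MyState {
    On,
    Off,
}

/// The marker representation: an entity matches exactly one of these shapes.
#[derive(Debug, Clone, PartialEq)]
enum MarkerForm {
    On((MyStateOn, User)),
    Off((MyStateOff, User)),
}

/// Marker archetype -> (enum component, User).
fn to_tagged(form: MarkerForm) -> (MyState, User) {
    match form {
        MarkerForm::On((_, user)) => (MyState::On, user),
        MarkerForm::Off((_, user)) => (MyState::Off, user),
    }
}

/// (enum component, User) -> marker archetype.
fn to_markers((state, user): (MyState, User)) -> MarkerForm {
    match state {
        MyState::On => MarkerForm::On((MyStateOn, user)),
        MyState::Off => MarkerForm::Off((MyStateOff, user)),
    }
}
```

Since each direction undoes the other, neither representation stores more information than the other.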


Enums which only contain non-Component fields map easily onto one component type per branch.

MathiasPius commented 1 day ago

I'm leaning towards having enums contain only Components/Archetypes, disallowing basic types. That way an enum can be used to differentiate between entity "types", which is an immediately useful feature. The other variations may come later.


This also means we need to do away with the tag, otherwise we risk running into a situation where the enum's tag indicates one branch, but the contained archetypes have been removed, leaving the database/enum in an inconsistent state.
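As a sketch of why dropping the tag avoids that inconsistency: if the branch is derived purely from which archetypes are present, there is no separate value that can go stale. The types below (`Player`, `Monster`, `Entity`, `StoredEntity`) are hypothetical and only model the idea in plain Rust. Trying branches in declaration order is an assumption of this sketch, not a decision made in this issue.

```rust
// Hypothetical stand-ins; none of these definitions come from erm itself.
#[derive(Debug, Clone, PartialEq)]
struct Player {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Monster {
    hit_points: i64,
}

#[derive(Debug, PartialEq)]
enum Entity {
    Player(Player),
    Monster(Monster),
}

/// What the database holds for one entity: each archetype is either
/// present or absent, and there is no stored tag.
#[derive(Debug, Default)]
struct StoredEntity {
    player: Option<Player>,
    monster: Option<Monster>,
}

/// Derives the branch from the archetypes that are actually present,
/// trying branches in declaration order (an assumption of this sketch).
fn load(stored: &StoredEntity) -> Option<Entity> {
    if let Some(player) = &stored.player {
        return Some(Entity::Player(player.clone()));
    }
    stored.monster.clone().map(Entity::Monster)
}
```

Removing the `Player` data simply makes `load` stop returning `Entity::Player`. With a stored tag, the same removal would leave a row claiming a branch whose contents no longer exist.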