tiesselune / reindeer-rs

A thin entity layer around Sled using Bincode for serialization
MIT License

Derive and Migrations for Reindeer #10

Open tiesselune opened 1 year ago

tiesselune commented 1 year ago

Problem

Right now, changing an entity's data structure requires a manual migration. Because Reindeer serializes with bincode, the stored data carries no schema, and modifying the fields of an Entity in the codebase results in an error when de-serializing existing data from the database.

For instance, let's say we define a User entity:

#[derive(Serialize,Deserialize)]
pub struct User {
    username : String,
    password_hash : String,
    last_login_timestamp : i64,
}

If we create a few Users and save them to the database, but later need to add a role field by editing our struct this way:

#[derive(Serialize,Deserialize)]
pub struct User {
    username : String,
    password_hash : String,
    last_login_timestamp : i64,
    role : u32, // <---- offending field
}

Then bincode will fail at de-serializing previously-saved users, because the stored bytes were written without the role field and bincode has no way to know that the field is simply missing.
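To make the failure concrete, here is a minimal std-only sketch of a schema-less positional encoding. It is only loosely modeled on the idea behind bincode and is not its actual wire format or API; `UserV2`, `encode_v1` and `decode_v2` are hypothetical names used for illustration.

```rust
// Toy positional encoding (an illustration, NOT bincode's real format):
// fields are written back to back, with no field names and no version tag.

#[derive(Debug, PartialEq)]
pub struct UserV2 {
    pub username: String,
    pub password_hash: String,
    pub last_login_timestamp: i64,
    pub role: u32,
}

// Writes a string as a little-endian u64 length followed by its bytes.
fn put_str(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

/// Encodes a record the way the old three-field struct would have been written.
pub fn encode_v1(username: &str, password_hash: &str, last_login: i64) -> Vec<u8> {
    let mut out = Vec::new();
    put_str(&mut out, username);
    put_str(&mut out, password_hash);
    out.extend_from_slice(&last_login.to_le_bytes());
    out
}

// Consumes `n` bytes from the front of `input`, or fails if too few remain.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Result<&'a [u8], String> {
    if input.len() < n {
        return Err(format!("unexpected end of input: need {} bytes, have {}", n, input.len()));
    }
    let (head, tail) = input.split_at(n);
    *input = tail;
    Ok(head)
}

fn take_str(input: &mut &[u8]) -> Result<String, String> {
    let mut len = [0u8; 8];
    len.copy_from_slice(take(input, 8)?);
    let bytes = take(input, u64::from_le_bytes(len) as usize)?;
    String::from_utf8(bytes.to_vec()).map_err(|e| e.to_string())
}

/// Decodes using the NEW four-field layout.
pub fn decode_v2(mut input: &[u8]) -> Result<UserV2, String> {
    let username = take_str(&mut input)?;
    let password_hash = take_str(&mut input)?;
    let mut ts = [0u8; 8];
    ts.copy_from_slice(take(&mut input, 8)?);
    let mut role = [0u8; 4];
    // Records written with the old layout end before this point, so this read fails.
    role.copy_from_slice(take(&mut input, 4)?);
    Ok(UserV2 {
        username,
        password_hash,
        last_login_timestamp: i64::from_le_bytes(ts),
        role: u32::from_le_bytes(role),
    })
}
```

Decoding the output of `encode_v1` with `decode_v2` returns an error at the role field, because nothing in the bytes says which layout produced them.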

Right now, we can avoid the problem by creating a sibling Entity that holds the additional data (adding a new sibling does not change the data structure of the entity). But this still requires a manual migration to add this data to all existing Users, and it splits the data between two sibling entities, sacrificing some performance.

The other solution is to create a migration script defining two structs, UserNew and UserOld, and manually deserialize every existing entry, transform it to the new format, then re-save it in place. But this is error-prone, tedious and an all-around bad developer experience.
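For reference, the conversion step of such a manual script might look like the sketch below. It is std-only: a `BTreeMap` keyed by username stands in for the database tree, and the actual reading and re-saving through Reindeer is left out. `DEFAULT_ROLE` and `migrate_users` are hypothetical, and the default role of 0 is an assumption.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub struct UserOld {
    pub username: String,
    pub password_hash: String,
    pub last_login_timestamp: i64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct UserNew {
    pub username: String,
    pub password_hash: String,
    pub last_login_timestamp: i64,
    pub role: u32,
}

// Assumption: existing users get role 0 when migrated.
pub const DEFAULT_ROLE: u32 = 0;

impl From<UserOld> for UserNew {
    fn from(old: UserOld) -> Self {
        UserNew {
            username: old.username,
            password_hash: old.password_hash,
            last_login_timestamp: old.last_login_timestamp,
            role: DEFAULT_ROLE,
        }
    }
}

/// Converts every entry while keeping its key, which is what
/// "re-save it in place" amounts to for a key-value store.
pub fn migrate_users(old: BTreeMap<String, UserOld>) -> BTreeMap<String, UserNew> {
    old.into_iter()
        .map(|(key, user)| (key, UserNew::from(user)))
        .collect()
}
```

Every entity change needs another pair of structs and another hand-written loop like this one, which is where the tedium comes from.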

Plus, implementing Entity is boilerplate-y, and could be simplified a lot with a derive macro.

Proposition

Create a semi-automatic migration process, so that developers can define a migration function to run on every saved version of an entity, and have it be run at app startup during Reindeer's initialization.

Solution draft

  1. Split the Reindeer crate into reindeer and reindeer-macros. Rust requires proc-macros to live in their own crate, and this split keeps the dependency graph clean.
  2. Implement a derive macro #[derive(Entity)] with a helper attribute entity that will
    1. Implement the Entity trait automatically given an optional store name, a version number, and the id structure for the tree
    2. Create a schema folder with the current struct version, containing a JSON or TOML representation of the struct.
    3. Create structs (User_v1, User_v2 ...) for each of the versioned files already present in the schema folder
    4. Implement Entity on them as well
  3. Create sibling and children helper attributes to link to siblings and children automatically.
  4. Save version and Entity Schema in the database to detect a version incompatibility at registration time, prompting for a migration and providing the two conflicting schemas to generate the migration
  5. Provide a way to migrate from one version to another, given the automatically generated structs
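The version check from step 4 could be sketched roughly as follows. This is std-only and hypothetical: a `HashMap` from store name to version stands in for whatever metadata Reindeer would persist, and `VersionCheck` and `check_version` are illustrative names, not existing API.

```rust
use std::collections::HashMap;

/// Possible outcomes when an entity is registered at startup.
#[derive(Debug, PartialEq)]
pub enum VersionCheck {
    /// No version recorded yet: nothing to migrate.
    FirstRegistration,
    /// Stored data matches the struct in the codebase.
    UpToDate,
    /// Stored data is older: a migration has to run first.
    NeedsMigration { from: u32, to: u32 },
    /// Stored data was written by a newer version of the code.
    StoredIsNewer { stored: u32, current: u32 },
}

/// Compares the version saved for `store` with the version declared in code.
pub fn check_version(meta: &HashMap<String, u32>, store: &str, current: u32) -> VersionCheck {
    match meta.get(store).copied() {
        None => VersionCheck::FirstRegistration,
        Some(stored) if stored == current => VersionCheck::UpToDate,
        Some(stored) if stored < current => VersionCheck::NeedsMigration {
            from: stored,
            to: current,
        },
        Some(stored) => VersionCheck::StoredIsNewer { stored, current },
    }
}
```

With version 1 recorded for "user" and version 2 declared in code, `check_version` reports `NeedsMigration { from: 1, to: 2 }`, which is the point where the two schemas would be shown to the developer.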

This way, creating an entity would look like this:

#[derive(Serialize,Deserialize,Entity)]
#[entity(name = "user", version = 2, id = "username")]
#[siblings(("user_data",Cascade))]
pub struct User {
    username : String,
    password_hash : String,
    last_login_timestamp : i64,
    role : u32
}

and migration setup could look like this:

impl From<User_v1> for User { /* ... */ }
User::set_migration_v1_to_v2(|existing : User_v1| { existing.into() });
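As a rough idea of what could sit behind such a call, here is a std-only sketch in which a migration is a stored closure applied to every old entry. None of this exists in Reindeer today: `MigrationStep`, `apply_all` and `user_migration_v1_to_v2` are hypothetical, the structs are written by hand rather than generated, and role 0 is an assumed default.

```rust
// Hand-written stand-in for the struct the derive macro would generate.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, PartialEq)]
pub struct User_v1 {
    pub username: String,
    pub password_hash: String,
    pub last_login_timestamp: i64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub username: String,
    pub password_hash: String,
    pub last_login_timestamp: i64,
    pub role: u32,
}

impl From<User_v1> for User {
    fn from(old: User_v1) -> Self {
        User {
            username: old.username,
            password_hash: old.password_hash,
            last_login_timestamp: old.last_login_timestamp,
            role: 0, // assumption: default role for migrated users
        }
    }
}

/// Hypothetical holder for one migration between two versions.
pub struct MigrationStep<Old, New> {
    pub from_version: u32,
    pub to_version: u32,
    run: Box<dyn Fn(Old) -> New>,
}

impl<Old, New> MigrationStep<Old, New> {
    pub fn new(from_version: u32, to_version: u32, run: impl Fn(Old) -> New + 'static) -> Self {
        MigrationStep {
            from_version,
            to_version,
            run: Box::new(run),
        }
    }

    /// Applies the migration closure to every stored entry.
    pub fn apply_all(&self, stored: Vec<Old>) -> Vec<New> {
        stored.into_iter().map(|entry| (self.run)(entry)).collect()
    }
}

/// Roughly what registering the v1 to v2 migration would record.
pub fn user_migration_v1_to_v2() -> MigrationStep<User_v1, User> {
    MigrationStep::new(1, 2, |existing: User_v1| User::from(existing))
}
```

At initialization, the library would run `apply_all` on the entries read with the old struct and save the results under the same keys.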