umccr / orcabus

 🐋 UMCCR Pipeline & Workflow Orchestration
3 stars 0 forks source link

filemanager: transition migrations from sqlx to sea-orm #378

Open mmalenic opened 4 days ago

mmalenic commented 4 days ago

After #357 is merged, the filemanager query logic could be transitioned away from using .sql files and sqlx for migrations, and instead using sea-orm for migrations. Note, sqlx and sea-orm can coexist together for the time being because sea-orm uses sqlx.

mmalenic commented 3 days ago

After looking into this, I don't know if using sea-orm for migrations is the best. It seems a bit verbose, and for table definitions, raw SQL could be enough. For example, the object table definition would be something like:

use crate::migrations::m0001_create_table::ObjectGroup::{Attributes, ObjectId};
use sea_orm::DeriveIden;
use sea_orm_migration::prelude::{DeriveMigrationName, Table};
use sea_orm_migration::schema::{json_binary, uuid};
use sea_orm_migration::{async_trait, DbErr, MigrationTrait, SchemaManager};

/// The object_group table.
#[derive(DeriveIden)]
enum ObjectGroup {
    Table,
    ObjectId,
    Attributes,
}

/// The migration struct used to implement `MigrationTrait`.
#[derive(DeriveMigrationName)]
pub struct Migration;

#[async_trait::async_trait]
impl MigrationTrait for Migration {
    async fn up(&self, manager: &SchemaManager) -> Result<(), DbErr> {
        manager
            .create_table(
                Table::create()
                    .table(ObjectGroup::Table)
                    .col(uuid(ObjectId).primary_key())
                    .col(json_binary(Attributes))
                    .to_owned(),
            )
            .await
    }

    async fn down(&self, manager: &SchemaManager) -> Result<(), DbErr> {
        manager
            .drop_table(Table::drop().table(ObjectGroup::Table).to_owned())
            .await
    }
}

vs the following in plain SQL:

-- An general object table common across all storage types.
create table object (
    -- The unique id for this object group.
    object_id uuid not null primary key,
    -- Attributes for a group of objects.
    attributes jsonb default null
);

The sea-orm example does have a down script, which could be useful - does that outweigh the extra code/verbosity? I'm open to either approach. We could also keep this issue paused for a bit because we can always transition later if we want.

Note, I think using sea-orm for queries is definitely better than raw SQL, because it's much more concise. This issue doesn't affect transitioning to using sea-orm in general.