cljoly / rusqlite_migration

↕️ Simple database schema migration library for rusqlite, written with performance in mind.
https://cj.rs/rusqlite_migration/
Apache License 2.0
71 stars 16 forks source link

Suggestion: make `fn Migrations::max_schema_version` public #170

Open fluffysquirrels opened 4 days ago

fluffysquirrels commented 4 days ago

fn Migrations::max_schema_version(&self) is a currently private method that returns the number of available migrations wrapped in SchemaVersion.

I would like access to this value in my own code. I'm using Migrations::from_directory() and include_dir! to construct my Migrations, so getting the count myself is non-trivial AFAICT.

My use case is to work out whether any migrations will be run by fn Migrations::to_latest() before running it, and if so to take a backup first.

I created a simple PR for this: #171.

cljoly commented 4 days ago

Thanks for opening this issue and this MR, I'll try to take a deeper look this weekend.

cljoly commented 2 days ago

I think that your use case makes sense and is within the scope of what this library should enable.

That’s said, I’m unsure exposing this method is the most ergonomic option for users. The user needs quite a good mental model of how the crate works. What if instead we exposed either:

  1. a needs_to_run boolean method on Migrations that tells the user whether any migration needs to run? It would be simpler in that users wouldn’t need to compare the version currently applied with the number of migrations, but it could lead to suboptimal uses of the API like this:
    if migrations.needs_to_run() {
     migrations.to_latest()
    }

    This is suboptimal because, as you know, to_latest already performs this check.

  2. a way to run a hook before executing any migration, just like we have a hook at the level of an individual migration. It could something look like this:

    let mut migrations = Migrations::new(vec![
     // ...
    ]);
    migrations.pre_hook(|tx|
     // Perform the backup, using tx as the connection to the database (maybe with the VACUUM INTO pragma).
    );
    
    fn main() {
     // open a connection
     migrations.to_latest(&mut conn)
    }

    I think the additional safety here is that it's much harder to forget to apply the backup hook in some code paths, if the migrations were applied in multiple code paths. And we could run the hook with the other tests we perform on migrations in Migrations.validate(), although that could be a bit of double edged sword.

As a side note, even if we follow one of the options above, we could still make Migrations::max_schema_version public, if there is a distinct and compelling use case for it.

What do you think @fluffysquirrels @czocher?

czocher commented 1 day ago

From what I understand, the main idea here is to be able to tell if there are any unapplied migrations i.e. the schema version is below the number of defined migrations.

In my opinion exposing max_schema_version is an okay idea for some use cases, but:

  1. The name is rather non-specific - doesn't immediately say what it does. Changing it would be necessary. Maybe something like target_schema_version?
  2. Exposing this is just a way to check for "unapplied" migrations with extra steps. Maybe having an additional method for that exact use-case would be better, i.e. are_there_any_unapplied_migrations() -> bool (just a placeholder name of course ;))

What you propose with migration hooks, seems like something general and possibly useful, but I'd separate that to a new task/functionality.

fluffysquirrels commented 17 hours ago

Hi both, thanks for taking a look!

Re naming:

Migrations already has to_latest, so perhaps max_schema_version could become latest_schema_version or latest_version? enum SchemaVersion includes 2 variants -- NoneSet and Outside(_) -- that don't make sense for latest_version, so perhaps it should just return a usize?

You both suggest a method that returns a bool to skip a step for user, which I think is a good plan.

My suggestions for a name:

@cljoly, yes, people may redundantly check is_latest, but the documentation can call out that this is not required.

Re migration hooks: because this particular use case is so simple to implement yourself (1 if statement likely run once per database loaded) and I think a closure is possibly more complicated for the user (due to possible extra lifetimes, see examples below), I don't think it's justified to add hooks to do this.

let mut conn = _;
let mut backup_state = _;
let migrations = _;

// With bool fn, with or without state
if migrations.is_latest(&mut conn)? {
    backup(&mut conn, &mut backup_state)?;
    migrations.to_latest(&mut conn)?;
}

// With closure hook, no state
migrations.add_pre_hook(|conn: &mut rusqlite::Connection|
                        backup(conn));
migrations.to_latest(&mut conn)?;

// With closure hook, callback needs extra shared state,
// `Migrations` is stored and re-used
// (including if `AsyncMigrations` is used!).
let backup_state = Arc::new(Mutex::new(backup_state));
migrations.add_pre_hook({
    let backup_state = backup_state.clone();
    move |conn: &mut Connection| {
        backup(conn, &mut backup_state.lock())
    }
});
migrations.to_latest(&mut conn)?;
fluffysquirrels commented 16 hours ago

I just pushed a few more commits to #171 with these ideas.

Let me know what you think!