maxpert / marmot

A distributed SQLite replicator built on top of NATS
https://maxpert.github.io/marmot/
MIT License

What is the best way to handle schema migrations? #58

Closed · mindreframer closed this 1 year ago

mindreframer commented 1 year ago

Currently, the triggers are only attached during Marmot's start phase. Individual schema changes (new tables / columns) are not reflected at runtime; applying them requires manually running the changes on all DBs in the cluster and then restarting each Marmot node so that the triggers and changelog tables are updated.
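For illustration, here is a minimal sketch of the startup-time trigger generation being described, with hypothetical names (these are not Marmot's actual identifiers): because the trigger body enumerates the columns that exist when it is created, a column added later is invisible to the changelog until the trigger is regenerated.

```go
// Hypothetical sketch of startup-time change capture, not Marmot's actual
// internals: the trigger is generated from the columns known at startup,
// so a later ALTER TABLE ... ADD COLUMN is not captured until the trigger
// is dropped and regenerated.
package main

import (
	"database/sql"
	"fmt"
	"strings"

	_ "github.com/mattn/go-sqlite3"
)

// installChangeCapture creates a changelog table and an AFTER INSERT
// trigger for one table, snapshotting the column list passed in.
func installChangeCapture(db *sql.DB, table string, cols []string) error {
	pairs := make([]string, len(cols))
	for i, c := range cols {
		pairs[i] = fmt.Sprintf("'%s', NEW.%s", c, c)
	}
	ddl := fmt.Sprintf(`
CREATE TABLE IF NOT EXISTS changelog_%[1]s (
  id      INTEGER PRIMARY KEY AUTOINCREMENT,
  op      TEXT NOT NULL,
  payload TEXT NOT NULL
);
CREATE TRIGGER IF NOT EXISTS capture_insert_%[1]s
AFTER INSERT ON %[1]s
BEGIN
  INSERT INTO changelog_%[1]s (op, payload)
  VALUES ('insert', json_object(%[2]s));
END;`, table, strings.Join(pairs, ", "))
	_, err := db.Exec(ddl)
	return err
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)`); err != nil {
		panic(err)
	}
	// Columns are enumerated once, at startup; an "email" column added
	// later would never show up in the changelog payload.
	if err := installChangeCapture(db, "users", []string{"id", "name"}); err != nil {
		panic(err)
	}
}
```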

Are there any recommendations on how to approach this in a straightforward and maintainable way?

Best, Roman

maxpert commented 1 year ago

Yes, right now what you have described is the case, and the only way to handle schema changes is to stop Marmot, drop a snapshot on all nodes, and restart the Marmot process. The major reasons not to implement schema-change propagation are:

Having said that, here are my recommendations:

I've been brainstorming about this topic, and I have questions that I believe the community can answer. It will really help me develop a solution that works well for the community:

Again, I am open to suggestions and contributions on this topic.

mindreframer commented 1 year ago

@maxpert thanks for the elaborate response!

The algorithm that you describe involves downtime. Also, if I understand correctly, the migration runs on a single Marmot node, and the distribution to other nodes happens via snapshots, so other nodes would be required to download the snapshot to be up to date? If that is the case, it would be problematic with bigger databases (100s of GB), and the migration would take a long time.

How it could work:

Maybe it would be possible to have a special table that Marmot would use to signal the state of the migration. A rolling update is good, and having some way to see the progress of the roll-out through the cluster would be great. Topic in NATS for migration:

How migrations are executed in general is a very broad question, since there are so many tools to manage them: some framework-specific (Rails / Phoenix / Django), some standalone (https://atlasgo.io/, https://github.com/amacneil/dbmate, https://flywaydb.org/). I guess relying on those tools would make support very hard or impossible, so the best approach would be a simple internal mechanism to apply a SQL script on all DBs in the cluster in a controlled fashion.

I guess the simplest approach would be to post a SQL file with two attributes, name + timestamp, as the identifier for this particular migration. We could start with only forward migrations, because supporting reversible migrations does not really work in production. We would also need to store applied migrations in a table, maybe "__marmot_migrations"?
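A minimal sketch of that bookkeeping, assuming the hypothetical `__marmot_migrations` table and the name + timestamp identifier proposed here (none of this exists in Marmot yet):

```go
// Hypothetical forward-only migration bookkeeping, using the
// "__marmot_migrations" table and name + timestamp identifier proposed
// above. Nothing here is an existing Marmot feature.
package main

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3"
)

const trackingDDL = `
CREATE TABLE IF NOT EXISTS __marmot_migrations (
  name       TEXT    NOT NULL,
  ts         INTEGER NOT NULL, -- timestamp half of the identifier
  applied_at INTEGER NOT NULL, -- unix time this node ran the script
  PRIMARY KEY (name, ts)
);`

// applyOnce runs a forward migration at most once per database: the
// INSERT OR IGNORE doubles as the idempotency guard, so replaying the
// same script (e.g. on redelivery) is a no-op.
func applyOnce(db *sql.DB, name string, ts int64, script string) error {
	if _, err := db.Exec(trackingDDL); err != nil {
		return err
	}
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()

	res, err := tx.Exec(`INSERT OR IGNORE INTO __marmot_migrations
		(name, ts, applied_at) VALUES (?, ?, strftime('%s','now'))`, name, ts)
	if err != nil {
		return err
	}
	if n, _ := res.RowsAffected(); n == 0 {
		return nil // already applied on this node
	}
	if _, err := tx.Exec(script); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	if err := applyOnce(db, "add_email", 20230115,
		`CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY);
		 ALTER TABLE users ADD COLUMN email TEXT;`); err != nil {
		panic(err)
	}
}
```

Because the guard row and the script run in one transaction (SQLite supports transactional DDL), a node that crashes mid-migration retries cleanly, and a node that already applied the script skips it.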

Maybe we should collect those ideas in a document (it could be a markdown file in the repo), with arguments for why those design decisions were preferred. From that document we could distill the minimal feature set that would solve the issue at hand.

maxpert commented 1 year ago

> @maxpert thanks for the elaborate response!
>
> The algorithm that you describe involves downtime. Also, if I understand correctly, the migration runs on a single Marmot node, and the distribution to other nodes happens via snapshots, so other nodes would be required to download the snapshot to be up to date? If that is the case, it would be problematic with bigger databases (100s of GB), and the migration would take a long time.

No, we are not restoring snapshots; I am using snapshots as a safety measure. You actually run the migration scripts everywhere.

> How it could work:
>
> Maybe it would be possible to have a special table that Marmot would use to signal the state of the migration. A rolling update is good, and having some way to see the progress of the roll-out through the cluster would be great. Topic in NATS for migration:

This would be great! We could mark each database as being in-progress for migration (possibly locking it for modification), stop accepting writes, drain the pending changes from the changelog tables, and then run the migration script in parallel on all nodes. Once the whole cluster has finished, we mark the tables as ready for writes and release the lock. The application could check the status and display a friendly maintenance page during the migration (a sketch of this signaling follows below).
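A rough sketch of what that signaling could look like over plain NATS pub/sub; the subject name, state names, and payload shape are all invented for illustration (Marmot has no such protocol today):

```go
// Hypothetical migration roll-out signaling over NATS. The subject,
// states, and payload are invented for illustration only.
package main

import (
	"encoding/json"
	"log"

	"github.com/nats-io/nats.go"
)

type MigrationStatus struct {
	Node      string `json:"node"`
	Migration string `json:"migration"` // the name + timestamp identifier
	State     string `json:"state"`     // "locked" | "drained" | "applied" | "ready"
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	// The application (or an operator tool) watches the same subject to
	// follow progress across the cluster, e.g. to show a maintenance page.
	if _, err := nc.Subscribe("marmot.migration.status", func(m *nats.Msg) {
		log.Printf("progress: %s", string(m.Data))
	}); err != nil {
		log.Fatal(err)
	}

	// Each node walks the phases described above: lock the DB for writes,
	// drain pending changelog entries, apply the script, then report ready.
	for _, state := range []string{"locked", "drained", "applied", "ready"} {
		b, _ := json.Marshal(MigrationStatus{
			Node: "node-1", Migration: "20230115_add_email", State: state,
		})
		if err := nc.Publish("marmot.migration.status", b); err != nil {
			log.Fatal(err)
		}
	}
	nc.Flush() // make sure the status updates are on the wire before exit
}
```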

> How migrations are executed in general is a very broad question, since there are so many tools to manage them: some framework-specific (Rails / Phoenix / Django), some standalone (https://atlasgo.io/, https://github.com/amacneil/dbmate, https://flywaydb.org/). I guess relying on those tools would make support very hard or impossible, so the best approach would be a simple internal mechanism to apply a SQL script on all DBs in the cluster in a controlled fashion.
>
> I guess the simplest approach would be to post a SQL file with two attributes, name + timestamp, as the identifier for this particular migration. We could start with only forward migrations, because supporting reversible migrations does not really work in production. We would also need to store applied migrations in a table, maybe "__marmot_migrations"?

Excellent, you are essentially converging on the same idea: the migration script is published on NATS and then applied everywhere.
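Concretely, that could look like the sketch below: the script is published once to a durable JetStream stream, every node runs a durable consumer and applies the script locally, and an idempotency guard like the `applyOnce` sketch above makes redelivery harmless. The stream, subject, and payload names are assumptions, not anything Marmot ships:

```go
// Hypothetical publish/apply flow for migration scripts over NATS
// JetStream. Stream, subject, and payload layout are assumptions.
package main

import (
	"encoding/json"
	"log"

	"github.com/nats-io/nats.go"
)

type MigrationScript struct {
	Name   string `json:"name"`
	Ts     int64  `json:"ts"`
	Script string `json:"script"` // the forward-only SQL to run
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// A durable stream, so nodes that were offline when the script was
	// posted still receive it once they come back.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:     "MARMOT_MIGRATIONS",
		Subjects: []string{"marmot.migrations"},
	}); err != nil {
		log.Fatal(err)
	}

	// Publish side: an operator (or a CLI) posts the script exactly once.
	payload, _ := json.Marshal(MigrationScript{
		Name:   "add_email",
		Ts:     20230115,
		Script: "ALTER TABLE users ADD COLUMN email TEXT;",
	})
	if _, err := js.Publish("marmot.migrations", payload); err != nil {
		log.Fatal(err)
	}

	// Apply side: every node runs a durable consumer, executes the script
	// locally (guarded by something like applyOnce above), then acks.
	if _, err := js.Subscribe("marmot.migrations", func(m *nats.Msg) {
		var ms MigrationScript
		if err := json.Unmarshal(m.Data, &ms); err != nil {
			return // malformed message; leave unacked for inspection
		}
		log.Printf("applying %s/%d locally, then acking", ms.Name, ms.Ts)
		m.Ack()
	}, nats.Durable("node-1"), nats.ManualAck()); err != nil {
		log.Fatal(err)
	}

	select {} // a real node keeps serving; block here for the sketch
}
```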

> Maybe we should collect those ideas in a document (it could be a markdown file in the repo), with arguments for why those design decisions were preferred. From that document we could distill the minimal feature set that would solve the issue at hand.

I can introduce an RFC label under issues, but in the long run we might have to move to a different repo, like https://github.com/reactjs/rfcs.