canonical / dqlite

Embeddable, replicated and fault-tolerant SQL engine.
https://dqlite.io
Other
3.83k stars 216 forks source link

Recommended way to perform schema migrations #500

Closed ktsakalozos closed 6 months ago

ktsakalozos commented 1 year ago

Hi dqlite team. Given the distributed nature of dqlite where we may not have full control over where the leader is, are there any recommendations on how we should perform data schema migrations?

stgraber commented 1 year ago

The way we do it in LXD is that all servers check a table that will never have its schema changed. That table includes the current schema version for each server in the cluster.

On startup servers update their schema version number in that table and then will wait until all servers are reporting the same schema version.

The leader then has a slight variation to that which is that it will wait for all servers except itself to report the new version, then will apply the schema update and update its own record.

This in turn causes all the other servers to notice that all servers are ready and have them complete startup.

With that, you end up with a database that keeps functioning on the older daemons until they're all restarted.

ktsakalozos commented 1 year ago

Thank you for the reply @stgraber. Do you use snap epochs (or any other mechanism) so as to ensure users do not jump (upgrade or downgrade) to a snap track/revision that does not know how to migrate the current schema to the one it knows of?

stgraber commented 1 year ago

No, LXD generally doesn't remove support for older releases.

We did do one break with 5.0 but didn't end up using epochs as we had some issues with them at the time. 4.0 is now the oldest that can upgrade to 5.0, but you can otherwise go from 0.1 straight to 4.0 and then to 5.13 and LXD will handle it just fine.

freeekanayaka commented 1 year ago

The way we do it in LXD is that all servers check a table that will never have its schema changed. That table includes the current schema version for each server in the cluster.

On startup servers update their schema version number in that table and then will wait until all servers are reporting the same schema version.

The leader then has a slight variation to that which is that it will wait for all servers except itself to report the new version, then will apply the schema update and update its own record.

I don't really remember the details, but if we're speaking strictly about the database I think it doesn't need to be the leader that runs the schema upgrade, it can be whoever the first node is that notices that all other nodes have upgraded their version (i.e. the whole cluster is now running the same version). Since a schema upgrade is a SQL transaction, if 2 nodes race because they notice that the cluster is upgraded and they both try to upgrade the schema at the same time, then only one of them will succeed and the other will fail gracefully.