A manager object is needed to resolve the following difficulties:
models need to have their indices synced with the backend. We want this done as close to startup time as possible, and only once per lifetime of the program instance. If the backend is unreachable, we want the models to be synced at some future point without crashing the service. This is simple enough. Will have it done in no time.
models are going to accrue some number of logical schema migrations over time. We need a straightforward way to automate the mutation/evolution of data which already exists in a live database. This last point merits greater discussion, as there are many different approaches to data schema migrations.
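The sync-without-crashing requirement above can be sketched as a retry loop with exponential backoff. This is a minimal illustration, not the actual API: `SyncOutcome`, `sync_with_backoff`, and the injected `sleep` are all hypothetical names, and the starting delay is an arbitrary choice.

```rust
use std::time::Duration;

/// Outcome of one attempted index sync against the backend.
/// (Hypothetical type; the document does not prescribe an API.)
pub enum SyncOutcome {
    Synced,
    BackendUnreachable,
}

/// Try to sync a model's indices, retrying with exponential backoff
/// instead of crashing the service when the backend is unreachable.
/// Returns the attempt count on success, or None if we gave up.
/// `sleep` is injected so tests can avoid real waiting.
pub fn sync_with_backoff(
    mut try_sync: impl FnMut() -> SyncOutcome,
    mut sleep: impl FnMut(Duration),
    max_attempts: u32,
) -> Option<u32> {
    let mut delay = Duration::from_millis(100);
    for attempt in 1..=max_attempts {
        match try_sync() {
            SyncOutcome::Synced => return Some(attempt),
            SyncOutcome::BackendUnreachable => {
                // Back off and try again later; the sync happens at some
                // future point once the backend becomes reachable.
                sleep(delay);
                delay = delay * 2;
            }
        }
    }
    None
}
```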
proposed schema migrations pattern
run from within the service, where the models are defined, similar to how `Model::sync` works for indices and such. Not from a separate CLI system.
migration manager will ensure the migrations for a specific model are run only once per service lifecycle (when the container or process is first booted).
should not panic.
will still be executed if the backend is down (it shouldn't be) when the service first comes up, probably via an exponential backoff algorithm.
for cases where `#[serde(default)]` | `#[serde(default = "path")]` could not be used.
two-phase schema migration pattern
phase one
update the model, perhaps using `Option<T>` for the field type. Serde will deserialize the record from the database as `None`. Then deal with that condition in your model code if you don't want it to be `None`.
the manager will execute `Model::migrate`, which receives its execution orders from `Model::migrations` and runs whatever mutations against the database are needed according to the migration specs (these should always be coded to be idempotent).
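A rough sketch of what the manager side could look like, using an in-memory map as a stand-in for the real database. `Migration`, `MigrationManager`, and the backfill function are hypothetical illustrations of the run-once-per-lifecycle and idempotency points above, not the actual API:

```rust
use std::collections::{HashMap, HashSet};

// Stand-in for a database document: a field -> value map.
pub type Doc = HashMap<String, String>;

/// One migration spec: a name plus an idempotent mutation over a document.
pub struct Migration {
    pub name: &'static str,
    pub apply: fn(&mut Doc),
}

/// Ensures each named migration runs only once per service lifecycle.
pub struct MigrationManager {
    applied: HashSet<&'static str>,
}

impl MigrationManager {
    pub fn new() -> Self {
        Self { applied: HashSet::new() }
    }

    /// Execute any not-yet-applied migrations against the collection.
    /// Returns the names of the migrations that actually ran.
    pub fn migrate(
        &mut self,
        migrations: &[Migration],
        collection: &mut [Doc],
    ) -> Vec<&'static str> {
        let mut ran = Vec::new();
        for m in migrations {
            // `insert` returns false if this migration already ran.
            if self.applied.insert(m.name) {
                for doc in collection.iter_mut() {
                    (m.apply)(doc);
                }
                ran.push(m.name);
            }
        }
        ran
    }
}

/// Example spec: backfill a missing field. Idempotent by construction:
/// re-running it changes nothing on documents that already have the field.
pub fn backfill_email_verified(doc: &mut Doc) {
    doc.entry("email_verified".to_string())
        .or_insert_with(|| "false".to_string());
}
```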
phase two
code deployments for the updated service, which include the execution of the migrations, have now finished. All of the service's replicas are updated and are using/expecting the updated schema.
now the `Option<T>` fields may be coded as simple `T` (not wrapped in an `Option`) as the old data has been fully updated from the migrations.
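The two phases can be illustrated without Serde by reading fields from a plain map; with Serde the missing-field-to-`None` step happens automatically, here it is done by hand. The `UserV1`/`UserV2` types and their fields are made up for illustration:

```rust
use std::collections::HashMap;

pub type Doc = HashMap<String, String>;

// Phase one: the new field is Option<T>, because old documents in the live
// database do not have it yet; a missing field comes through as None.
pub struct UserV1 {
    pub name: String,
    pub nickname: Option<String>,
}

impl UserV1 {
    pub fn from_doc(doc: &Doc) -> Self {
        Self {
            name: doc.get("name").cloned().unwrap_or_default(),
            nickname: doc.get("nickname").cloned(), // None if absent
        }
    }

    /// Deal with the None condition in model code if you don't want it.
    pub fn nickname_or_name(&self) -> &str {
        self.nickname.as_deref().unwrap_or(&self.name)
    }
}

// Phase two: after the migrations have backfilled every old document,
// the field can be coded as a simple T.
pub struct UserV2 {
    pub name: String,
    pub nickname: String,
}

impl UserV2 {
    pub fn from_doc(doc: &Doc) -> Self {
        Self {
            name: doc.get("name").cloned().unwrap_or_default(),
            // Safe only once the migration has run over all old data.
            nickname: doc.get("nickname").cloned().expect("backfilled by migration"),
        }
    }
}
```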
pros/cons
pro: faster schema convergence compared to document versioning.
pro: purely based on code in the service, works directly with your current service deployment patterns.
pro: undoing a schema migration is much safer, as it is just another service deployment applying the inverse of the migration being undone.
pro: no need for a directory of loosely managed JS files in your code base.
schema migration pattern comparison
django style schema migrations (SQL)
pros/cons
pro: migration is recorded in a migrations table so that it can't be run again.
pro: can be undone.
con: must be run outside of the API, using a CLI.
con: managing config for the separate environments is overhead (think sequelizejs).
con: can be quite tricky in high-throughput environments, especially when doing zero-downtime deployments which include migrations. With blue-green style deployments, old versions of a service may still expect the old schema while the migration must be applied before the new code is deployed ... combine that with high throughput, and it's not fun.
diesel style schema migrations
pros/cons
pro: pretty awesome, and definitely a bit cutting edge for the SQL ecosystem.
con: wouldn't work well in the mongo ecosystem as schemas are not enforced on the DB layer and schema divergence may occur before all service instances are deployed.
document versioning
pros/cons
con: not even really a migration pattern ... but you can at least see the document's version and take imperative action on the document to update it as needed.
con: extremely slow schema convergence. Data is left very messy. Difficult to reason about the state of the database at any point in time. Schema may never converge.
far out option
Leverage Rust nightly plugins/custom attributes/&c to define a system which will use the document `__version` field to automatically handle document updates from version to version, removing fields, changing field types, adding new fields &c from lower versions up to the latest version of the `Model`.
This would be shooting for the stars ... and I don't currently have time to explore this. But it would be fucking awesome. And better than anything else out there in any other language, including interpreted languages (Python & Ruby have solid stories in this realm).
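One way such a system might bottom out at runtime, sketched by hand with an explicit chain of per-version upgrade functions (all names and fields hypothetical); the nightly attribute machinery would essentially be generating something like this from the model definition:

```rust
use std::collections::HashMap;

pub type Doc = HashMap<String, String>;

// Entry `n` upgrades a document from version n to version n + 1
// (adding fields, removing fields, changing types, &c).
pub struct VersionChain {
    pub upgrades: Vec<fn(&mut Doc)>,
}

impl VersionChain {
    /// The latest version is the number of upgrade steps.
    pub fn latest(&self) -> usize {
        self.upgrades.len()
    }

    /// Read __version (defaulting to 0), apply every upgrade from the
    /// document's version up to the latest, then stamp the new version.
    pub fn upgrade(&self, doc: &mut Doc) {
        let from: usize = doc
            .get("__version")
            .and_then(|v| v.parse().ok())
            .unwrap_or(0);
        let from = from.min(self.upgrades.len()); // already-latest is a no-op
        for step in &self.upgrades[from..] {
            step(doc);
        }
        doc.insert("__version".to_string(), self.latest().to_string());
    }
}

// Example steps over hypothetical fields.
pub fn v0_to_v1(doc: &mut Doc) {
    doc.insert("email_verified".into(), "false".into()); // add a field
}
pub fn v1_to_v2(doc: &mut Doc) {
    doc.remove("legacy_flags"); // drop a field
}
```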