## Context

An update of the tarantool version assumes the following steps:

1. Restart using a new tarantool executable
2. Upgrade the database schema

The reverse steps are also possible:

1. Downgrade the database schema
2. Restart using an old tarantool executable

### Restart using a new tarantool executable

#### Procedure

A usual 'zero downtime' tarantool version update procedure is briefly described below. Let's assume that we're updating from version `vA` to version `vB`.

For each replicaset:

* A `vA` replica is stopped and a `vB` replica is started on its snapshots.
* This is repeated for all the replicas. The master is kept on `vA`.
* The `vA` master is switched to RO and one of the `vB` replicas is switched to RW (see the sketch after this list).
* The last `vA` replica (the old master) is stopped and started as a `vB` replica.

This way a replicaset is accessible for read requests all the time and accessible for write requests almost all the time, except during the master switch step.
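
For illustration, here is a minimal sketch of the master switch step, assuming manual leadership management via `box.cfg` (with the declarative configuration of tarantool 3.x or an external failover coordinator the switch goes through the configuration or the failover tooling instead):

```lua
-- On the vA master: stop accepting writes.
box.cfg{read_only = true}

-- On the chosen vB replica: wait until it has applied everything from the
-- old master (compare box.info.vclock on both instances), then allow writes.
box.cfg{read_only = false}
```
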
If downtime is acceptable for the given replicaset (typical for a dataless router when requests are balanced over several of them), the steps are simpler:

* Stop the `vA` instance(s).
* Start the `vB` instance(s) using the existing snapshot(s).

> [!NOTE]
> This procedure should be repeated for all the replicasets in the cluster.

Now the service runs on the new tarantool version, but the next step is sometimes required to unblock all the new features of the new tarantool version.

### Upgrade the database schema

For each replicaset (a sketch of the whole sequence follows the notes below):

* Run `box.schema.upgrade()` and then `box.snapshot()` on the master.
* For each replica:
  * Wait until the replica applies the transactions produced by `box.schema.upgrade()` (compare vclocks[^1]).
  * Run `box.snapshot()`.

> [!NOTE]
> This procedure should be repeated for all the replicasets in the cluster.

> [!NOTE]
> Creating a snapshot is required for some of the upgrades, because the upgrade may perform operations that are normally forbidden. The triggers that would forbid them are disabled for the fiber that performs the upgrade. However, if the instance is restarted and the upgrade operations are read from the WAL and replayed, an error occurs, because the triggers are enabled at that point. The result is that the instance can't be started.
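
Below is a sketch of the whole per-replicaset sequence, written as an external Lua script. The instance URIs, the polling loop and the `vclock_reached()` helper are illustrative assumptions; only `box.schema.upgrade()`, `box.snapshot()`, `box.info.vclock` and `net.box` are real APIs.

```lua
-- A sketch of the per-replicaset upgrade sequence. The URIs and the
-- vclock_reached() helper are hypothetical.
local netbox = require('net.box')
local fiber = require('fiber')

local master = netbox.connect('master:3301')
local replicas = {
    netbox.connect('replica1:3301'),
    netbox.connect('replica2:3301'),
}

-- On the master: upgrade the schema and persist the result in a snapshot.
master:eval('box.schema.upgrade()')
master:eval('box.snapshot()')
local master_vclock = master:eval('return box.info.vclock')

-- Hypothetical helper: true when `current` has reached `target` for every
-- vclock component (see the footnote about possible comparison subtleties).
local function vclock_reached(target, current)
    for id, lsn in pairs(target) do
        if (current[id] or 0) < lsn then
            return false
        end
    end
    return true
end

-- On each replica: wait until the upgrade transactions are applied, then
-- take a snapshot so the DDL is not replayed from the WAL on restart.
for _, replica in ipairs(replicas) do
    while not vclock_reached(master_vclock, replica:eval('return box.info.vclock')) do
        fiber.sleep(0.1)
    end
    replica:eval('box.snapshot()')
end
```
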
Now all the features of the new tarantool version are unblocked.

### Downgrade the database schema

Same as the upgrade, but `box.schema.downgrade()` should be called instead of `box.schema.upgrade()`.
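
A minimal sketch of the master-side part, assuming that `box.schema.downgrade()` expects the target schema version as a string argument (the exact signature and the set of supported target versions should be checked against the tarantool version in use):

```lua
-- Master-side part of the downgrade; '2.11.1' is just an example target,
-- and the string-argument signature of box.schema.downgrade() is an
-- assumption to verify against the documentation of your tarantool version.
box.schema.downgrade('2.11.1')
box.snapshot()

-- The per-replica part (waiting for the master's vclock and running
-- box.snapshot()) is the same as for the upgrade.
```
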

### Restart using an old tarantool executable

The same as the restart using the new tarantool version, but the new-version instances are stopped and the old-version ones are started.

## Proposal

It is a bit complicated to perform all the required operations for the database schema upgrade/downgrade manually on masters and replicas in the described order and with the appropriate waiting for synchronization.

It would be nice to have some automation that performs these steps and reports the progress and the accumulated result.

My proposal is to implement the steps from the 'upgrade the database schema' section above as a `tt` command. We may also need a companion command that performs the downgrade and a command that reports the state of the cluster with regard to the database schema version.

If we do, it would also be nice to update our upgrade documentation with a suggestion to use the commands instead of the manual steps.

The nice part is that (as far as I understand) `tt` already knows the list of instances in the cluster and how they're grouped into replicasets (at least in the case of the declarative configuration from tarantool 3.x), knows how to determine which instance is RW and which is RO, and is able to execute commands on both local and remote instances. It looks like a ready framework for implementing such administration scenarios.

(After some discussion we agreed to move remote instance support to a separate step, see #968. It is still in our plans, we'll just do it separately.)

[^1]: Should we take https://github.com/tarantool/tarantool/issues/10142 into account in this case?

Also tracked in TNTP-363. Part of TNTP-41.