MOTIVATION:
We are going to introduce new config concepts/options. This will change both the JSON state format and the C* storage table format. When the user upgrades the Scheduler, its JSON state/C* table should be migrated automatically to the new format.
For that purpose we need a simple mechanism to migrate the state
to a newer version (forward direction only).
The proposed approach is as follows:
add a Scheduler.version: Version field containing the current version;
add a version attribute to the JSON state (if absent, we assume 0.2.1.2, or whichever version is current at the time of implementation);
if Scheduler.version is greater than the JSON state version, apply the series of migration classes matching that version interval;
log the applied migrations.
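The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the proposal names a Version type but does not define it, so the Version case class, the simplified JsonMigration step and the Migrator object are hypothetical stand-ins, and stamping the version after every applied step is one possible reading of "update version".

```scala
// Hypothetical stand-in: the proposal names a Version type but does not define it.
case class Version(parts: List[Int]) extends Ordered[Version] {
  // Compare part by part, padding the shorter version with zeros.
  def compare(that: Version): Int =
    parts.zipAll(that.parts, 0, 0)
      .collectFirst { case (a, b) if a != b => Integer.compare(a, b) }
      .getOrElse(0)
  override def toString: String = parts.mkString(".")
}

object Version {
  def parse(s: String): Version = Version(s.split('.').toList.map(_.toInt))
}

// Simplified migration step that only handles JSON state (assumption for this sketch).
case class JsonMigration(version: Version, migrate: Map[String, Any] => Map[String, Any])

object Migrator {
  val DefaultVersion: Version = Version.parse("0.2.1.2")

  // Version stored in the state, or the default when the attribute is absent.
  def stateVersion(json: Map[String, Any]): Version =
    json.get("version").map(v => Version.parse(v.toString)).getOrElse(DefaultVersion)

  // Apply, in ascending order, every migration in (stateVersion, schedulerVersion].
  def migrate(json: Map[String, Any], schedulerVersion: Version,
              all: Seq[JsonMigration]): Map[String, Any] = {
    val from = stateVersion(json)
    all
      .filter(m => m.version > from && m.version <= schedulerVersion)
      .sortWith(_.version < _.version)
      .foldLeft(json) { (state, m) =>
        println(s"Applied migration to ${m.version}") // log about applied migrations
        m.migrate(state) + ("version" -> m.version.toString)
      }
  }
}
```

When the state is already at the scheduler's version, the filter selects nothing and the state is returned unchanged, which is what lets a restart be repeated safely.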
Open question: we may need to support migrations in both directions, forward and backward. This would allow the user to downgrade the scheduler if required.
PROPOSED CHANGE:
NOTE: because the C* storage state table is shared between many DSE frameworks with different namespaces, the implementation assumes that all those frameworks will be stopped and restarted together. The framework that starts first performs the data migration; a framework started afterwards does not migrate anything, because the migration has already been done by the previous one.
NOTE: default version is 0.2.1.2
A Scheduler of version 0.2.1.3 is able to migrate state from version 0.2.1.2 for all storage types (file, zk, cassandra). For example, if you are running version 0.2.1.2 and want to upgrade to 0.2.1.3, stop the scheduler (for C* storage, stop all running schedulers that share the state table) and then start the new scheduler (for C*, start the schedulers sequentially: the first one migrates the state table, the following ones just start without migrating anything).
How C* state migration works:
if the version table does not exist, it is created, assuming the current version is 0.2.1.2
the migration for 0.2.1.3 alters the state table:
alter table dse_mesos.dse_mesos_framework add cluster_jmx_remote boolean;
alter table dse_mesos.dse_mesos_framework add cluster_jmx_user text;
alter table dse_mesos.dse_mesos_framework add cluster_jmx_password text;
alter table dse_mesos.dse_mesos_framework add node_failover_delay text;
alter table dse_mesos.dse_mesos_framework add node_failover_max_delay text;
alter table dse_mesos.dse_mesos_framework add node_failover_max_tries int;
alter table dse_mesos.dse_mesos_framework add node_failover_failures int;
alter table dse_mesos.dse_mesos_framework add node_failover_failure_time timestamp;
and also sets default values:
update dse_mesos.dse_mesos_framework
set
  cluster_jmx_remote = false,
  node_failover_delay = '3m',
  node_failover_max_delay = '30m',
  node_failover_failures = 0
where
  namespace = NAMESPACE and
  framework_id = FRAMEWORK_ID and
  cluster_id = CLUSTER_ID and
  node_id = NODE_ID;
after each migration, the version table is updated with the corresponding migration version
To update the version, the version table is truncated and the latest version value is then inserted.
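The version-table handling described above might look roughly like this. It is a sketch only: the proposal does not give the version table's name or columns, so dse_mesos.dse_mesos_version and its single version column are hypothetical, and CqlExecutor is an assumed thin wrapper that stands in for the driver session so the example runs without a cluster.

```scala
import scala.collection.mutable.ArrayBuffer

// Assumption: a tiny abstraction over the driver session, not part of the proposal.
trait CqlExecutor { def execute(cql: String): Unit }

object VersionTable {
  // Hypothetical table and column names; the proposal does not spell them out.
  val Table = "dse_mesos.dse_mesos_version"

  // Create the version table on first start of a migrating scheduler.
  def createIfMissing(exec: CqlExecutor): Unit =
    exec.execute(s"create table if not exists $Table (version text primary key);")

  // Replace the stored version: truncate, then insert the latest value.
  def update(exec: CqlExecutor, version: String): Unit = {
    exec.execute(s"truncate $Table;")
    exec.execute(s"insert into $Table (version) values ('$version');")
  }
}

// In-memory executor that records statements instead of talking to C*.
class RecordingExecutor extends CqlExecutor {
  val statements: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def execute(cql: String): Unit = { statements += cql; () }
}
```

Truncate followed by insert is two separate statements rather than one atomic step, which fits the requirement that schedulers sharing the state table are started one after another.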
How Zookeeper state migration works:
create path if not empty
fetch the znode content for the given path and parse it as JSON
if the version is missing, the default version 0.2.1.2 is assumed
apply the JSON transformation, which adds the jmxRemote property (value false) and the failover property with delay and maxDelay properties (3m and 30m respectively)
save the updated JSON into the znode and update the version in the JSON
How file state migration works: same as for Zookeeper, except that the state is stored in a file.
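The JSON transformation shared by the Zookeeper and file storages could be sketched as below. The object name Migration0213 is hypothetical, and the sketch assumes that jmxRemote and failover are added directly to the map it receives; where these properties actually sit in the nested state (cluster level versus node level) is not stated here and would have to follow the real state layout.

```scala
object Migration0213 {
  // Adds the properties introduced in 0.2.1.3 and stamps the new version.
  // Assumption: the properties live at the top level of the given map.
  def migrateJson(json: Map[String, Any]): Map[String, Any] = {
    val failover = Map[String, Any]("delay" -> "3m", "maxDelay" -> "30m")
    json +
      ("jmxRemote" -> false) +
      ("failover" -> failover) +
      ("version" -> "0.2.1.3")
  }
}
```

Keeping the transformation a pure function from map to map means the same code can serve both storages; only reading and writing the JSON text differs.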
NEW OR CHANGED PUBLIC INTERFACES:
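trait Migration {
  // version this migration brings the state to
  val version: Version
  // transforms JSON state (file and Zookeeper storages)
  def migrateJson(json: Map[String, Any]): Map[String, Any]
  // alters and updates the C* state table
  def migrateCassandra(session: Session): Unit
}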
MIGRATION PLAN AND COMPATIBILITY:
all changes are compatible
REJECTED ALTERNATIVES:
C* state table:
one table for each version (so that other running schedulers do not have to be stopped); the obvious disadvantages are that it is difficult to manage and makes operations very complicated
one state table with JSON fields (describing a node); unfortunately this approach does not allow querying the data
RESULT: better user experience, scheduler of version 0.2.1.3 is able to migrate state from version 0.2.1.2