[Evergreen] Dynamic cluster scheduling

benesch commented 2 years ago

Background

Today, each cluster in Materialize corresponds to a StatefulSet in Kubernetes with largely static constraints, like "place this service in this AZ" or "use this CPU and memory limit."

This works well for customers who want fine-grained control over their infrastructure. It works less well for customers who don't want that control, and want Materialize to just do the right thing by default.

Proposal

We should create a dynamic cluster scheduler that applies flexible policies. E.g.:

Single-replica compute clusters should be placed in the same AZ as environmentd to minimize inter-AZ bandwidth costs. This may require moving replicas when environmentd fails over to a new AZ. (MaterializeInc/cloud#3593)
Replicas of a cluster that do not have an AZ constraint should automatically balance themselves over the available AZs. Right now, the AZ of a replica is assigned at creation, and is not updated when new replicas with constraints are added or removed.
Services that allow autoscaling should size up and down in response to load as necessary.
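
To make the AZ-balancing policy concrete, here is a minimal sketch of how unconstrained replicas could be spread over the available AZs. Everything in it is an assumption for illustration: the function name `assign_azs`, the AZ names, and the least-loaded-first rule are not taken from the Materialize codebase.

```python
from collections import Counter


def assign_azs(available_azs, pinned, num_unpinned):
    """Pick AZs for replicas that have no AZ constraint (hypothetical helper).

    available_azs: AZ names the scheduler may use.
    pinned: AZs of replicas that carry an explicit AZ constraint.
    num_unpinned: number of unconstrained replicas to place.
    """
    # Start from the load contributed by constrained replicas.
    load = Counter({az: 0 for az in available_azs})
    for az in pinned:
        load[az] += 1

    placements = []
    for _ in range(num_unpinned):
        # Least-loaded AZ wins; ties break by name so the result is deterministic.
        target = min(available_azs, key=lambda az: (load[az], az))
        load[target] += 1
        placements.append(target)
    return placements
```

Rerunning this whenever a constrained replica is added or removed yields a fresh desired placement; comparing it with the current placement tells the scheduler which replicas would need to move.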

The scheduler needs to effect all these changes without causing downtime. E.g., when moving a replica between AZs, it should spin up a new replica in the new AZ before terminating the old one.
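
The create-before-terminate ordering could look roughly like the sketch below. `FakeCluster` is an in-memory stand-in invented for this example, not a Materialize or Kubernetes API, and its `wait_until_hydrated` simply marks the replica ready where a real scheduler would poll replica status.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    az: str
    hydrated: bool = False


@dataclass
class FakeCluster:
    """In-memory stand-in for a cluster (hypothetical, for illustration only)."""
    replicas: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def create_replica(self, name, az):
        self.replicas[name] = Replica(name, az)
        self.log.append(f"create {name} in {az}")

    def wait_until_hydrated(self, name):
        # A real scheduler would block here until the replica has caught up.
        self.replicas[name].hydrated = True
        self.log.append(f"hydrated {name}")

    def drop_replica(self, name):
        del self.replicas[name]
        self.log.append(f"drop {name}")


def move_replica(cluster, old_name, new_name, new_az):
    """Move a replica to new_az without a gap in service."""
    cluster.create_replica(new_name, new_az)
    cluster.wait_until_hydrated(new_name)
    # Retire the old replica only after the replacement is ready.
    cluster.drop_replica(old_name)
```

The cost of this ordering is that the cluster briefly runs one extra replica, which is the trade for avoiding downtime.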

Outstanding work

cc @jseldess

chuck-alt-delete commented 1 year ago

Trying to clarify: would this capability enable a “self destruct” like capability where someone could create a temporary cluster or temporary source? This kind of functionality could have a big impact on go-to-market strategy.

benesch commented 1 year ago

would this capability enable a “self destruct” like capability where someone could create a temporary cluster or temporary source?

Yep, it totally could!

benesch commented 1 year ago

Posting some very loose syntax proposals from a recent Slack conversation on this topic:

-- Create a cluster with automatically managed replicas.
CREATE CLUSTER foo REPLICATION FACTOR 2, SIZE 'medium';

-- Size up the cluster. This automatically spins up a new replica
-- at the new size, waits for it to catch up,
-- and then spins down the old replica.
ALTER CLUSTER foo SIZE 'large';

-- Add a new replica automatically.
ALTER CLUSTER foo REPLICATION FACTOR 3;

-- Turn off the cluster for the night.
ALTER CLUSTER foo REPLICATION FACTOR 0;

-- One day...
CREATE CLUSTER blah;
-- ...will create a cluster that automatically scales up and down in
-- response to workload.
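
One way to read the statements above is as a desired state that the scheduler converges toward. The sketch below plans the steps for a given REPLICATION FACTOR and SIZE. The function `plan_changes`, the `r1`, `r2`, ... replica naming, and the step tuples are assumptions made for illustration, not proposed behavior.

```python
def plan_changes(current, factor, size):
    """Return ordered steps that converge `current` to the desired spec.

    current: dict mapping replica name -> size, e.g. {"r1": "medium"}.
    factor, size: the desired REPLICATION FACTOR and SIZE.
    """
    # Replicas that already have the right size can stay, up to `factor` of them.
    keep = [name for name in sorted(current) if current[name] == size][:factor]
    stale = [name for name in sorted(current) if name not in keep]

    # Choose unused names for the replicas that still need to be created.
    taken = set(current)
    fresh = []
    n = 1
    while len(keep) + len(fresh) < factor:
        name = f"r{n}"
        n += 1
        if name not in taken:
            fresh.append(name)

    # Create first, wait for catch-up, and only then drop outdated replicas.
    steps = [("create", name, size) for name in fresh]
    steps += [("wait_for_catch_up", name) for name in fresh]
    steps += [("drop", name) for name in stale]
    return steps
```

Under these assumptions, sizing a one-replica cluster from 'medium' to 'large' plans a create, a wait, and then a drop, while REPLICATION FACTOR 0 plans only drops.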

benesch commented 10 months ago

We have two more specific issues that would require dynamic cluster scheduling:

#22568
#23132
