Open nathanklick opened 11 months ago
- solo should allow the user to take a backup of the state.
- solo does not need to support other storage backends, such as S3 buckets, for now.
- solo should store the state across all clusters.
- solo should acquire a lease on every cluster before taking action. Only the owner of the majority of the locks wins; the others should release their locks. Locks should expire (after ~180s) so that a deadlock cannot occur.
- solo should provide useful details if it cannot acquire the locks and tell the user to wait or retry.
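
The lease requirements above can be sketched with an in-memory model. This is only an illustration of the majority rule and the ~180s expiry, not how Solo implements it; the names `Lease` and `try_acquire_majority` are hypothetical, and a real implementation would read and write leases through the Kubernetes API of each cluster rather than a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

LEASE_TTL_SECONDS = 180  # expiry suggested in the requirements above


@dataclass
class Lease:
    holder: str
    expires_at: float  # seconds on whatever clock the caller passes as `now`


def try_acquire_majority(
    leases: Dict[str, Optional[Lease]],
    owner: str,
    now: float,
    ttl: float = LEASE_TTL_SECONDS,
) -> Tuple[bool, List[str]]:
    """Try to take the lease on every cluster; keep them only on a majority.

    `leases` maps a cluster name to its current lease (None if unheld).
    Returns (won, messages); messages describe each cluster that was busy.
    """
    taken: List[str] = []
    messages: List[str] = []
    for cluster, lease in leases.items():
        # A lease is available if unheld, expired, or already ours.
        free = lease is None or lease.expires_at <= now or lease.holder == owner
        if free:
            leases[cluster] = Lease(holder=owner, expires_at=now + ttl)
            taken.append(cluster)
        else:
            wait = int(lease.expires_at - now)
            messages.append(
                f"{cluster}: held by {lease.holder}, expires in ~{wait}s; wait or retry"
            )
    if len(taken) * 2 > len(leases):
        return True, messages
    # A minority holder backs off so that the majority owner can proceed.
    for cluster in taken:
        leases[cluster] = None
    return False, messages
```

The returned messages are what the last requirement asks for: when acquisition fails, the caller can print who holds each lease and roughly how long to wait before retrying.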
Description
Solo currently stores all CLI argument values used to initialize the last network deployment in the .solo/solo.config file in the user's home directory. This is brittle and does not scale across multiple engineers and staff members. Additionally, the current configuration file stores only the most recently used CLI options and values. This does not work well when managing multiple deployments that may be configured differently.
State
The state of a given deployment should consist of the values that drive deployment decisions made by Solo. It should also describe critical properties of the actual deployment, such as the number of consensus nodes deployed and the features enabled.
Certain command-line options and other values should not be included in the state, such as the logging level of the Solo tool or other user-specific values. Additionally, values that can be directly derived from the actual K8S deployment and manifests, such as Helm values files, should not be included in the state.
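
As a rough illustration of that split, the sketch below filters a dictionary of CLI values down to the ones that would be persisted. All key names here (`log_level`, `dev_mode`, `cache_dir`, `helm_values_file`, and the function `build_deployment_state`) are hypothetical and do not correspond to actual Solo flags.

```python
from typing import Any, Dict

# Hypothetical key names, used only for illustration.
USER_SPECIFIC_KEYS = {"log_level", "dev_mode", "cache_dir"}
DERIVABLE_KEYS = {"helm_values_file"}  # recoverable from the K8S deployment itself


def build_deployment_state(cli_args: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the values that drive deployment decisions."""
    excluded = USER_SPECIFIC_KEYS | DERIVABLE_KEYS
    return {key: value for key, value in sorted(cli_args.items()) if key not in excluded}
```

An explicit exclusion list is one option; an allow-list of known state fields would be stricter and may be the safer choice if new CLI options are added often.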
Storage
The state of a given deployment should be stored in the Kubernetes cluster alongside the actual deployment and within the deployment's namespace. The state may be stored in a K8S ConfigMap, a Secret, or an instance of a CRD.
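
For the ConfigMap option, a minimal sketch of the manifest might look like the following. The ConfigMap name `solo-deployment-state`, the data key `state.json`, and the function name are assumptions made for this example; the issue does not specify them.

```python
import json
from typing import Any, Dict


def state_configmap(namespace: str, state: Dict[str, Any]) -> Dict[str, Any]:
    """Build a ConfigMap manifest holding the deployment state as JSON."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "solo-deployment-state",  # hypothetical name
            "namespace": namespace,  # same namespace as the deployment
        },
        # ConfigMap data values must be strings, so the state is serialized.
        "data": {"state.json": json.dumps(state, sort_keys=True)},
    }
```

The resulting dictionary can be dumped to JSON or YAML and applied with any Kubernetes client. A Secret would be the better fit if the state ever contains sensitive values.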
Locking
The operations for a given deployment must be limited to a single engineer/user at a time. Therefore, locking should be implemented using the K8S cluster and deployment namespace. The lock should be implemented as a Custom Resource Definition (CRD), with the actual locks created as versioned instances of the CRD.
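
One possible shape for such a CRD and a lock instance is sketched below. The API group `solo.example.com`, the kind `DeploymentLock`, the version `v1alpha1`, and the `holder`/`expiresAt` fields are all assumptions for illustration, and "versioned" is read here as the instance carrying the CRD's API version; the issue does not define any of these.

```python
from typing import Any, Dict

GROUP = "solo.example.com"  # hypothetical API group
VERSION = "v1alpha1"  # hypothetical CRD version
PLURAL = "deploymentlocks"


def lock_crd() -> Dict[str, Any]:
    """Build a namespaced CustomResourceDefinition for deployment locks."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": f"{PLURAL}.{GROUP}"},  # must be <plural>.<group>
        "spec": {
            "group": GROUP,
            "scope": "Namespaced",
            "names": {"plural": PLURAL, "singular": "deploymentlock", "kind": "DeploymentLock"},
            "versions": [{
                "name": VERSION,
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        "required": ["holder", "expiresAt"],
                        "properties": {
                            "holder": {"type": "string"},
                            "expiresAt": {"type": "string", "format": "date-time"},
                        },
                    }},
                }},
            }],
        },
    }


def lock_instance(namespace: str, holder: str, expires_at: str) -> Dict[str, Any]:
    """Build one lock object in the deployment's namespace."""
    return {
        "apiVersion": f"{GROUP}/{VERSION}",
        "kind": "DeploymentLock",
        "metadata": {"name": "deployment-lock", "namespace": namespace},
        "spec": {"holder": holder, "expiresAt": expires_at},
    }
```

Using a fixed object name per namespace means a second create attempt is rejected by the API server as a conflict, which is what gives the lock its mutual exclusion.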