Sherpa is a highly available, fast, and flexible horizontal job scaler for HashiCorp Nomad. It can run in a number of different modes to suit different requirements, and can scale based on Nomad resource metrics or external sources.
Mozilla Public License 2.0
Add Sherpa cluster leadership allowing HA server deployments. #45
In its current state, Sherpa does not support HA deployments and
relies on operators running a single instance. If two Sherpa
servers are running against the same logical Nomad cluster, each
server will run independently and make its own scaling decisions
and actions. This can cause problems for the Nomad workloads being
scaled and can also add unnecessary load on the Nomad API.
This change adds leadership locking to the Sherpa server. A Sherpa
server with leadership is able to run the policy GC and autoscaler,
and respond to all API requests. A Sherpa server that does not
have leadership, and is therefore classed as a standby, can only
respond to API requests on the system and UI paths. Requests to
all other paths currently result in an HTTP redirect being
returned to the client.
In the event of a leadership change, the new active server will
start up the required sub-processes and gain the ability to respond
to all API requests. The previous leader will gracefully stop all
processes a standby should not run, and continue its leader
election loop unless the server is shutting down.
Closes #42