mlowicki / rhythm

Time-based job scheduler for Apache Mesos
MIT License

Follow multi-scheduler scalability guidelines #9

Closed mlowicki closed 6 years ago

mlowicki commented 6 years ago

https://mesos.apache.org/documentation/latest/app-framework-development-guide/#multi-scheduler-scalability

In a nutshell:

1. Use Suppress
2. Do not hold onto offers
3. Decline resources using a large timeout
4. Do not REVIVE frequently

mlowicki commented 6 years ago

Re 3: we can use calls.RefuseSeconds. A sample use is at https://github.com/mesos/mesos-go/blob/29de6ff97b48c29cb5ac07029ed75186e5ba0eed/api/v1/cmd/msh/msh.go#L229
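
As a rough illustration of point 3, the sketch below picks the refuse timeout to attach when declining. It deliberately does not import mesos-go: the `filters` type, the `refuseFilter` helper, the one-hour value, and the "short timeout while jobs are waiting" policy are all assumptions made for the example, not rhythm's code. In a real scheduler the chosen duration would be handed to calls.RefuseSeconds as in the linked msh example. The 5-second constant mirrors the Mesos default for refuse_seconds.

```go
package main

import "time"

// filters is a hypothetical stand-in for the Mesos Filters message,
// which carries refuse_seconds as a floating-point number of seconds.
type filters struct {
	RefuseSeconds float64
}

const (
	// Mesos' default refusal period when no filter is set.
	defaultRefuse = 5 * time.Second
	// Assumed "large" timeout; the right value depends on the deployment.
	longRefuse = time.Hour
)

// refuseFilter picks how long the master should withhold declined
// resources from this framework. With nothing to run, a long timeout
// keeps the resources available to other schedulers.
func refuseFilter(runnableJobs int) filters {
	d := longRefuse
	if runnableJobs > 0 {
		d = defaultRefuse
	}
	return filters{RefuseSeconds: d.Seconds()}
}
```

With these constants, `refuseFilter(0)` asks for a 3600-second refusal and `refuseFilter(2)` falls back to 5 seconds.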

mlowicki commented 6 years ago

2 is already done: while handling an offer, the list of runnable jobs is retrieved (a response from ZK) and the tasks are launched right away, so unused resources are treated as declined.

mlowicki commented 6 years ago

The idea is to have an entity called OffersController which:

There should be metrics for the number of REVIVE and SUPPRESS calls.
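
One possible shape for such a controller is sketched below, purely as an assumption; it is not taken from the fixing commit. It suppresses offers when nothing is runnable, revives them when work appears, refuses to REVIVE more often than a minimum interval (point 4), and counts both kinds of calls. The `suppress` and `revive` callbacks are hypothetical hooks standing in for the real scheduler calls, and the plain integer counters stand in for whatever metrics library is used.

```go
package main

import (
	"sync"
	"time"
)

// OffersController is a hypothetical sketch of the idea above.
type OffersController struct {
	mu          sync.Mutex
	suppressed  bool
	lastRevive  time.Time
	minInterval time.Duration
	suppress    func() error // sends SUPPRESS to the master
	revive      func() error // sends REVIVE to the master

	// Counters standing in for real metrics.
	Suppresses int
	Revives    int
}

// NewOffersController wires the callbacks and the minimum time
// allowed between two REVIVE calls.
func NewOffersController(minInterval time.Duration, suppress, revive func() error) *OffersController {
	return &OffersController{minInterval: minInterval, suppress: suppress, revive: revive}
}

// Update is called whenever the number of runnable jobs is known.
func (c *OffersController) Update(runnableJobs int, now time.Time) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch {
	case runnableJobs == 0 && !c.suppressed:
		if err := c.suppress(); err != nil {
			return err
		}
		c.suppressed = true
		c.Suppresses++
	case runnableJobs > 0 && c.suppressed:
		if now.Sub(c.lastRevive) < c.minInterval {
			return nil // too soon after the last REVIVE; retry on a later update
		}
		if err := c.revive(); err != nil {
			return err
		}
		c.suppressed = false
		c.lastRevive = now
		c.Revives++
	}
	return nil
}
```

A caller would invoke `Update` each time the runnable-job count is refreshed, passing `time.Now()`. Taking the time as a parameter keeps the rate limit easy to test.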

mlowicki commented 6 years ago

Fixed by https://github.com/mlowicki/rhythm/commit/8e79dd42016eaa9e9686918fc5a3cea82b3b5504.