adobe-apiplatform / mesos-actor

An Akka based Mesos actor for creating Mesos frameworks.
Apache License 2.0

Task reconciliation after failover #2

Closed tysonnorris closed 6 years ago

tysonnorris commented 6 years ago

Also revised task launching and killing in SampleHAFramework.

tysonnorris commented 6 years ago

Each instance starts with ClusterSingletonManager.props, which only executes when the singleton instance is actually created. The first creation happens once the cluster has established which node is the oldest. Each node also sends the singleton a Subscribe message, solely to retrieve the framework ID. When the singleton's host node fails, the new singleton is created from the props on its new host node, which by then holds the framework ID created by the original singleton. This is basically a simplistic way of sharing the minimal set of data required (the framework ID) to reestablish the singleton on the new host. If you have ideas on how to streamline or simplify this, let's discuss; I keep thinking there should be a better way, but haven't surfaced one yet :)
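
To make the flow above easier to discuss, here is a rough, dependency-free model of it in plain Scala. It does not use Akka or Mesos and is not the code in this PR: `Node`, `FrameworkSingleton`, `createSingleton`, and the `framework-from-…` ID format are hypothetical names invented for illustration, and the ID is generated locally here, whereas a real framework receives it from the Mesos master.

```scala
// Illustrative model only: no Akka, no Mesos. All names are hypothetical.
object FailoverSketch {
  // Stand-in for the singleton actor that owns the Mesos framework registration.
  final class FrameworkSingleton(val frameworkId: String)

  // Stand-in for one cluster node.
  final class Node(val name: String) {
    // Framework ID learned by subscribing; None until this node has subscribed.
    var knownFrameworkId: Option[String] = None

    // Plays the role of the props handed to ClusterSingletonManager:
    // evaluated only when this node actually becomes the singleton host.
    // Assumption: with no known ID, a fresh one is made up locally.
    def createSingleton(): FrameworkSingleton =
      new FrameworkSingleton(knownFrameworkId.getOrElse(s"framework-from-$name"))

    // Models the Subscribe round trip: the node records the singleton's framework ID.
    def subscribe(singleton: FrameworkSingleton): Unit =
      knownFrameworkId = Some(singleton.frameworkId)
  }

  // Oldest node hosts the singleton, every node subscribes,
  // the oldest node fails, and the next-oldest node recreates the singleton.
  def run(): (String, String) = {
    val nodes = List(new Node("a"), new Node("b"), new Node("c"))
    val original = nodes.head.createSingleton()
    nodes.foreach(_.subscribe(original))

    val survivors = nodes.tail // node "a" has failed
    val replacement = survivors.head.createSingleton()
    (original.frameworkId, replacement.frameworkId)
  }
}
```

In this model, `FailoverSketch.run()` returns the same framework ID for the original and the replacement singleton, because the surviving node cached the ID when it subscribed. That cached value is the only state that has to survive the failover.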