Automates logviewer launch on every worker host running Storm bits (1.x specific)
The state of these logviewer daemons is tracked per host in ZooKeeper
Reconciliation is also implemented to ensure that logviewers are relaunched if they are ever killed.
ZKClient.java
Wrapper around Apache Curator for interacting with ZK
Can create, delete, and check for existence of nodes in ZK
Used to track which hosts are running logviewer daemons by recording that state in ZK
JUnit tests are written for this class
MesosNimbus.java
Added TimerTask to perform implicit reconciliation every 5 minutes
Implicit reconciliation reconciles the state of every running task between the framework scheduler and the master (the master sends the status of each task to the statusUpdate method in NimbusMesosScheduler)
The launchLogviewer method, called from allSlotsAvailableForScheduling, loops through existingSupervisors and schedules a logviewer daemon as a Mesos task on each host, based on the offers available on that host
Checks ZK to see whether a logviewer is already running on the host where the supervisor is running; because only supervisor hosts are considered, logviewers run only on hosts with Storm bits
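The launch pass described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `LogviewerRegistry` is a hypothetical in-memory stand-in for the ZK-backed state (the real code goes through ZKClient), offers are reduced to a CPU amount per host, and recording the host at launch time is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the per-host logviewer state kept in ZK.
class LogviewerRegistry {
    private final Set<String> hostsWithLogviewer = new HashSet<>();

    boolean isRunningOn(String host) {
        return hostsWithLogviewer.contains(host);
    }

    void markRunning(String host) {
        hostsWithLogviewer.add(host);
    }
}

class LogviewerLauncher {
    // Returns the hosts for which a logviewer launch was requested in this pass.
    static List<String> launchLogviewers(List<String> supervisorHosts,
                                         Map<String, Double> offeredCpusByHost,
                                         double requiredCpus,
                                         LogviewerRegistry registry) {
        List<String> launched = new ArrayList<>();
        for (String host : supervisorHosts) {
            // Skip hosts that already have a logviewer recorded.
            if (registry.isRunningOn(host)) {
                continue;
            }
            // Skip hosts with no offer, or an offer too small for the logviewer.
            Double offered = offeredCpusByHost.get(host);
            if (offered == null || offered < requiredCpus) {
                continue;
            }
            // Assumption: state is recorded when the launch is requested, so a
            // second supervisor on the same host does not trigger another launch.
            registry.markRunning(host);
            launched.add(host);
        }
        return launched;
    }
}
```

Only hosts that appear in the supervisor list are ever considered, which is what restricts logviewers to hosts with Storm bits.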
StormSchedulerImpl.java
Added an offersRequestTracker that tracks if any storm tasks (topologies) or sidecar tasks need offers. This ensures that offers are only suppressed when no tasks need them.
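A minimal sketch of the idea behind the tracker, assuming it is essentially a set of outstanding requesters. The class and method names here are illustrative and not taken from this PR.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: remembers which tasks (topologies or sidecars such as
// logviewer) still need offers.
class OffersRequestTracker {
    private final Set<String> requesters = ConcurrentHashMap.newKeySet();

    // Record that the given task still needs offers.
    void addRequest(String taskName) {
        requesters.add(taskName);
    }

    // Record that the given task no longer needs offers.
    void removeRequest(String taskName) {
        requesters.remove(taskName);
    }

    // Offers may be suppressed only when nothing is waiting for them.
    boolean canSuppressOffers() {
        return requesters.isEmpty();
    }
}
```

In the scheduler, a check like `canSuppressOffers()` would presumably guard the call to the Mesos `SchedulerDriver.suppressOffers()` method, with `reviveOffers()` called once a new request is added.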
NimbusMesosScheduler.java
Updated statusUpdate method to look for logviewer tasks to check if they need to be relaunched
updateLogviewerState method updates the state of the logviewer task in ZK if needed
checkRunningLogviewerState method checks whether a running Mesos task exists for a logviewer that isn't tracked in ZK and updates the ZK state if necessary
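The status handling above could look roughly like the following. This is a sketch under stated assumptions: the `logviewer-` task ID prefix, the simplified `SketchTaskState` enum (the real Mesos `TaskState` has more values), and the in-memory set standing in for ZK are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the Mesos task state; every non-RUNNING value here
// is treated as "the logviewer is gone".
enum SketchTaskState { RUNNING, FINISHED, FAILED, KILLED, LOST }

class LogviewerStateUpdater {
    // Hypothetical task ID convention used to recognize logviewer tasks.
    static final String LOGVIEWER_PREFIX = "logviewer-";

    // In-memory stand-in for the per-host state kept in ZK.
    private final Set<String> trackedHosts = new HashSet<>();

    static boolean isLogviewerTask(String taskId) {
        return taskId.startsWith(LOGVIEWER_PREFIX);
    }

    void onStatusUpdate(String taskId, String host, SketchTaskState state) {
        if (!isLogviewerTask(taskId)) {
            return; // not a logviewer task, nothing to track
        }
        if (state == SketchTaskState.RUNNING) {
            // A running logviewer that is not yet tracked gets recorded.
            trackedHosts.add(host);
        } else {
            // Clearing the state lets the next launch pass schedule it again.
            trackedHosts.remove(host);
        }
    }

    boolean isTracked(String host) {
        return trackedHosts.contains(host);
    }
}
```

Because implicit reconciliation replays the status of every running task, an untracked but running logviewer would be picked up by the `RUNNING` branch on the next reconciliation.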
MesosCommon.java
Added getMesosFrameworkName as a utility to get the framework name from storm.yaml
Modified supervisorId method to format the ID with the framework name