New Plugin Interface for custom hooks around Executor API methods

mcmadhan01 commented 5 years ago

Background

DCE's current plugin mechanism is pod lifecycle centric, and allows custom extensions to be plugged in around the steps needed to launch a pod (i.e. custom plugins can be added pre/post of image pull, compose up steps)

Requirement

There are situations that demand custom logic executions outside of pod lifecycle, and aligned with the executor's API functions. For example, we have a requirement to execute some custom logic to post error related metrics whenever the the control exits LaunchTask implementation of dce-go.

This may be applicable for other executor API methods too but scope of this issue could deal with LaunchTask and allow extensibility for all API methods.

Proposed Design

Executor Hook

ExecutorHook allows custom implementations to be plugged post execution of Docker Compose Mesos Executor's API methods

Here is a representation of proposed ExecutorHook interface

type ExecutorHook interface {
// PostExec is invoked  post execution of Docker Compose Mesos Executor's lifecycle function
PostExec(taskInfo *mesos.TaskInfo) error
// BestEffort is invoked in case a PostExec returned an error and are expected to return a bool to indicate
// if the execution needs to continue with the next available hook or not
BestEffort(execPhase string) bool
}

Config for executor hook is defined based on the method of the executor API as shown below. For current requirement explained above, supporting post execution of LaunchTask is only needed. But there are possibilities to introduce "Pre" hooks and for a different API method of the executor so the config structure is designed to support that.

...
execHooks:
   LaunchTask:
      Post: ["hook1", "hook2"]
...

So, on exit from LaunchTask, most definite thing that is done in the current DCE is to send status to mesos. So to perform the post executions, we can introduce a task status channel and have the hooks executed based on various status changes.

mbdas commented 5 years ago

Why it cannot be accommodated around task/pod lifecycle? Executor is nothing but managing the lifecycle of a task with launch and kill being the primary lifecycle operations. Dce today supports prelaunch and postKill hooks and so postLaunch can be added in same fashion.

mcmadhan01 commented 5 years ago

Why it cannot be accommodated around task/pod lifecycle? Executor is nothing but managing the lifecycle of a task with launch and kill being the primary lifecycle operations. Dce today supports prelaunch and postKill hooks and so postLaunch can be added in same fashion.

There is a postlaunchtask plugin method already available - https://github.com/paypal/dce-go/blob/develop/plugin/type.go#L28 but this is not suitable for some needs where you want a generic hook to execute no matter what. For example, I want to post some metrics (error, time taken etc) when the control exits LaunchTask but with current implementation, any failure in steps before pod launch (failures may be from image pull, preimage/postimage plugins etc) fail the task early, and never get to execute postlaunchtask, which is executed after launch of the pod in the current impl.

Also I have been thinking a mechanism to have hooks on task status transition (ie. hooks executed on task transition from starting -> running or failed, running -> failed) may make sense. Hope to hear some more point of views on this.

mbdas commented 5 years ago

Why not change the current implementation ? Having 2 sets of hooks will become confusing where to add what. I will go over the PR to see what can be consolidated. But most important thing, mesos agent expects the call back methods to finish fast because it cannot invoke any other method if blocked on existing call. So we have to ensure launchTask returns and the hooks execute async.

Regarding task transition , that happens on the mesos master. Agent may help send some task status through executor, some are embedded inside agent, some may be like lost added by master itself.

mcmadhan01 commented 5 years ago

Why not change the current implementation ? Having 2 sets of hooks will become confusing where to add what.

Original idea was to not impact the pod life cycle but allow something like pre/post of any executor API. It will break backward incompatibility for anyone with custom plugin implementations. Also the requirements are such that we want to execute these hooks no matter what happens on the pod life cycle steps. If two sets of hooks around task/pod may lead to confusion, then I think that the best approach to allow a custom hook post ExecutorAPIs would be to listen on task transitions and react. More on that below.

Regarding task transition , that happens on the mesos master. Agent may help send some task status through executor, some are embedded inside agent, some may be like lost added by master itself.

By task status transition, I meant the status update detected by DCE/Custom Plugins - to mark a pod failed, running, which in turn translates to a task status update to Mesos. This is where we can achieve a hook mechanism. Please share your thoughts.

But most important thing, mesos agent expects the call back methods to finish fast because it cannot invoke any other method if blocked on existing call. So we have to ensure launchTask returns and the hooks execute async.

Yes, it is not a blocking execution in the current PR #76 impl. It is a goroutine that listens on task status but tied just to LaunchTask. With my proposal above, this can be made generic to all Executor API methods.

mcmadhan01 commented 5 years ago

Summarizing the offline discussions with @mbdas

New Hook introduced in https://github.com/paypal/dce-go/pull/76 will be changed from Executor API centric to pod status centric as the current implementation is based on task status update and does not have context on post/pre of an executor step just with pod status
So the new hook will be made as pod status centric. This will be called as PodStatusHook
config section for hooks that currently expects hooks defined based on post and pre of Executor API will change. Hooks will now be defined based on pod status.
a go routine to listen on pod status channel will be started from DCE's LaunchTask, as we plan to handle pod status hook on LaunchTask. This will be implemented for task status changes on KillTask or other executor API method later.

paypal / dce-go

New Plugin Interface for custom hooks around Executor API methods #75