Netflix / Fenzo

Extensible Scheduler for Mesos Frameworks

Pluggable ENI fitness evaluator #169

Open tbak opened 6 years ago

tbak commented 6 years ago

Feature request

Fenzo supports the concept of a preferential named consumable resource, which models a collection of two-level resources. During task placement, the top-level resource is tagged with a name that defines its runtime profile. Multiple tasks matching the same profile can be associated with the same consumable resource, and each is allocated a portion of its sub-resources.

For example, in AWS an ENI and its security groups can be modeled as a two-level resource. The ENI is the top-level resource, the sub-resources are the IPs that can be associated with the ENI, and the runtime profile is defined by the security group(s) associated with the ENI. Tasks with identical security groups placed on the same agent may thus share a single ENI until its pool of available IPs (sub-resources) is exhausted. When the last task associated with an ENI is terminated, the ENI's runtime profile becomes undefined again.
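
The two-level model described above can be sketched as a small value type. This is only an illustration: `EniSlot`, its fields, and the `sg-a`/`sg-b` profile names used in the usage note are hypothetical and are not Fenzo types. The sketch assumes a profile is a single string and that each task needs a whole number of IPs.

```java
/** Hypothetical model of one ENI slot on an agent; not a Fenzo type. */
record EniSlot(int index, String profile, int ipsUsed, int ipLimit) {

    /** An idle slot has no tasks attached, so its profile is undefined (null here). */
    static EniSlot idle(int index, int ipLimit) {
        return new EniSlot(index, null, 0, ipLimit);
    }

    /** A task fits if the slot is idle or carries the same profile, and enough IPs remain. */
    boolean canHost(String taskProfile, int ipsNeeded) {
        boolean profileOk = profile == null || profile.equals(taskProfile);
        return profileOk && ipsUsed + ipsNeeded <= ipLimit;
    }

    /** Returns the slot after attaching a task; an idle slot takes on the task's profile. */
    EniSlot attach(String taskProfile, int ipsNeeded) {
        if (!canHost(taskProfile, ipsNeeded)) {
            throw new IllegalStateException("task does not fit ENI " + index);
        }
        return new EniSlot(index, taskProfile, ipsUsed + ipsNeeded, ipLimit);
    }

    /** Returns the slot after a task ends; the profile is cleared once no IPs remain in use. */
    EniSlot release(int ipsFreed) {
        int remaining = Math.max(0, ipsUsed - ipsFreed);
        return new EniSlot(index, remaining == 0 ? null : profile, remaining, ipLimit);
    }
}
```

With a limit of two IPs, a slot that has hosted one `sg-a` task rejects an `sg-b` task, accepts one more `sg-a` task, and becomes available to any profile again once both IPs are released.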

Because calling the AWS API is expensive, it makes sense to reduce the number of calls that configure the network stack by reusing already provisioned resources. This means Fenzo should favor placing a task on an agent/ENI slot that already holds the required resources. Because Fenzo has limited insight into this state (unless a task is already associated with an ENI), we need a pluggable API to externalize the evaluation.

Implementation proposal

To achieve this goal, two new callback interfaces are proposed. PreferentialNamedConsumableResourceEvaluator computes a fitness score for each valid task/ENI assignment. SchedulingEventListener provides notifications from within the scheduling loop, so that newly placed tasks can be accounted for during the fitness calculation.

/**
 * Evaluator for the {@link PreferentialNamedConsumableResource} selection process. Given an agent with a matching
 * ENI slot (either empty or with a matching name), this evaluator computes the fitness score.
 * A custom implementation can provide fitness calculators augmented with additional information not available to
 * Fenzo, so that the best placement decision can be made.
 *
 * <h1>Example</h1>
 * {@link PreferentialNamedConsumableResource} can be used to model AWS ENIs together with their IP and security
 * group assignments. To minimize the number of AWS API calls and to improve efficiency, it is beneficial to place
 * a task on an agent that has an ENI with a matching security group profile, so the ENI can be reused. Likewise,
 * if a task is terminated but the agent releases its resources lazily, they can be reused by another task with a
 * matching profile.
 */
public interface PreferentialNamedConsumableResourceEvaluator {

    /**
     * Provide fitness score for an idle consumable resource.
     *
     * @param hostname hostname of an agent
     * @param resourceName name to be associated with a resource with the given index
     * @param index a consumable resource index
     * @param subResourcesNeeded an amount of sub-resources required by a scheduled task
     * @param subResourcesLimit a total amount of sub-resources available
     * @return fitness score
     */
    double evaluateIdle(String hostname, String resourceName, int index, double subResourcesNeeded, double subResourcesLimit);

    /**
     * Provide fitness score for a consumable resource that is already associated with some tasks. These tasks and
     * the current one have matching profiles, so they can share the resource.
     *
     * @param hostname hostname of an agent
     * @param resourceName name associated with a resource with the given index
     * @param index a consumable resource index
     * @param subResourcesNeeded an amount of sub-resources required by a scheduled task
     * @param subResourcesUsed an amount of sub-resources already used by other tasks
     * @param subResourcesLimit a total amount of sub-resources available
     * @return fitness score
     */
    double evaluate(String hostname, String resourceName, int index, double subResourcesNeeded, double subResourcesUsed, double subResourcesLimit);
}
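
As an illustration of how a framework might implement this interface, here is a minimal sketch. `WarmEniEvaluator`, the `lastProfileBySlot` map, the `hostname#index` key format, and the score constants are all assumptions made for the example; the proposal does not prescribe a score range, and the sketch assumes scores in [0, 1]. The interface is repeated only so the sketch compiles on its own.

```java
import java.util.Map;

// Copied from the proposal so that the sketch compiles on its own.
interface PreferentialNamedConsumableResourceEvaluator {
    double evaluateIdle(String hostname, String resourceName, int index,
                        double subResourcesNeeded, double subResourcesLimit);

    double evaluate(String hostname, String resourceName, int index,
                    double subResourcesNeeded, double subResourcesUsed, double subResourcesLimit);
}

/**
 * Hypothetical evaluator. lastProfileBySlot maps "hostname#index" to the profile an agent
 * last configured on that ENI; the framework would fill it from agent reports, since this
 * is information Fenzo does not have.
 */
record WarmEniEvaluator(Map<String, String> lastProfileBySlot)
        implements PreferentialNamedConsumableResourceEvaluator {

    @Override
    public double evaluateIdle(String hostname, String resourceName, int index,
                               double subResourcesNeeded, double subResourcesLimit) {
        String last = lastProfileBySlot.get(hostname + "#" + index);
        // An idle ENI still configured for this profile can be reused without reconfiguring
        // it; any other idle ENI must be reconfigured first, so it gets a low score.
        return resourceName.equals(last) ? 0.5 : 0.1;
    }

    @Override
    public double evaluate(String hostname, String resourceName, int index,
                           double subResourcesNeeded, double subResourcesUsed, double subResourcesLimit) {
        if (subResourcesLimit <= 0) {
            return 0.0; // nothing can be allocated from an empty pool
        }
        // An ENI already in use with a matching profile is preferred over any idle one;
        // fuller ENIs score higher so that other ENIs stay free for other profiles.
        double fill = (subResourcesUsed + subResourcesNeeded) / subResourcesLimit;
        return 0.5 + 0.5 * Math.min(1.0, fill);
    }
}
```

The ordering is the point of the sketch rather than the exact numbers: an ENI in use with a matching profile scores above an idle ENI that is still configured for that profile, which in turn scores above an idle ENI that would need reconfiguring.
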
/**
 * A callback API providing notification about Fenzo task placement decisions during the scheduling process.
 */
public interface SchedulingEventListener {

    /**
     * Called before a new scheduling iteration is started.
     */
    void onScheduleStart();

    /**
     * Called when a new task placement decision is made (a task gets resources allocated on a server).
     *
     * @param taskAssignmentResult task assignment result
     */
    void onAssignment(TaskAssignmentResult taskAssignmentResult);

    /**
     * Called when the scheduling iteration completes.
     */
    void onScheduleFinish();
}
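
A listener of this kind could be used to keep per-iteration state that an evaluator consults. The sketch below is a hedged illustration: `PlacementTracker` and `placedOn` are hypothetical names, and `TaskAssignmentResult` is replaced by a minimal stand-in interface that exposes only a hostname accessor, so the example compiles without Fenzo on the classpath. The sketch is not thread-safe and assumes the callbacks arrive on a single thread.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for Fenzo's TaskAssignmentResult so the sketch compiles on its own;
// only a hostname accessor is assumed here.
interface TaskAssignmentResult {
    String getHostname();
}

// Copied from the proposal.
interface SchedulingEventListener {
    void onScheduleStart();

    void onAssignment(TaskAssignmentResult taskAssignmentResult);

    void onScheduleFinish();
}

/** Hypothetical listener counting the placements made on each agent in the current iteration. */
final class PlacementTracker implements SchedulingEventListener {
    private final Map<String, Integer> placedByHost = new HashMap<>();

    @Override
    public void onScheduleStart() {
        placedByHost.clear(); // drop state left over from the previous iteration
    }

    @Override
    public void onAssignment(TaskAssignmentResult result) {
        placedByHost.merge(result.getHostname(), 1, Integer::sum);
    }

    @Override
    public void onScheduleFinish() {
        // Nothing to flush in this sketch; counts stay readable until the next start.
    }

    /** Number of tasks placed on the given agent so far in this iteration. */
    int placedOn(String hostname) {
        return placedByHost.getOrDefault(hostname, 0);
    }
}
```

An evaluator holding a reference to such a tracker could lower the score of a slot on an agent that has already received placements in the same iteration, which is the kind of accounting the proposal describes.
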
corindwyer commented 6 years ago

LGTM