alibaba / clusterdata

cluster data collected from production clusters in Alibaba for cluster management research

Resource usage for task can be higher than resources requested #31

Open jcarreira opened 6 years ago

jcarreira commented 6 years ago

There are many tasks in the dataset that utilize more resources than what was requested.

For instance, job_id:10771 task_id:66551 has plan_cpu:0.75 [1] from the following entry in batch_task.csv:

6301,6352,10771,66551,137,Terminated,75,0.01600704061294748

However, this task utilizes 7.66 (Max) and 0.99 (average) CPU as can be seen in batch_instance.csv:

6302,6339,10771,66551,427,Terminated,1,1,7.66,0.99,0.019309916392721248,0.012926772448424922

Can you clarify if the amount of resources used by tasks can be higher than the amount of resources requested? If not, what can explain these numbers?

Can I interpret the amount of resources requested as resources allocated by the scheduler?

[1] I divided the plan_cpu value by 100 as explained in this issue: https://github.com/alibaba/clusterdata/issues/11
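The comparison in this comment can be reproduced with a few lines of Python. This is a minimal sketch, not an official parser: the two rows are copied from the post, and the column positions (plan_cpu at index 6 of the batch_task row, max and average CPU at indices 8 and 9 of the batch_instance row) are assumptions inferred from the values quoted above, not taken from a schema file. The helper names are hypothetical.

```python
import csv
import io

# Rows copied verbatim from the issue text.
BATCH_TASK_ROW = "6301,6352,10771,66551,137,Terminated,75,0.01600704061294748"
BATCH_INSTANCE_ROW = (
    "6302,6339,10771,66551,427,Terminated,1,1,"
    "7.66,0.99,0.019309916392721248,0.012926772448424922"
)


def parse_row(line):
    """Split one CSV line into a list of string fields."""
    return next(csv.reader(io.StringIO(line)))


def requested_cpu(task_row):
    """Requested CPU in cores; index 6 is assumed to hold plan_cpu (x100)."""
    return float(task_row[6]) / 100.0


def cpu_usage(instance_row):
    """Return (max, average) CPU; indices 8 and 9 are assumed positions."""
    return float(instance_row[8]), float(instance_row[9])


task = parse_row(BATCH_TASK_ROW)
instance = parse_row(BATCH_INSTANCE_ROW)

# Both rows must describe the same job_id / task_id pair (indices 2 and 3).
same_task = task[2:4] == instance[2:4]

plan = requested_cpu(task)
cpu_max, cpu_avg = cpu_usage(instance)

print(f"same task: {same_task}")
print(f"requested={plan:.2f} max={cpu_max:.2f} avg={cpu_avg:.2f}")
print(f"max exceeds request: {cpu_max > plan}")
print(f"avg exceeds request: {cpu_avg > plan}")
```

Under these assumptions, both the maximum and the average usage of this instance are above the 0.75 CPU request.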

HaiyangDING commented 6 years ago

Yes, the actual resource usage can be larger than what the task requested. This is an example of over-subscription.

Some batch tasks can be even more aggressive (the one you found is an example). Low-priority batch tasks don't actually hold any resources on the host they run on. Instead, we have a mechanism that lets them use the host's unused resources while other tasks are not busy. These resources belong to other tasks and can be reclaimed by their owners when needed. In short, low-priority tasks have no guarantee on the amount of resources they can use, but they can use the resources of other tasks as long as the owners are not using them.

So, to answer your second question: Can I interpret the amount of resources requested as resources allocated by the scheduler?

For low-priority tasks, the scheduler does not actually have to allocate any resources (e.g. if a low-priority task requests 0.25 CPU and is assigned to Node-A, Node-A's available CPU does NOT need to be reduced by 0.25 CPU).

The actual mechanism is a bit more complicated than explained above (oversubscription, preemption, priority, resource isolation, etc.), but that is the general idea.
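The accounting rule described in this reply can be illustrated with a toy model. This is only a sketch of the idea under simplifying assumptions, not Alibaba's scheduler: the `Node` class, its fields, and its methods are hypothetical names, and preemption, isolation, and reclamation are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy host model: only guaranteed tasks reserve CPU."""

    capacity_cpu: float
    reserved_cpu: float = 0.0  # CPU held by guaranteed (non-low-priority) tasks

    @property
    def available_cpu(self):
        """CPU the scheduler still considers free for guaranteed requests."""
        return self.capacity_cpu - self.reserved_cpu

    def place(self, request_cpu, low_priority):
        """Place a task; return True if it was accepted."""
        if low_priority:
            # Low-priority task: accepted without deducting its request.
            return True
        if request_cpu > self.available_cpu:
            return False
        self.reserved_cpu += request_cpu
        return True

    def borrowable_cpu(self, guaranteed_usage_cpu):
        """CPU a low-priority task could use right now, given what the
        guaranteed tasks are actually consuming (not what they reserved)."""
        return max(0.0, self.capacity_cpu - guaranteed_usage_cpu)


node_a = Node(capacity_cpu=8.0)
node_a.place(4.0, low_priority=False)   # reserves 4.0 CPU
node_a.place(0.25, low_priority=True)   # reserves nothing

print(node_a.available_cpu)        # 4.0, unchanged by the low-priority task
print(node_a.borrowable_cpu(1.0))  # 7.0 while the owners use only 1.0 CPU
```

In this model a low-priority task that requested 0.25 CPU could briefly use far more than that, which is consistent with the usage numbers reported in the question, and the borrowable amount shrinks as soon as the owners' actual usage rises.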

Violet-Guo commented 5 years ago

I have a question about trace v2018.

In trace v2018, the table batch_task.csv has plan_cpu and plan_mem for batch tasks. Does the mechanism you described, in which the scheduler does not really allocate any resources to batch jobs, still apply in trace v2018? @HaiyangDING

HaiyangDING commented 5 years ago

Yes, the mechanism remains in v2018.