spack / spack-gantry

A Dynamic Resource Allocation System for Spack CI and Kubernetes

Cost measurement and analysis #75

Open cmelone opened 4 months ago

cmelone commented 4 months ago

The main goal of this project is to optimize the cost of Spack's CI pipelines. To do this, we need to compute and store the cost of each job to determine if the predictive framework is having a positive impact.

We should collect price info for at least a couple of weeks to form a baseline.


Objectives

  1. We want to measure the cost of a job's submission and execution on the cluster
  2. Efficiency of resource usage should be quantified to discourage wasted cycles

Current Approaches

In the Kitware analytics database, they store a "node occupancy" metric, which measures the proportion of a node that was available to a job over the job's life. For instance, if the job was alone on the node, this value would be 1; if it shared the node evenly with four other builds, 0.2. This is then multiplied by the cost of the node during the job's life to get a cost per job.

However, it's not a perfect measurement for our application. The cost should be independent of other activity on the node. Without this, it would be impossible to compare cost per job among many samples. The metric is not indicative of the job or spec, simply whether other jobs were running on the node.
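To make the objection concrete, here is a minimal sketch of the occupancy-based cost as described above (occupancy multiplied by the node's cost during the job's life). The function name and the dollar figures are hypothetical and not taken from the Kitware database.

```python
def occupancy_cost(node_occupancy: float, node_cost_during_job: float) -> float:
    """Cost attributed to a job under the node-occupancy approach.

    node_occupancy: share of the node available to the job, in (0, 1].
    node_cost_during_job: what the node cost over the job's life (assumed USD).
    """
    if not 0.0 < node_occupancy <= 1.0:
        raise ValueError("node_occupancy must be in (0, 1]")
    return node_occupancy * node_cost_during_job


# Hypothetical: the same job on a node that costs $0.50 over the job's life.
alone = occupancy_cost(1.0, 0.50)   # no other builds on the node
shared = occupancy_cost(0.2, 0.50)  # node shared evenly with four other builds
print(alone, shared)
```

The job and its requests are identical in both calls, yet the attributed cost differs by a factor of five, which is why this metric cannot be compared across samples.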

While the node occupancy metric is useful for understanding node utilization, it mostly reflects how well Karpenter/K8s packs nodes, which is out of our control. It may be helpful once we investigate scheduling for CI, but not now, as we're mostly interested in improving the efficiency of resource usage.

Setup

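To normalize the cost of resources within instance types, we'll define cost-per-resource metrics for each instance type $i$:

$$\text{Cost per CPU}_i = \frac{C_i}{\text{CPU}_i}$$

$$\text{Cost per RAM}_i = \frac{C_i}{\text{RAM}_i}$$

where $C_i$ is the cost of a node of instance type $i$, and $\text{CPU}_i$ and $\text{RAM}_i$ are the CPU and memory capacity of that node. The cost of a job is then

$$\text{Job Cost} = \text{CPU}_{\text{request}} \times \text{Cost per CPU}_i + \text{RAM}_{\text{request}} \times \text{Cost per RAM}_i$$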

where $\text{CPU}_{\text{request}}$ and $\text{RAM}_{\text{request}}$ are the job's resource requests. This metric uses requests rather than actual usage because requests represent the resources a job reserves on a node. If a build requests 10 GB of memory but only uses 5 GB, it should still be charged for its full allocation, since that reservation prevented other jobs from running on the node.
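As a rough sketch, the job cost formula could be computed like this. The `InstanceType` class, its field names, and the example figures are assumptions for illustration; in particular, the unit of `cost` (here assumed to be USD for the node over the job's life) is not specified above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    """Hypothetical description of a node of instance type i."""
    cost: float    # C_i (assumed: USD for the node over the job's life)
    cpus: float    # CPU_i, CPU capacity of the node
    ram_gb: float  # RAM_i, memory capacity of the node in GB


def cost_per_cpu(inst: InstanceType) -> float:
    return inst.cost / inst.cpus


def cost_per_ram(inst: InstanceType) -> float:
    return inst.cost / inst.ram_gb


def job_cost(inst: InstanceType, cpu_request: float, ram_request_gb: float) -> float:
    """Charge a job for what it requested, not for what it used."""
    return cpu_request * cost_per_cpu(inst) + ram_request_gb * cost_per_ram(inst)


# Hypothetical node: $0.64 over the job's life, 16 CPUs, 64 GB of RAM.
node = InstanceType(cost=0.64, cpus=16, ram_gb=64)
cost = job_cost(node, cpu_request=4, ram_request_gb=10)
print(round(cost, 4))
```

Actual usage never appears in `job_cost`, so the build that requests 10 GB and uses 5 GB is charged the same as one that uses all 10 GB.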

Using this cost-per-job metric, jobs are rewarded for minimizing their requests and, provided $C_i$ is measured over the job's life, their wall time.

However, we should also measure whether a job uses more or fewer resources than it requested. Underallocation can negatively impact other processes on the node and slow down the build, while overallocation is simply a waste of cycles. Alongside the cost per job, a penalty factor would help capture the cost imposed on the rest of the cluster, or on other jobs that could have run on the node.

$$\text{P}_{\text{CPU}} = \max\left(\frac{1}{\text{UR}_{\text{CPU}}}, \text{UR}_{\text{CPU}}\right)$$

$$\text{P}_{\text{RAM}} = \max\left(\frac{1}{\text{UR}_{\text{RAM}}}, \text{UR}_{\text{RAM}}\right)$$

where

$$\text{UR}_{\text{CPU}} = \frac{\text{CPU}_{\text{usage avg}}}{\text{CPU}_{\text{request}}}$$

$$\text{UR}_{\text{RAM}} = \frac{\text{RAM}_{\text{usage avg}}}{\text{RAM}_{\text{request}}}$$

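Taking the maximum ensures that jobs are penalized both for using fewer resources than requested (the inverse of the utilization ratio) and for using more than requested (the utilization ratio itself, which can be > 1). A job that uses exactly what it requests has $\text{P} = 1$ and incurs no penalty.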
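A small sketch of the penalty factor, following the definitions of $\text{UR}$ and $\text{P}$ above. The function names are hypothetical, and rejecting non-positive inputs is an assumption on my part, since the formula is undefined when average usage is zero.

```python
def utilization_ratio(usage_avg: float, request: float) -> float:
    """UR = average usage / request."""
    if usage_avg <= 0 or request <= 0:
        # Assumption: 1/UR is undefined at zero usage, so refuse such inputs.
        raise ValueError("usage_avg and request must be positive")
    return usage_avg / request


def penalty(usage_avg: float, request: float) -> float:
    """P = max(1/UR, UR): 1 at a perfect match, larger in either direction."""
    ur = utilization_ratio(usage_avg, request)
    return max(1.0 / ur, ur)


print(penalty(5, 10))   # used half of the request
print(penalty(10, 10))  # used exactly the request
print(penalty(15, 10))  # used 1.5x the request
```

Under this definition, using half of a request is penalized as heavily as using double the request.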

Therefore, a "weighted" cost per job would be

$$\text{Weighted Job Cost} = \text{CPU}_{\text{request}} \times \text{Cost per CPU}_i \times \text{P}_{\text{CPU}} + \text{RAM}_{\text{request}} \times \text{Cost per RAM}_i \times \text{P}_{\text{RAM}}$$
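For illustration, the weighted cost can be sketched as a function of the per-resource prices and penalty factors. The function name, parameter names, and all numbers below are hypothetical.

```python
def weighted_job_cost(
    cpu_request: float,
    ram_request_gb: float,
    cost_per_cpu: float,
    cost_per_ram: float,
    p_cpu: float,
    p_ram: float,
) -> float:
    """Scale each resource's share of the job cost by its penalty factor."""
    cpu_term = cpu_request * cost_per_cpu * p_cpu
    ram_term = ram_request_gb * cost_per_ram * p_ram
    return cpu_term + ram_term


# Hypothetical job: 4 CPUs and 10 GB requested at $0.04 per CPU and $0.01 per GB.
# CPU usage averaged half of the request (P_CPU = 2); RAM usage matched (P_RAM = 1).
weighted = weighted_job_cost(
    cpu_request=4,
    ram_request_gb=10,
    cost_per_cpu=0.04,
    cost_per_ram=0.01,
    p_cpu=2.0,
    p_ram=1.0,
)
print(round(weighted, 4))
```

Only the CPU term is inflated here, so the weighted cost shows which resource request was inefficient, not just that the job was.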


Job cost and $P$ would be stored separately, as the former represents the "true" cost while the latter measures the efficiency of a job's resource requests via an artificial penalty. When analyzing costs, node instance type should be controlled for, because cost per job depends on $\text{Cost per CPU}_i$ and $\text{Cost per RAM}_i$, which vary among instance types.
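One possible shape for this, as a sketch only: a record that keeps the job cost and the penalty factors in separate fields, and a helper that compares costs within each instance type. The record layout, field names, and instance type labels are all hypothetical and do not reflect an existing schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class JobCostRecord:
    """Hypothetical stored row: "true" cost and penalties kept apart."""
    job_id: str
    instance_type: str
    job_cost: float
    p_cpu: float
    p_ram: float


def mean_cost_by_instance_type(records: list[JobCostRecord]) -> dict[str, float]:
    """Average job cost per instance type, so types are never mixed."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for record in records:
        grouped[record.instance_type].append(record.job_cost)
    return {itype: mean(costs) for itype, costs in grouped.items()}


records = [
    JobCostRecord("job-1", "type-a", 0.25, 2.0, 1.0),
    JobCostRecord("job-2", "type-a", 0.75, 1.0, 1.5),
    JobCostRecord("job-3", "type-b", 1.0, 1.0, 1.0),
]
result = mean_cost_by_instance_type(records)
print(result)
```

Grouping is the simplest way to control for instance type; a regression with instance type as a covariate would be another option.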