xaviertintin / Thesis

Integrating Kubernetes Batch Scheduling Systems for Containerized Declarative Data Analyses
MIT License

Demonstrate Novelty #13

Closed xaviertintin closed 3 months ago

xaviertintin commented 5 months ago

Demonstrate the novelty of the approach by showcasing a Kueue feature (e.g. resource groups or queue borrowing) that is not possible with the classical Kubernetes Job API approach.

xaviertintin commented 3 months ago

Kueue's borrowing capabilities work within the REANA deployment. Borrowing is controlled by the borrowingLimit field set for each resource in the ClusterQueue configuration:

# Run Batch Job
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "cluster-queue-reana-run-batch"
spec:
  namespaceSelector: {} # empty selector: workloads from all namespaces may use this queue
  cohort: "reana"
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 2
        borrowingLimit: 0 # this queue cannot borrow CPU from other ClusterQueues in the "reana" cohort
      - name: "memory"
        nominalQuota: 4Gi # no borrowingLimit set, so memory borrowing from the cohort is not capped
---
# Run Job
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "cluster-queue-reana-run-job"
spec:
  namespaceSelector: {} # empty selector: workloads from all namespaces may use this queue
  cohort: "reana"
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 5
        borrowingLimit: 0 # this queue cannot borrow CPU from other ClusterQueues in the "reana" cohort
      - name: "memory"
        nominalQuota: 7Gi # no borrowingLimit set, so memory borrowing from the cohort is not capped
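
For contrast, here is a minimal sketch of a variant in which CPU borrowing is enabled. The borrowingLimit values are assumptions chosen for illustration and are not taken from the REANA deployment. With a CPU borrowingLimit of 3, the run-batch queue could use up to 2 + 3 = 5 CPUs, provided cluster-queue-reana-run-job (same cohort, nominal quota 5) leaves at least 3 CPUs unused.

```yaml
# Hypothetical variant of the run-batch queue with CPU borrowing enabled.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "cluster-queue-reana-run-batch"
spec:
  namespaceSelector: {}
  cohort: "reana"
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 2
        borrowingLimit: 3 # assumed value: may borrow up to 3 unused CPUs from the cohort (5 in total)
      - name: "memory"
        nominalQuota: 4Gi
        borrowingLimit: 0 # assumed value: memory stays capped at the nominal 4Gi
```

Borrowing only happens between ClusterQueues that share the same cohort, which is why both queues set cohort: "reana".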

xaviertintin commented 3 months ago

Novelty and Improvements

  1. Dynamic and Flexible Scheduling: Kueue's ClusterQueue and LocalQueue objects let jobs wait in a queue and be admitted once quota is available, instead of starting as soon as they are created. This improves resource utilization.
  2. Scalability: Because ClusterQueue manages resources at the cluster level, the system can grow with the workload: ClusterQueue quotas can be raised, or nodes with the appropriate labels can be added.
  3. Efficiency in Resource Utilization: Explicit CPU and memory quotas in each ClusterQueue ensure that jobs are admitted only when the resources they request fit within quota, and borrowing within a cohort lets otherwise idle quota be used by other queues.
  4. Isolation and Prioritization: Separating job types (infrastructure, batch, main workload) into queues with dedicated quotas isolates workloads from one another and allows critical jobs to be prioritized, which improves overall system stability and performance.
  5. Enhanced Reproducibility and Consistency: Because resource flavors and queues are declared in configuration, resources are allocated and jobs are scheduled in a consistent, reproducible way.
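
As a rough sketch of how a workload reaches one of these queues (point 1), the manifests below declare the ResourceFlavor referenced by the ClusterQueues, a LocalQueue that points at cluster-queue-reana-run-job, and a Job submitted through that LocalQueue. Only the ClusterQueue and flavor names come from the configuration in this thread. The namespace, LocalQueue name, Job name, image, command, and resource requests are hypothetical and stand in for whatever REANA actually creates.

```yaml
# Flavor referenced by the ClusterQueues; an empty flavor matches any node.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "default-flavor"
---
# Namespaced entry point that forwards workloads to a ClusterQueue.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: "default"                  # assumed namespace
  name: "local-queue-reana-run-job"     # hypothetical name
spec:
  clusterQueue: "cluster-queue-reana-run-job"
---
# A plain batch/v1 Job; the label hands admission control over to Kueue.
apiVersion: batch/v1
kind: Job
metadata:
  namespace: "default"
  name: "sample-reana-job"              # hypothetical name
  labels:
    kueue.x-k8s.io/queue-name: "local-queue-reana-run-job"
spec:
  suspend: true                         # Kueue unsuspends the Job once quota is available
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: "main"
        image: "busybox:1.36"           # placeholder image
        command: ["sh", "-c", "echo running && sleep 30"]
        resources:
          requests:
            cpu: "1"                    # counted against the queue's 5-CPU nominal quota
            memory: "1Gi"               # counted against the queue's 7Gi nominal quota
```

Without the queue-name label this is an ordinary Job whose pods are created immediately. With it, the Job stays suspended until its requests fit within the ClusterQueue quota.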