kubernetes-sigs / kueue

Kubernetes-native Job Queueing
https://kueue.sigs.k8s.io
Apache License 2.0

ResourceFlavor nodeLabels not added to .nodeSelector field on Pods #931

Closed jtorrex closed 1 year ago

jtorrex commented 1 year ago

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "gpu-flavor"
spec:
  nodeLabels:
    karpenter.k8s.aws/instance-gpu-count: "1"
    karpenter.k8s.aws/instance-gpu-name: t4
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "gpu-clusterqueue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "gpu-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 32Gi
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: "kueue-system"
  name: "gpu-queue"
spec:
  clusterQueue: "gpu-clusterqueue"

apiVersion: batch/v1
kind: Job
metadata:
  namespace: kueue-system
  name: sample-job-gpu
  annotations:
    kueue.x-k8s.io/queue-name: gpu-queue
spec:
  parallelism: 3
  completions: 3 
  template:
    spec:
      containers:
      - name: dummy-job
        image: gcr.io/k8s-staging-perf-tests/sleep:latest
        args: ["3600s"]
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
      restartPolicy: Never

$ kubectl create -f sample-job-gpu.yaml
job.batch/sample-job-gpu created
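
Given the ResourceFlavor above and the issue title, the outcome expected after admission is presumably a Job whose pod template carries the flavor's nodeLabels as a nodeSelector. A sketch of that expected state follows; it is illustrative, not output captured from the cluster, and shows only the relevant fields:

```yaml
# Expected fragment of `kubectl get job sample-job-gpu -n kueue-system -o yaml`
# once Kueue has admitted the workload. Illustrative, not captured output.
spec:
  suspend: false            # Kueue unsuspends the Job on admission
  template:
    spec:
      nodeSelector:         # copied from ResourceFlavor "gpu-flavor" .spec.nodeLabels
        karpenter.k8s.aws/instance-gpu-count: "1"
        karpenter.k8s.aws/instance-gpu-name: t4
```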

Anything else we need to know?:

Environment:

System Info:
  Kernel Version:              5.4.226-129.415.amzn2.x86_64
  OS Image:                    Amazon Linux 2
  Operating System:            linux
  Architecture:                amd64
  Container Runtime Version:   containerd://1.6.6
  Kubelet Version:             v1.24.7-eks-fb459a0
  Kube-Proxy Version:          v1.24.7-eks-fb459a0

namespace: kueue-system
resources:
- https://github.com/kubernetes-sigs/kueue/releases/download/v0.3.2/manifests.yaml
- ondemand-clusterqueue-setup.yaml
- spot-clusterqueue-setup.yaml
- gpu-clusterqueue-setup.yaml
configMapGenerator:
- namespace: kueue-system
  name: kueue-manager-config
  behavior: replace
  files:
  - controller_manager_config.yaml
patches:
  - path: kueue-karpenter-patch.yaml
    target:
      group: admissionregistration.k8s.io
      name: kueue-mutating-webhook-configuration
      kind: MutatingWebhookConfiguration
      version: v1

Controller configuration modified to allow the MPIJob framework (controller_manager_config.yaml):

apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
health:
  healthProbeBindAddress: :8081
metrics:
  bindAddress: :8080
webhook:
  port: 9443
leaderElection:
  leaderElect: true
  resourceName: c1f6bfd2.kueue.x-k8s.io
controller:
  groupKindConcurrency:
    Job.batch: 5
    LocalQueue.kueue.x-k8s.io: 1
    ClusterQueue.kueue.x-k8s.io: 1
    ResourceFlavor.kueue.x-k8s.io: 1
    Workload.kueue.x-k8s.io: 1
clientConnection:
  qps: 50
  burst: 100
#waitForPodsReady:
#  enable: true
#manageJobsWithoutQueueName: true
#namespace: ""
#internalCertManagement:
#  enable: false
#  webhookServiceName: ""
#  webhookSecretName: ""
integrations:
  frameworks:
  - "kubeflow.org/mpijob"

Karpenter patch to avoid interfering with other namespaces when the cluster downscales (kueue-karpenter-patch.yaml):

- op: replace
  path: /webhooks/0/namespaceSelector
  value:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
      - karpenter

tenzen-y commented 1 year ago

It looks like batch/job was dropped from integrations.frameworks. Can you update controller_manager_config.yaml like this?

apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
health:
  healthProbeBindAddress: :8081
metrics:
  bindAddress: :8080
webhook:
  port: 9443
leaderElection:
  leaderElect: true
  resourceName: c1f6bfd2.kueue.x-k8s.io
controller:
  groupKindConcurrency:
    Job.batch: 5
    LocalQueue.kueue.x-k8s.io: 1
    ClusterQueue.kueue.x-k8s.io: 1
    ResourceFlavor.kueue.x-k8s.io: 1
    Workload.kueue.x-k8s.io: 1
clientConnection:
  qps: 50
  burst: 100
#waitForPodsReady:
#  enable: true
#manageJobsWithoutQueueName: true
#namespace: ""
#internalCertManagement:
#  enable: false
#  webhookServiceName: ""
#  webhookSecretName: ""
integrations:
  frameworks:
    - "kubeflow.org/mpijob"
+   - "batch/job"
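
For reference, a sketch of the resulting integrations section without the diff marker. It assumes, as the suggestion above implies, that an explicit frameworks list replaces the default rather than extending it, so batch/job has to be listed for Kueue to manage plain Jobs:

```yaml
integrations:
  frameworks:
  - "batch/job"            # plain batch/v1 Jobs, such as sample-job-gpu
  - "kubeflow.org/mpijob"  # MPIJob support added for this setup
```

The configuration is presumably read only at startup, so the Kueue controller manager likely needs a restart after the ConfigMap changes.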

alculquicondor commented 1 year ago

/close
user error :)

k8s-ci-robot commented 1 year ago

@alculquicondor: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kueue/issues/931#issuecomment-1613529917):

>/close
>user error :)

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.