kubernetes-sigs / kueue

Kubernetes-native Job Queueing
https://kueue.sigs.k8s.io
Apache License 2.0

ProvisioningRequestNotSchedulableInNodepool for Pod with Generic Ephemeral Volume #2389

Closed: mruoss closed this issue 2 months ago

mruoss commented 2 months ago

What happened:

We are using Kueue with GKE's Dynamic Workload Scheduler (DWS). This works great, but as soon as we attach a generic ephemeral volume to the Pod, it no longer gets provisioned.
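For context, the Kueue side of such a setup is an AdmissionCheck backed by a ProvisioningRequestConfig, which is what makes Kueue create ProvisioningRequests for DWS. A minimal sketch, assuming GKE's queued-provisioning class; the names dws-config and dws-prov are hypothetical, not taken from this report:

apiVersion: kueue.x-k8s.io/v1beta1
kind: ProvisioningRequestConfig
metadata:
  name: dws-config
spec:
  # GKE's Dynamic Workload Scheduler provisioning class
  provisioningClassName: queued-provisioning.gke.io
  managedResources:
  - nvidia.com/gpu
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: AdmissionCheck
metadata:
  name: dws-prov
spec:
  controllerName: kueue.x-k8s.io/provisioning-request
  parameters:
    apiGroup: kueue.x-k8s.io
    kind: ProvisioningRequestConfig
    name: dws-config

The ClusterQueue behind the dws-local-queue LocalQueue (referenced by the Pod's queue-name label below) would list dws-prov under spec.admissionChecks.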

A Workload and a ProvisioningRequest are created. After a while, however, the ProvisioningRequest is marked as failed; its status looks as follows. Note the error message `Provisioning Request's pods cannot be scheduled in the nodepool, affected nodepools: **REDACTED_LIST_OF_NODEPOOLS**`.

Status:
  Conditions:
    Last Transition Time:  2024-06-10T11:17:32Z
    Message:               Provisioning Request wasn't accepted.
    Observed Generation:   1
    Reason:                NotAccepted
    Status:                False
    Type:                  Accepted
    Last Transition Time:  2024-06-10T11:17:32Z
    Message:               Provisioning Request wasn't provisioned.
    Observed Generation:   1
    Reason:                NotProvisioned
    Status:                False
    Type:                  Provisioned
    Last Transition Time:  2024-06-10T11:19:32Z
    Message:               Provisioning Request's pods cannot be scheduled in the nodepool, affected nodepools: **REDACTED_LIST_OF_NODEPOOLS**
    Observed Generation:   1
    Reason:                ProvisioningRequestNotSchedulableInNodepool
    Status:                True
    Type:                  Failed
Events:                    <none>

What you expected to happen:

If I remove the volume, or create a PVC up front and replace the ephemeral volume declaration with a persistentVolumeClaim reference like the following, a node gets provisioned and the Pod gets scheduled. I would expect the same behaviour for Pods with generic ephemeral volumes.

volumes:
  - name: tmp-fs
    persistentVolumeClaim:
      claimName: myclaim
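
For completeness, the pre-created claim that workaround refers to would look roughly like this, assuming the same size and storage class as the ephemeral template further down (myclaim is the name used in the snippet above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
  namespace: development
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi
  storageClassName: my-job-ephemeral-storage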

How to reproduce it (as minimally and precisely as possible):

This is the Pod we're trying to get scheduled on a fresh nvidia-l4 machine:

apiVersion: v1
kind: Pod
metadata:
  labels:
    kueue.x-k8s.io/queue-name: dws-local-queue
  name: kueue-test
  namespace: development
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: cloud.google.com/gke-accelerator
            operator: In
            values:
            - nvidia-l4
  containers:
  - name: main-container
    image: ubuntu:latest
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
    resources:
      limits:
        cpu: "4"
        memory: 32Gi
        nvidia.com/gpu: "1"
      requests:
        cpu: "4"
        memory: 32Gi
        nvidia.com/gpu: "1"
    volumeMounts:
    - mountPath: /tmp
      name: tmp-fs
  tolerations:
  - effect: NoSchedule
    key: nvidia.com/gpu
    operator: Exists
  - effect: NoSchedule
    key: cloud.google.com/gke-queued
    operator: Equal
    value: "true"
  volumes:
  - name: tmp-fs
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 32Gi
          storageClassName: my-job-ephemeral-storage
          volumeMode: Filesystem
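
The my-job-ephemeral-storage storage class is not included in the report. A plausible definition, assuming the GKE Persistent Disk CSI driver and WaitForFirstConsumer binding (which defers binding until the Pod is scheduled, the mode that node-autoscaling setups generally rely on); the provisioner and parameters here are assumptions:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-job-ephemeral-storage
provisioner: pd.csi.storage.gke.io  # assumption: GKE Persistent Disk CSI driver
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: pd-balanced  # assumption

Note that a generic ephemeral volume causes the control plane to create a PVC named <pod name>-<volume name> (here kueue-test-tmp-fs) owned by the Pod; that PVC is the object the scheduler and autoscaler then have to account for.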

Anything else we need to know?:

Environment:

trasc commented 2 months ago

This looks to be related to cluster-autoscaler/provisioningrequest, not to Kueue itself.

alculquicondor commented 2 months ago

Hi @mruoss, indeed Kueue is not responsible for satisfying the ProvisioningRequest.

In addition to the issue in cluster-autoscaler, I would suggest you reach out to your Google Cloud representative to file a feature request.

/close

k8s-ci-robot commented 2 months ago

@alculquicondor: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kueue/issues/2389#issuecomment-2189984200).