AliyunContainerService / gpushare-scheduler-extender

GPU Sharing Scheduler for Kubernetes Cluster
Apache License 2.0

policy-config-file is no longer supported by Kubernetes starting with v1.23 #166

Closed jeffguorg closed 2 years ago

jeffguorg commented 2 years ago

From the installation instructions:

Add Policy config file parameter in scheduler arguments - --policy-config-file=/etc/kubernetes/scheduler-policy-config.json

However, that option was removed in v1.23 (see the Kubernetes documentation: https://kubernetes.io/docs/reference/scheduling/policies/). Instead, we can pass a KubeSchedulerConfiguration file with --config.

The config file uses an apiVersion from the kubescheduler.config.k8s.io group (v1beta2 in this example):

# /etc/kubernetes/scheduler-policy-config.yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
extenders:
- urlPrefix: "http://127.0.0.1:32766/gpushare-scheduler"
  filterVerb: filter
  bindVerb: bind
  enableHTTPS: false
  nodeCacheCapable: false
  managedResources:
  - name: aliyun.com/gpu-mem
    ignoredByScheduler: false
  ignorable: false

and, in the kube-scheduler arguments:

- --config=/etc/kubernetes/scheduler-policy-config.yaml

Tested on my machine; it works well.
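
For readers on a kubeadm-style setup, the change might look like the following fragment of the kube-scheduler static pod manifest (typically /etc/kubernetes/manifests/kube-scheduler.yaml). This is a sketch, not a complete manifest: the file paths come from this issue, the volume name scheduler-policy-config is arbitrary, and everything else in your manifest should stay as it is. The config file has to be mounted into the container, otherwise kube-scheduler cannot read it.

```yaml
# Fragment of the kube-scheduler static pod manifest (assumed kubeadm layout).
spec:
  containers:
  - command:
    - kube-scheduler
    # ...keep the existing flags here...
    # The flag is --config; there is no --config-file flag.
    - --config=/etc/kubernetes/scheduler-policy-config.yaml
    name: kube-scheduler
    volumeMounts:
    # ...keep the existing mounts here...
    - mountPath: /etc/kubernetes/scheduler-policy-config.yaml
      name: scheduler-policy-config
      readOnly: true
  volumes:
  # ...keep the existing volumes here...
  - hostPath:
      path: /etc/kubernetes/scheduler-policy-config.yaml
      # "File" requires the config file to already exist on the host.
      type: File
    name: scheduler-policy-config
```

The kubelet watches the manifests directory, so saving the manifest restarts the scheduler. Running kubectl logs against the scheduler pod in kube-system then shows whether the config was accepted.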

noranraskin commented 2 years ago

Could you go a little bit more into detail how you got it to work? I'm running v1.23 too but I don't seem to get this going. Did you do a
kubectl apply -f /etc/kubernetes/scheduler-policy-config.yaml
or did you just pass it to the kube-scheduler? Could you share your kube-scheduler.yaml? When I tried to apply your scheduler-policy-config.yaml I got this:

error: unable to recognize "scheduler-policy-config.yaml": no matches for kind "KubeSchedulerConfiguration" in version "kubescheduler.config.k8s.io/v1alpha1"

When I add - --config-file=/etc/kubernetes/scheduler-policy-config.yaml to the arguments in the kube-scheduler.yaml it results in a crash loop.

jeffguorg commented 2 years ago

@noranraskin

Could you go a little bit more into detail how you got it to work? ... When I add - --config-file=/etc/kubernetes/scheduler-policy-config.yaml to the arguments in the kube-scheduler.yaml it results in a crash loop.

Sure, but could you please show me your config and paste the kube-scheduler output to pastebin? Nobody can tell what the problem is unless you show us what is happening.

Did you do a kubectl apply -f /etc/kubernetes/scheduler-policy-config.yaml or did you just pass it to the kube-scheduler?

I was running a single-node cluster on my desktop and installed gpushare for testing, but I accidentally reinstalled the operating system. Everything here is recalled from memory, so the steps may not be entirely accurate.

I followed all the instructions in gpushare's installation guide, except for the steps "1. Deploy GPU share scheduler extender in control plane" and "2. Modify scheduler configuration".

Instead

noranraskin commented 2 years ago

This is my kube-scheduler.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    - --config-file=/etc/kubernetes/scheduler-policy-config.yaml
    image: k8s.gcr.io/kube-scheduler:v1.23.3
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /etc/kubernetes/scheduler-policy-config.yaml
      name: scheduler-policy-config
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /etc/kubernetes/scheduler-policy-config.yaml
      type: FileOrCreate
    name: scheduler-policy-config
status: {}

And this is what I get from kubectl describe pod kube-scheduler-k8s-master -n kube-system after moving kube-scheduler.yaml to /etc/kubernetes/manifests:

Name:                 kube-scheduler-k8s-master
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 k8s-master/192.168.0.10
Start Time:           Sat, 26 Feb 2022 23:40:44 +0100
Labels:               component=kube-scheduler
                      tier=control-plane
Annotations:          kubernetes.io/config.hash: 240586c2f69e8d08e4fa3deca38c790b
                      kubernetes.io/config.mirror: 240586c2f69e8d08e4fa3deca38c790b
                      kubernetes.io/config.seen: 2022-02-27T23:57:17.676545418Z
                      kubernetes.io/config.source: file
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Running
IP:                   192.168.0.10
IPs:
  IP:           192.168.0.10
Controlled By:  Node/k8s-master
Containers:
  kube-scheduler:
    Container ID:  docker://16fadb089232461ee2dc8b3c73e9c633b281f9efcc35f0b55c91d81d6eb4bf4d
    Image:         k8s.gcr.io/kube-scheduler:v1.23.3
    Image ID:      docker-pullable://k8s.gcr.io/kube-scheduler@sha256:32308abe86f7415611ca86ee79dd0a73e74ebecb2f9e3eb85fc3a8e62f03d0e7
    Port:          <none>
    Host Port:     <none>
    Command:
      kube-scheduler
      --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
      --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
      --bind-address=127.0.0.1
      --kubeconfig=/etc/kubernetes/scheduler.conf
      --leader-elect=true
      --config-file=/etc/kubernetes/scheduler-policy-config.yaml
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 28 Feb 2022 01:03:14 +0100
      Finished:     Mon, 28 Feb 2022 01:03:14 +0100
    Ready:          False
    Restart Count:  6
    Requests:
      cpu:        100m
    Liveness:     http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=8
    Startup:      http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/kubernetes/scheduler-policy-config.yaml from scheduler-policy-config (ro)
      /etc/kubernetes/scheduler.conf from kubeconfig (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kubeconfig:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/scheduler.conf
    HostPathType:  FileOrCreate
  scheduler-policy-config:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/scheduler-policy-config.yaml
    HostPathType:  FileOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute op=Exists
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Normal   Created  8m53s (x10 over 24h)  kubelet  Created container kube-scheduler
  Normal   Started  8m53s (x10 over 24h)  kubelet  Started container kube-scheduler
  Normal   Pulled   8m7s (x11 over 24h)   kubelet  Container image "k8s.gcr.io/kube-scheduler:v1.23.3" already present on machine
  Warning  BackOff  4m30s (x64 over 24h)  kubelet  Back-off restarting failed container

I did everything exactly as you described, and I copied your scheduler-policy-config.yaml as is. Passing a GPU to a container works with the default nvidia-device-plugin (which I uninstalled, as mentioned in the install guide). Unfortunately these errors don't say much. Do I need to run a different image?

jeffguorg commented 2 years ago

@noranraskin Try kubectl logs -n kube-system kube-scheduler-k8s-master. What does kube-scheduler complain about? Does it recognize the config? Is it able to connect to gpushare?

noranraskin commented 2 years ago

The log says 'Error: unknown flag: --config-file'. I am running Kubernetes v1.23.3, though, set up using kubeadm. Could I be missing some API extensions? And if so, how can I check this?

jeffguorg commented 2 years ago

@noranraskin Sorry, that was probably my mistake. The flag should be --config. I have updated the steps above.

noranraskin commented 2 years ago

That fixed it! Cheers mate

jonn-yan commented 2 years ago

This configuration cannot be applied on k3s v1.23.5+k3s1. The apply fails with: error: unable to recognize "scheduler-policy-config.yaml": no matches for kind "KubeSchedulerConfiguration" in version "kubescheduler.config.k8s.io/v1beta2"

Please help! This is my scheduler-policy-config.yaml:

apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /var/lib/rancher/k3s/server/cred/scheduler.kubeconfig
extenders:
- urlPrefix: "http://127.0.0.1:32766/gpushare-scheduler"
  filterVerb: filter
  bindVerb: bind
  enableHTTPS: false
  nodeCacheCapable: true
  managedResources:
  - name: aliyun.com/gpu-mem
    ignoredByScheduler: false
  ignorable: false

With k3s v1.20 I used the following configuration, which ran fine, but it no longer works.

ExecStart=/usr/local/bin/k3s \
    server \
    --kube-scheduler-arg="v=99" \
    --kube-scheduler-arg="policy-config-file=/etc/kubernetes/scheduler-policy-config.json" \
    --kube-scheduler-arg="leader-elect=true"

This is my scheduler-policy-config.json:

{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://127.0.0.1:32766/gpushare-scheduler",
      "filterVerb": "filter",
      "bindVerb":   "bind",
      "enableHttps": false,
      "nodeCacheCapable": true,
      "managedResources": [
        {
          "name": "aliyun.com/gpu-mem",
          "ignoredByScheduler": false
        }
      ],
      "ignorable": false
    }
  ]
}

jeffguorg commented 2 years ago

@jonn-yan Hi, I'm not sure if this is compatible with k3s 1.23. I assume it is.

This file is a configuration file for kube-scheduler, replacing the former JSON configuration file. It is not an API resource, so it is not something to apply with kubectl. Thus you should

jonn-yan commented 2 years ago

@jeffguorg Thank you very much. I verified it and it works without problems.
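
For k3s users, one way to pass the file might be through the k3s server configuration, assuming a default install that reads /etc/rancher/k3s/config.yaml. This is a sketch under that assumption, not something confirmed in this thread. It is meant to be equivalent to replacing policy-config-file=... with config=... in the --kube-scheduler-arg flag of the v1.20 unit file above.

```yaml
# /etc/rancher/k3s/config.yaml (assumed default location)
# Each entry is forwarded to the embedded kube-scheduler as --<key>=<value>.
kube-scheduler-arg:
  - "config=/etc/kubernetes/scheduler-policy-config.yaml"
```

As with kubeadm, the YAML file is read by the scheduler at startup and is never passed to kubectl apply.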