Closed jeffguorg closed 2 years ago
Could you go a little bit more into detail how you got it to work? I'm running v1.23 too but I don't seem to get this going.
Did you do a kubectl apply -f /etc/kubernetes/scheduler-policy-config.yaml or did you just pass it to the kube-scheduler?
Could you share your kube-scheduler.yaml?
When I tried to apply your scheduler-policy-config.yaml I got this:
error: unable to recognize "scheduler-policy-config.yaml": no matches for kind "KubeSchedulerConfiguration" in version "kubescheduler.config.k8s.io/v1alpha1"
When I add - --config-file=/etc/kubernetes/scheduler-policy-config.yaml to the arguments in the kube-scheduler.yaml it results in a crash loop.
@noranraskin
> Could you go a little bit more into detail how you got it to work? ... When I add - --config-file=/etc/kubernetes/scheduler-policy-config.yaml to the arguments in the kube-scheduler.yaml it results in a crash loop.
Sure, but would you please show me your config and paste the kube-scheduler output to pastebin? Others can't know what the problem is if you don't tell us what is happening.
> Did you do a kubectl apply -f /etc/kubernetes/scheduler-policy-config.yaml or did you just pass it to the kube-scheduler?
I was running a single-node cluster on my desktop and installed gpushare for testing, but I accidentally reinstalled the operating system. Everything here is recalled from memory, so the steps may not be entirely accurate.
I did all the instructions in gpushare's installation guide, except for 1. Deploy GPU share scheduler extender in control plane and 2. Modify scheduler configuration.
Instead, I used this KubeSchedulerConfiguration:
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
extenders:
In kube-scheduler.yaml, use /etc/kubernetes/scheduler-policy-config.yaml, or the path you save your KubeSchedulerConfiguration into, instead of /etc/kubernetes/scheduler-policy-config.json, including in the volume part. The flag to add is:
```yaml
--config=/etc/kubernetes/scheduler-policy-config.yaml
```
This is my kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    - --config-file=/etc/kubernetes/scheduler-policy-config.yaml
    image: k8s.gcr.io/kube-scheduler:v1.23.3
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /etc/kubernetes/scheduler-policy-config.yaml
      name: scheduler-policy-config
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /etc/kubernetes/scheduler-policy-config.yaml
      type: FileOrCreate
    name: scheduler-policy-config
status: {}
And this is what I get from kubectl describe pod kube-scheduler-k8s-master -n kube-system after moving kube-scheduler.yaml to /etc/kubernetes/manifests:
Name:                 kube-scheduler-k8s-master
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 k8s-master/192.168.0.10
Start Time:           Sat, 26 Feb 2022 23:40:44 +0100
Labels:               component=kube-scheduler
                      tier=control-plane
Annotations:          kubernetes.io/config.hash: 240586c2f69e8d08e4fa3deca38c790b
                      kubernetes.io/config.mirror: 240586c2f69e8d08e4fa3deca38c790b
                      kubernetes.io/config.seen: 2022-02-27T23:57:17.676545418Z
                      kubernetes.io/config.source: file
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Running
IP:                   192.168.0.10
IPs:
  IP:           192.168.0.10
Controlled By:  Node/k8s-master
Containers:
  kube-scheduler:
    Container ID:  docker://16fadb089232461ee2dc8b3c73e9c633b281f9efcc35f0b55c91d81d6eb4bf4d
    Image:         k8s.gcr.io/kube-scheduler:v1.23.3
    Image ID:      docker-pullable://k8s.gcr.io/kube-scheduler@sha256:32308abe86f7415611ca86ee79dd0a73e74ebecb2f9e3eb85fc3a8e62f03d0e7
    Port:          <none>
    Host Port:     <none>
    Command:
      kube-scheduler
      --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
      --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
      --bind-address=127.0.0.1
      --kubeconfig=/etc/kubernetes/scheduler.conf
      --leader-elect=true
      --config-file=/etc/kubernetes/scheduler-policy-config.yaml
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 28 Feb 2022 01:03:14 +0100
      Finished:     Mon, 28 Feb 2022 01:03:14 +0100
    Ready:          False
    Restart Count:  6
    Requests:
      cpu:        100m
    Liveness:     http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=8
    Startup:      http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/kubernetes/scheduler-policy-config.yaml from scheduler-policy-config (ro)
      /etc/kubernetes/scheduler.conf from kubeconfig (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kubeconfig:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/scheduler.conf
    HostPathType:  FileOrCreate
  scheduler-policy-config:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/scheduler-policy-config.yaml
    HostPathType:  FileOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute op=Exists
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Normal   Created  8m53s (x10 over 24h)  kubelet  Created container kube-scheduler
  Normal   Started  8m53s (x10 over 24h)  kubelet  Started container kube-scheduler
  Normal   Pulled   8m7s (x11 over 24h)   kubelet  Container image "k8s.gcr.io/kube-scheduler:v1.23.3" already present on machine
  Warning  BackOff  4m30s (x64 over 24h)  kubelet  Back-off restarting failed container
I did everything exactly as you described it. I also just copied your scheduler-policy-config.yaml. Passing a GPU to a container works with the default nvidia-device-plugin (which I uninstalled, as mentioned in the install guide). Unfortunately these errors don't say a lot... Do I need to run a different image?
@noranraskin try kubectl logs -n kube-system kube-scheduler-k8s-master
What does kube-scheduler complain about? Does it recognize the config? Is it able to connect to gpushare?
'Error: unknown flag: --config-file'. I am running Kubernetes v1.23.3 though, set up using kubeadm. Could I be missing some API extensions? And if so, how can I check this?
@noranraskin Sorry, that might be my fault. It should be --config. The steps above have been updated.
That fixed it! Cheers mate
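For reference, a sketch of how the command section of the kube-scheduler.yaml posted above would look with the corrected flag. It assumes everything else in that manifest stays unchanged; only `--config-file` is replaced by `--config`.

```yaml
# Excerpt of /etc/kubernetes/manifests/kube-scheduler.yaml (static pod manifest).
# Assumption: identical to the manifest posted above except for the last flag.
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    # --config takes the path of the KubeSchedulerConfiguration file.
    # --config-file is rejected as an unknown flag, which caused the crash loop.
    - --config=/etc/kubernetes/scheduler-policy-config.yaml
```

The volume and volumeMount entries for scheduler-policy-config.yaml from the posted manifest are still needed so the file is visible inside the container.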
This configuration cannot be applied on k3s v1.23.5+k3s1. The apply error is: error: unable to recognize "scheduler-policy-config.yaml": no matches for kind "KubeSchedulerConfiguration" in version "kubescheduler.config.k8s.io/v1beta2"
Please help me! This is my scheduler-policy-config.yaml:
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /var/lib/rancher/k3s/server/cred/scheduler.kubeconfig
extenders:
- urlPrefix: "http://127.0.0.1:32766/gpushare-scheduler"
  filterVerb: filter
  bindVerb: bind
  enableHTTPS: false
  nodeCacheCapable: true
  managedResources:
  - name: aliyun.com/gpu-mem
    ignoredByScheduler: false
  ignorable: false
With k3s v1.20 I used the following configuration, which ran normally, but it no longer works:
ExecStart=/usr/local/bin/k3s \
    server \
    --kube-scheduler-arg="v=99" \
    --kube-scheduler-arg="policy-config-file=/etc/kubernetes/scheduler-policy-config.json" \
    --kube-scheduler-arg="leader-elect=true"
This is my scheduler-policy-config.json:
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://127.0.0.1:32766/gpushare-scheduler",
      "filterVerb": "filter",
      "bindVerb": "bind",
      "enableHttps": false,
      "nodeCacheCapable": true,
      "managedResources": [
        {
          "name": "aliyun.com/gpu-mem",
          "ignoredByScheduler": false
        }
      ],
      "ignorable": false
    }
  ]
}
@jonn-yan Hi, I'm not sure whether this is compatible with k3s 1.23; I assume it is.
This file is used as a configuration file, replacing the former JSON configuration file. It's not an API resource to apply. So you should:
1. Save it to /etc/kubernetes/kube-scheduler-config.yaml (don't overwrite any other file).
2. Add config=/etc/kubernetes/kube-scheduler-config.yaml to kube-scheduler-arg, replacing the former policy-config-file=blablabla.
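The two steps above could be sketched as a k3s configuration file. This assumes k3s reads its server options from /etc/rancher/k3s/config.yaml, a path that does not appear in this thread; the same value can be passed as a --kube-scheduler-arg flag on the ExecStart line instead.

```yaml
# /etc/rancher/k3s/config.yaml (assumed location of the k3s config file)
# Each entry is handed to the embedded kube-scheduler as --<key>=<value>.
kube-scheduler-arg:
  # Replaces the former policy-config-file=... argument.
  - "config=/etc/kubernetes/kube-scheduler-config.yaml"
  # Carried over from the ExecStart line posted above.
  - "leader-elect=true"
```

The KubeSchedulerConfiguration file itself stays on disk at that path and is never passed to kubectl apply.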
@jeffguorg Thank you very much. I verified it and it works without problems.
From the installation instructions:
But the option has changed since v1.23; see the Kubernetes documentation: https://kubernetes.io/docs/reference/scheduling/policies/ Instead, we can use --config=KubeSchedulerConfiguration,
and the config file has an apiVersion of kube-scheduler-config.v1beta3, for example:
Tested on my machine and it works well.
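A sketch of what such a v1beta3 file might look like, assuming the extender settings posted earlier in this thread carry over unchanged and only the apiVersion differs. The full API group name `kubescheduler.config.k8s.io/v1beta3` and the kubeconfig path are assumptions taken from the kubeadm setup discussed above, not from this comment.

```yaml
# Hypothetical /etc/kubernetes/scheduler-policy-config.yaml for v1.23+
apiVersion: kubescheduler.config.k8s.io/v1beta3   # assumed full group/version
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf      # kubeadm default, as used above
extenders:
# Extender values copied from the configuration posted earlier in the thread.
- urlPrefix: "http://127.0.0.1:32766/gpushare-scheduler"
  filterVerb: filter
  bindVerb: bind
  enableHTTPS: false
  nodeCacheCapable: true
  managedResources:
  - name: aliyun.com/gpu-mem
    ignoredByScheduler: false
  ignorable: false
```

The file is passed to kube-scheduler with --config=<path>, not applied with kubectl.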