Closed katoomegumi closed 7 months ago
cc @yuey002 PTAL
@katoomegumi Scheduling the basic pod workload works for me. I think it's most likely that the godel system was not installed correctly or ran into errors in your env. Would you mind running the commands below to check the status of the godel components?
Check the godel dispatcher/scheduler/binder status, and if any pod is not Running, use describe to check the details:
kubectl get pods -n godel-system
If all godel pods are running, check the logs (replace the pod name with your dispatcher pod):
kubectl logs dispatcher-76bcfcb9d7-jtlcx -n godel-system | grep -i basic
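As a rough aid for the first check, the STATUS column can be filtered directly. The helper below is only a sketch: `not_running_pods` is a hypothetical name, not part of godel. It reads the table printed by `kubectl get pods` and prints the names of pods whose STATUS is anything other than Running. Note that the `Status:` field in `kubectl describe` reports the pod phase, which can still read Running while a container is in CrashLoopBackOff, so the STATUS column of `kubectl get pods` is the more telling one.

```shell
# Hypothetical helper: print the names of pods whose STATUS column is not "Running".
# Input: the table printed by `kubectl get pods` (header line first), read from stdin.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Usage (assumes kubectl points at the kind cluster):
#   kubectl get pods -n godel-system | not_running_pods
```

Each name it prints can then be passed to `kubectl describe pod <name> -n godel-system`.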
@yuey002 These pods are not running:
$ kubectl get pods -n godel-system
NAME READY STATUS RESTARTS AGE
binder-556bcdcfdd-z9r79 0/1 CrashLoopBackOff 189 (2m19s ago) 15h
dispatcher-6f444dc587-dzn4h 0/1 CrashLoopBackOff 189 (74s ago) 15h
scheduler-7694d9dbdd-vmjlh 0/1 CrashLoopBackOff 189 (2m45s ago) 15h
$ kubectl describe pods -n godel-system
Name: binder-556bcdcfdd-z9r79
Namespace: godel-system
Priority: 0
Service Account: godel
Node: godel-demo-default-control-plane/172.19.0.3
Start Time: Tue, 26 Mar 2024 00:48:24 +0800
Labels: app=binder
pod-template-hash=556bcdcfdd
Annotations: <none>
Status: Running
IP: 10.244.0.5
IPs:
IP: 10.244.0.5
Controlled By: ReplicaSet/binder-556bcdcfdd
Containers:
binder:
Container ID: containerd://16d9e144c58fef6e4ff0b2d123b878e3f7580eaca5b1bb64be9eb55f9cf52632
Image: godel-local:latest
Image ID: docker.io/library/import-2024-03-25@sha256:98605c312771b5ab50725c193db8c9212f3d7c1ab78269f98dde3046b40fe254
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/binder
Args:
--leader-elect=false
--tracer=noop
--v=5
--config=/config/binder.config
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 26 Mar 2024 16:36:25 +0800
Finished: Tue, 26 Mar 2024 16:36:25 +0800
Ready: False
Restart Count: 190
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
Environment: <none>
Mounts:
/config from binder-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b46np (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
binder-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: godel-binder-config
Optional: false
kube-api-access-b46np:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: node-role.kubernetes.io/control-plane=
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 2m47s (x4369 over 15h) kubelet Back-off restarting failed container binder in pod binder-556bcdcfdd-z9r79_godel-system(5db9f4cc-8e24-4712-8b4f-ba5e267263c7)
Name: dispatcher-6f444dc587-dzn4h
Namespace: godel-system
Priority: 0
Service Account: godel
Node: godel-demo-default-control-plane/172.19.0.3
Start Time: Tue, 26 Mar 2024 00:48:24 +0800
Labels: app=godel-dispatcher
pod-template-hash=6f444dc587
Annotations: <none>
Status: Running
IP: 10.244.0.6
IPs:
IP: 10.244.0.6
Controlled By: ReplicaSet/dispatcher-6f444dc587
Containers:
dispatcher:
Container ID: containerd://2312785d6170d2239845259942210693a77e9d269777abd43462948f29126a49
Image: godel-local:latest
Image ID: docker.io/library/import-2024-03-25@sha256:98605c312771b5ab50725c193db8c9212f3d7c1ab78269f98dde3046b40fe254
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/dispatcher
Args:
--leader-elect=false
--tracer=noop
--v=5
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 26 Mar 2024 16:37:34 +0800
Finished: Tue, 26 Mar 2024 16:37:34 +0800
Ready: False
Restart Count: 190
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2p59d (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-2p59d:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: node-role.kubernetes.io/control-plane=
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 2m56s (x4369 over 15h) kubelet Back-off restarting failed container dispatcher in pod dispatcher-6f444dc587-dzn4h_godel-system(eafebf02-eb7c-48c7-9e06-8841e41c9e80)
Name: scheduler-7694d9dbdd-vmjlh
Namespace: godel-system
Priority: 0
Service Account: godel
Node: godel-demo-default-control-plane/172.19.0.3
Start Time: Tue, 26 Mar 2024 00:48:24 +0800
Labels: app=godel-scheduler
pod-template-hash=7694d9dbdd
Annotations: <none>
Status: Running
IP: 10.244.0.7
IPs:
IP: 10.244.0.7
Controlled By: ReplicaSet/scheduler-7694d9dbdd
Containers:
scheduler:
Container ID: containerd://717da0fc5ef5b8fb407af31ca48c5e103796b4142412e111c01f10a3f21ea8bb
Image: godel-local:latest
Image ID: docker.io/library/import-2024-03-25@sha256:98605c312771b5ab50725c193db8c9212f3d7c1ab78269f98dde3046b40fe254
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/scheduler
Args:
--leader-elect=false
--tracer=noop
--v=4
--disable-preemption=false
--config=/config/scheduler.config
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 26 Mar 2024 16:41:01 +0800
Finished: Tue, 26 Mar 2024 16:41:01 +0800
Ready: False
Restart Count: 191
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
Environment: <none>
Mounts:
/config from scheduler-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c58tz (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
scheduler-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: godel-scheduler-config
Optional: false
kube-api-access-c58tz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: node-role.kubernetes.io/control-plane=
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 2m47s (x4369 over 15h) kubelet Back-off restarting failed container scheduler in pod scheduler-7694d9dbdd-vmjlh_godel-system(a13880a0-5efe-4b07-b003-5371d361a712)
@katoomegumi Thank you! Could you also show me the logs?
kubectl logs scheduler-7694d9dbdd-vmjlh -n godel-system
kubectl logs dispatcher-6f444dc587-dzn4h -n godel-system
kubectl logs binder-556bcdcfdd-z9r79 -n godel-system
@yuey002 Sorry, I tried, but the components fail to start. The log output is as follows.
/usr/local/bin/scheduler: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/local/bin/scheduler)
/usr/local/bin/scheduler: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/local/bin/scheduler)
/usr/local/bin/dispatcher: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/local/bin/dispatcher)
/usr/local/bin/dispatcher: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/local/bin/dispatcher)
/usr/local/bin/binder: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/local/bin/binder)
/usr/local/bin/binder: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/local/bin/binder)
The logs say that 'GLIBC_2.32' and 'GLIBC_2.34' are not found, but I can find them:
strings /lib/x86_64-linux-gnu/libc.so.6 |grep GLIBC_
GLIBC_2.2.5
GLIBC_2.2.6
GLIBC_2.3
GLIBC_2.3.2
GLIBC_2.3.3
GLIBC_2.3.4
GLIBC_2.4
GLIBC_2.5
GLIBC_2.6
GLIBC_2.7
GLIBC_2.8
GLIBC_2.9
GLIBC_2.10
GLIBC_2.11
GLIBC_2.12
GLIBC_2.13
GLIBC_2.14
GLIBC_2.15
GLIBC_2.16
GLIBC_2.17
GLIBC_2.18
GLIBC_2.22
GLIBC_2.23
GLIBC_2.24
GLIBC_2.25
GLIBC_2.26
GLIBC_2.27
GLIBC_2.28
GLIBC_2.29
GLIBC_2.30
GLIBC_2.31
GLIBC_2.32
GLIBC_2.33
GLIBC_2.34
GLIBC_2.35
GLIBC_PRIVATE
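A note on the check above: the loader message refers to the libc.so.6 inside the container image, so a `strings` run on the host (which is presumably where the listing above came from) does not show what the crashing binaries actually see. As a small sketch, the versions the binaries ask for can be pulled out of the error lines. `required_glibc_versions` is a hypothetical helper name, and `sort -V` assumes GNU coreutils.

```shell
# Hypothetical helper: list the distinct GLIBC symbol versions named in
# dynamic-loader error lines read from stdin, in ascending version order.
# Assumes GNU sort (for -V).
required_glibc_versions() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sort -u -V
}

# Usage (pod name taken from the thread above):
#   kubectl logs scheduler-7694d9dbdd-vmjlh -n godel-system 2>&1 | required_glibc_versions
```

The highest version printed is the minimum glibc the base image has to provide.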
@katoomegumi Thanks for the info. I believe there are some compatibility issues between the local env and the Docker image. I made some quick fixes to move the build process into the Dockerfile:
https://github.com/yuey002/godel-scheduler/tree/dev/yuey002/fix-dockerfile
When you get a chance, could you clone my forked repo, check out 'dev/yuey002/fix-dockerfile', and go over the quick start to see if it fixes your issue? I have verified it in my env, but would like to see if it works in yours too. Thank you!
@yuey002 Sorry, I cloned the 'fix-dockerfile' branch and the error is still the same.
$ kubectl get pods -n godel-system
NAME READY STATUS RESTARTS AGE
binder-8b46dbd65-cblps 0/1 CrashLoopBackOff 3 (32s ago) 74s
dispatcher-69f7d646b8-kl52j 0/1 CrashLoopBackOff 3 (32s ago) 74s
scheduler-59cbb6c57-w2f44 0/1 CrashLoopBackOff 3 (34s ago) 74s
$ kubectl describe pods -n godel-system
Name: binder-8b46dbd65-cblps
Namespace: godel-system
Priority: 0
Service Account: godel
Node: godel-demo-default-control-plane/172.19.0.2
Start Time: Thu, 28 Mar 2024 16:22:58 +0800
Labels: app=binder
pod-template-hash=8b46dbd65
Annotations: <none>
Status: Running
IP: 10.244.0.6
IPs:
IP: 10.244.0.6
Controlled By: ReplicaSet/binder-8b46dbd65
Containers:
binder:
Container ID: containerd://7112861a74be037a524e833eee8770de5820d9dea455c2d522568c919de3eb90
Image: godel-local:latest
Image ID: docker.io/library/import-2024-03-28@sha256:a6c52c60e2e7e7b847aa3be08c40fb1cb89031f569dad9c77f8ee1dd6677dba3
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/binder
Args:
--leader-elect=false
--tracer=noop
--v=5
--config=/config/binder.config
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 28 Mar 2024 16:23:40 +0800
Finished: Thu, 28 Mar 2024 16:23:40 +0800
Ready: False
Restart Count: 3
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
Environment: <none>
Mounts:
/config from binder-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sshtl (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
binder-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: godel-binder-config
Optional: false
kube-api-access-sshtl:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: node-role.kubernetes.io/control-plane=
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 89s default-scheduler Successfully assigned godel-system/binder-8b46dbd65-cblps to godel-demo-default-control-plane
Normal Pulled 47s (x4 over 88s) kubelet Container image "godel-local:latest" already present on machine
Normal Created 47s (x4 over 88s) kubelet Created container binder
Normal Started 47s (x4 over 88s) kubelet Started container binder
Warning BackOff 9s (x7 over 85s) kubelet Back-off restarting failed container binder in pod binder-8b46dbd65-cblps_godel-system(53fd92af-2045-46af-b298-fea2a7a64dae)
Name: dispatcher-69f7d646b8-kl52j
Namespace: godel-system
Priority: 0
Service Account: godel
Node: godel-demo-default-control-plane/172.19.0.2
Start Time: Thu, 28 Mar 2024 16:22:58 +0800
Labels: app=godel-dispatcher
pod-template-hash=69f7d646b8
Annotations: <none>
Status: Running
IP: 10.244.0.5
IPs:
IP: 10.244.0.5
Controlled By: ReplicaSet/dispatcher-69f7d646b8
Containers:
dispatcher:
Container ID: containerd://c690936e9a194622d9dbd6526323b5ec8943a2779b44c2f748ef04f48e28a13c
Image: godel-local:latest
Image ID: docker.io/library/import-2024-03-28@sha256:a6c52c60e2e7e7b847aa3be08c40fb1cb89031f569dad9c77f8ee1dd6677dba3
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/dispatcher
Args:
--leader-elect=false
--tracer=noop
--v=5
State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 28 Mar 2024 16:24:22 +0800
Finished: Thu, 28 Mar 2024 16:24:22 +0800
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 28 Mar 2024 16:23:40 +0800
Finished: Thu, 28 Mar 2024 16:23:40 +0800
Ready: False
Restart Count: 4
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wdrzz (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-wdrzz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: node-role.kubernetes.io/control-plane=
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 89s default-scheduler Successfully assigned godel-system/dispatcher-69f7d646b8-kl52j to godel-demo-default-control-plane
Normal Pulled 5s (x5 over 88s) kubelet Container image "godel-local:latest" already present on machine
Normal Created 5s (x5 over 88s) kubelet Created container dispatcher
Normal Started 5s (x5 over 88s) kubelet Started container dispatcher
Warning BackOff 4s (x7 over 85s) kubelet Back-off restarting failed container dispatcher in pod dispatcher-69f7d646b8-kl52j_godel-system(4e2362a1-7079-4389-ba1e-8b7197e3e6ac)
Name: scheduler-59cbb6c57-w2f44
Namespace: godel-system
Priority: 0
Service Account: godel
Node: godel-demo-default-control-plane/172.19.0.2
Start Time: Thu, 28 Mar 2024 16:22:58 +0800
Labels: app=godel-scheduler
pod-template-hash=59cbb6c57
Annotations: <none>
Status: Running
IP: 10.244.0.7
IPs:
IP: 10.244.0.7
Controlled By: ReplicaSet/scheduler-59cbb6c57
Containers:
scheduler:
Container ID: containerd://b2fdffd6c0cb1451684b708f91f1005133cdc602289d4c406ddd4714a1689ac2
Image: godel-local:latest
Image ID: docker.io/library/import-2024-03-28@sha256:a6c52c60e2e7e7b847aa3be08c40fb1cb89031f569dad9c77f8ee1dd6677dba3
Port: <none>
Host Port: <none>
Command:
/usr/local/bin/scheduler
Args:
--leader-elect=false
--tracer=noop
--v=4
--disable-preemption=false
--config=/config/scheduler.config
State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 28 Mar 2024 16:24:21 +0800
Finished: Thu, 28 Mar 2024 16:24:21 +0800
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 28 Mar 2024 16:23:38 +0800
Finished: Thu, 28 Mar 2024 16:23:38 +0800
Ready: False
Restart Count: 4
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
Environment: <none>
Mounts:
/config from scheduler-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vz24r (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
scheduler-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: godel-scheduler-config
Optional: false
kube-api-access-vz24r:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: node-role.kubernetes.io/control-plane=
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 89s default-scheduler Successfully assigned godel-system/scheduler-59cbb6c57-w2f44 to godel-demo-default-control-plane
Normal Pulled 6s (x5 over 88s) kubelet Container image "godel-local:latest" already present on machine
Normal Created 6s (x5 over 88s) kubelet Created container scheduler
Normal Started 6s (x5 over 88s) kubelet Started container scheduler
Warning BackOff 5s (x7 over 85s) kubelet Back-off restarting failed container scheduler in pod scheduler-59cbb6c57-w2f44_godel-system(c8e33ae7-6cd5-4ca6-93a2-d67726d33c47)
$ kubectl logs scheduler-59cbb6c57-w2f44 -n godel-system
/usr/local/bin/scheduler: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/local/bin/scheduler)
/usr/local/bin/scheduler: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/local/bin/scheduler)
@katoomegumi Thanks for retrying and sharing the details! This error looks weird to me, since we are running the same Dockerfile, which includes the whole build process... I think it may be because the old godel-local image was not cleaned up properly.
I made a few additional changes to my branch 'dev/yuey002/fix-dockerfile' in my forked repo https://github.com/yuey002/godel-scheduler/tree/dev/yuey002/fix-dockerfile. Specifically, I used debian:latest for the base image, which has GLIBC versions up to 2.36. When you get a chance, could you quickly try:
1- check out my branch
git checkout dev/yuey002/fix-dockerfile
2- set up the local cluster env. Could you please also paste the output of 'make local-up'?
make local-up
3- check godel component pods status
kubectl get po -n godel-system
4- if the error is still the same, exec into the pod and check the glibc version
kubectl exec -it scheduler-77cfcb585d-ldzqp -n godel-system -- /bin/bash
root@scheduler-77cfcb585d-ldzqp:~# ldd --version
Thanks.
@yuey002 Thanks for your reply! I did as you suggested. I cloned the branch again and checked it out.
$ git checkout dev/yuey002/fix-dockerfile
M manifests/quickstart-feature-examples/godel-demo-default.yaml
Already on 'dev/yuey002/fix-dockerfile'
Your branch is up to date with 'origin/dev/yuey002/fix-dockerfile'.
Because my kubectl version is 1.29.2, I changed 'manifests/quickstart-feature-examples/godel-demo-default.yaml' as follows. I don't think this causes the error.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: godel-demo-default
nodes:
- role: control-plane
image: kindest/node:v1.29.2
- role: worker
image: kindest/node:v1.29.2
Here is the output of 'make local-up':
$ make local-up
find: ‘build’: No such file or directory
dirname: missing operand
Try 'dirname --help' for more information.
./hack/make-rules/build-images.sh
Building docker image(s) for ...
Total reclaimed space: 0B
Total reclaimed space: 0B
Untagged: godel-local:5d8c5de
Untagged: godel-local:latest
Deleted: sha256:883a86e804057fedd3538351e951ebbbdc68ab60d047aa441d8f54e3895558ba
Error response from daemon: No such image: 883a86e80405:latest
[+] Building 16.1s (21/21) FINISHED docker:default
=> [internal] load build definition from godel-local.Dockerfile 0.0s
=> => transferring dockerfile: 661B 0.0s
=> [internal] load metadata for docker.io/library/debian:latest 15.3s
=> [internal] load metadata for docker.io/library/golang:1.21 0.9s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [builder 1/11] FROM docker.io/library/golang:1.21@sha256:856073656d1a517517792e6cdd2f7a5ef080d3ca2dff33e518c8412f140fdd2d 0.0s
=> [internal] load build context 0.7s
=> => transferring context: 634.51kB 0.7s
=> [stage-1 1/4] FROM docker.io/library/debian:latest@sha256:2906804d2a64e8a13a434a1a127fe3f6a28bf7cf3696be4223b06276f32f1f2d 0.0s
=> CACHED [stage-1 2/4] RUN apt-get update && apt-get install -y binutils && apt-get clean && ldd --version 0.0s
=> CACHED [stage-1 3/4] WORKDIR /root 0.0s
=> CACHED [builder 2/11] WORKDIR /workspace 0.0s
=> CACHED [builder 3/11] COPY go.mod go.mod 0.0s
=> CACHED [builder 4/11] COPY go.sum go.sum 0.0s
=> CACHED [builder 5/11] COPY cmd/ cmd/ 0.0s
=> CACHED [builder 6/11] COPY pkg/ pkg/ 0.0s
=> CACHED [builder 7/11] COPY hack/ hack/ 0.0s
=> CACHED [builder 8/11] COPY vendor/ vendor/ 0.0s
=> CACHED [builder 9/11] COPY Makefile Makefile 0.0s
=> CACHED [builder 10/11] COPY Makefile.expansion Makefile.expansion 0.0s
=> CACHED [builder 11/11] RUN export GO_BUILD_PLATFORMS=linux/amd64 && make build 0.0s
=> CACHED [stage-1 4/4] COPY --from=builder /workspace/bin/linux_amd64/* /usr/local/bin/ 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:883a86e804057fedd3538351e951ebbbdc68ab60d047aa441d8f54e3895558ba 0.0s
=> => naming to docker.io/library/godel-local:5d8c5de 0.0s
bash ./hack/make-rules/local-up.sh godel-demo-default
+++ dirname ./hack/make-rules/local-up.sh
++ cd ./hack/make-rules/../..
++ pwd -P
+ REPO_ROOT=/home/szp/godel-scheduler
+ CLUSTER_NAME=godel-demo-default
+ create_cluster /home/szp/godel-scheduler/manifests/quickstart-feature-examples/godel-demo-default.yaml
+ local cluster_config=/home/szp/godel-scheduler/manifests/quickstart-feature-examples/godel-demo-default.yaml
+ nohup kind delete cluster --name=godel-demo-default
+ kind create cluster --config=/home/szp/godel-scheduler/manifests/quickstart-feature-examples/godel-demo-default.yaml
Creating cluster "godel-demo-default" ...
✓ Ensuring node image (kindest/node:v1.29.2) 🖼
✓ Preparing nodes 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-godel-demo-default"
You can now use your cluster with:
kubectl cluster-info --context kind-godel-demo-default
Thanks for using kind! 😊
+ kind load docker-image --nodes godel-demo-default-control-plane godel-local:latest --name godel-demo-default
Image: "godel-local:latest" with ID "sha256:883a86e804057fedd3538351e951ebbbdc68ab60d047aa441d8f54e3895558ba" not yet present on node "godel-demo-default-control-plane", loading...
+ kustomize build /home/szp/godel-scheduler/manifests/base
+ kubectl apply -f -
# Warning: 'bases' is deprecated. Please use 'resources' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
namespace/godel-system created
customresourcedefinition.apiextensions.k8s.io/customnoderesources.node.katalyst.kubewharf.io created
customresourcedefinition.apiextensions.k8s.io/nmnodes.node.godel.kubewharf.io created
customresourcedefinition.apiextensions.k8s.io/podgroups.scheduling.godel.kubewharf.io created
customresourcedefinition.apiextensions.k8s.io/schedulers.scheduling.godel.kubewharf.io created
serviceaccount/godel created
clusterrole.rbac.authorization.k8s.io/godel created
clusterrolebinding.rbac.authorization.k8s.io/godel created
configmap/godel-binder-config created
configmap/godel-scheduler-config created
deployment.apps/binder created
deployment.apps/dispatcher created
deployment.apps/scheduler created
And I can't exec into the pod. I checked the containers and there is no container named 'scheduler'. Could the problem be in creating the scheduler container?
$ kubectl exec -it scheduler-59cbb6c57-pxj8t -n godel-system -- /bin/bash
error: unable to upgrade connection: container not found ("scheduler")
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0979d0615a86 kindest/node:v1.29.2 "/usr/local/bin/entr…" 20 minutes ago Up 20 minutes 127.0.0.1:41913->6443/tcp godel-demo-default-control-plane
2c67ed533c3f kindest/node:v1.29.2 "/usr/local/bin/entr…" 20 minutes ago Up 20 minutes godel-demo-default-worker
cb2cbd3be06e deathstarbench/social-network-microservices:latest "PostStorageService" 5 weeks ago Up 8 days 0.0.0.0:10002->9090/tcp, :::10002->9090/tcp socialnetwork_post-storage-service_1
673b45159edf deathstarbench/social-network-microservices:latest "MediaService" 5 weeks ago Up 8 days socialnetwork_media-service_1
d7d4a2f122b0 deathstarbench/social-network-microservices:latest "SocialGraphService" 5 weeks ago Up 8 days socialnetwork_social-graph-service_1
eaac26a830ae deathstarbench/social-network-microservices:latest "UserService" 5 weeks ago Up 8 days socialnetwork_user-service_1
61d63b4ca889 yg397/openresty-thrift:xenial "/usr/local/openrest…" 5 weeks ago Up 8 days 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp socialnetwork_nginx-thrift_1
81640a1c7722 yg397/media-frontend:xenial "/usr/local/openrest…" 5 weeks ago Up 8 days 0.0.0.0:8081->8080/tcp, :::8081->8080/tcp socialnetwork_media-frontend_1
2918ab8242d0 deathstarbench/social-network-microservices:latest "UserTimelineService" 5 weeks ago Up 8 days socialnetwork_user-timeline-service_1
3aa3d82dcb10 deathstarbench/social-network-microservices:latest "UserMentionService" 5 weeks ago Up 8 days socialnetwork_user-mention-service_1
06389ac3c30c deathstarbench/social-network-microservices:latest "UniqueIdService" 5 weeks ago Up 8 days socialnetwork_unique-id-service_1
39fc3ee2d205 deathstarbench/social-network-microservices:latest "HomeTimelineService" 5 weeks ago Up 8 days socialnetwork_home-timeline-service_1
c29b6c1a32bc deathstarbench/social-network-microservices:latest "ComposePostService" 5 weeks ago Up 8 days socialnetwork_compose-post-service_1
c073fdab9dfa deathstarbench/social-network-microservices:latest "UrlShortenService" 5 weeks ago Up 8 days socialnetwork_url-shorten-service_1
4e0aaa449d59 deathstarbench/social-network-microservices:latest "TextService" 5 weeks ago Up 8 days socialnetwork_text-service_1
51a699339b56 jaegertracing/all-in-one:latest "/go/bin/all-in-one-…" 5 weeks ago Up 8 days 4317-4318/tcp, 5775/udp, 5778/tcp, 9411/tcp, 14250/tcp, 14268/tcp, 6831-6832/udp, 0.0.0.0:16686->16686/tcp, :::16686->16686/tcp socialnetwork_jaeger-agent_1
da52653b9488 mongo:4.4.6 "docker-entrypoint.s…" 5 weeks ago Up 8 days 27017/tcp socialnetwork_url-shorten-mongodb_1
f1f5675ed65d redis "docker-entrypoint.s…" 5 weeks ago Up 8 days 6379/tcp socialnetwork_social-graph-redis_1
c1a6e49ad8b6 mongo:4.4.6 "docker-entrypoint.s…" 5 weeks ago Up 8 days 27017/tcp socialnetwork_user-timeline-mongodb_1
ed35bcf9acb0 mongo:4.4.6 "docker-entrypoint.s…" 5 weeks ago Up 8 days 27017/tcp socialnetwork_social-graph-mongodb_1
5103515e1671 memcached "docker-entrypoint.s…" 5 weeks ago Up 8 days 11211/tcp socialnetwork_user-memcached_1
ed5a645dfdd0 redis "docker-entrypoint.s…" 5 weeks ago Up 8 days 6379/tcp socialnetwork_home-timeline-redis_1
8197de30b166 memcached "docker-entrypoint.s…" 5 weeks ago Up 8 days 11211/tcp socialnetwork_media-memcached_1
f91275667be7 mongo:4.4.6 "docker-entrypoint.s…" 5 weeks ago Up 8 days 27017/tcp socialnetwork_media-mongodb_1
9273e04185ad mongo:4.4.6 "docker-entrypoint.s…" 5 weeks ago Up 8 days 27017/tcp socialnetwork_user-mongodb_1
3611ab173907 mongo:4.4.6 "docker-entrypoint.s…" 5 weeks ago Up 8 days 27017/tcp socialnetwork_post-storage-mongodb_1
82d545d7f960 redis "docker-entrypoint.s…" 5 weeks ago Up 8 days 6379/tcp socialnetwork_user-timeline-redis_1
7deabbfba02a memcached "docker-entrypoint.s…" 5 weeks ago Up 8 days 11211/tcp socialnetwork_post-storage-memcached_1
cbdf730f77f5 memcached "docker-entrypoint.s…" 5 weeks ago Up 8 days 11211/tcp socialnetwork_url-shorten-memcached_1
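On the `kubectl exec` failure above: exec needs a running container, and these containers exit within the same second they start, so "container not found" is most likely a symptom of the crash loop and not a sign that the scheduler container was never created. (Pod containers also run inside the kind node, so they would not appear in the host's `docker ps` either way.) The image can be inspected without a running pod instead. The sketch below is one way to do that under stated assumptions: `glibc_version_from_ldd` and `version_ge` are hypothetical helper names, `sort -V` assumes GNU coreutils, and the `docker run` line in the usage comment assumes the locally built `godel-local:latest` image contains `ldd`.

```shell
# Hypothetical helper: read `ldd --version` output on stdin and print the
# glibc version, which is the last field of the first line.
glibc_version_from_ldd() {
  head -n 1 | awk '{ print $NF }'
}

# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on GNU sort's version ordering (-V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Usage: the binaries in this thread ask for GLIBC_2.34.
#   have="$(docker run --rm --entrypoint ldd godel-local:latest --version | glibc_version_from_ldd)"
#   version_ge "$have" 2.34 && echo "image glibc $have is new enough" || echo "image glibc $have is too old"
```

Running `ldd` through `docker run` sidesteps the crash loop, because the container then starts `ldd` and never runs the failing binary.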
@katoomegumi Thanks for this follow-up. Does your scheduler-59cbb6c57-pxj8t pod still show the same error logs?
kubectl logs scheduler-59cbb6c57-pxj8t -n godel-system
kubectl logs scheduler-59cbb6c57-pxj8t -n godel-system
yes.
/usr/local/bin/scheduler: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/local/bin/scheduler)
/usr/local/bin/scheduler: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/local/bin/scheduler)
@katoomegumi Thanks! We need a bit more information about your local env to further triage this issue. Could you check the digest for debian:latest?
docker images --digests | grep -i debian | grep -i latest
The expected digest SHA is e97ee92bf1e11a2de654e9f3da827d8dce32b54e0490ac83bfc65c8706568116. If yours differs, the debian image is probably the root cause.
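The digest comparison above can be scripted roughly as follows. This is a sketch, not part of the godel tooling: `image_digest` is a hypothetical helper that reads the table printed by `docker images --digests` and prints the DIGEST column for one repository and tag, and the expected value is simply the one quoted above.

```shell
# Hypothetical helper: print the DIGEST column for repository $1 and tag $2
# from `docker images --digests` output read on stdin.
image_digest() {
  awk -v repo="$1" -v tag="$2" '$1 == repo && $2 == tag { print $3 }'
}

# Usage: compare the local debian:latest digest with the expected one.
#   expected="sha256:e97ee92bf1e11a2de654e9f3da827d8dce32b54e0490ac83bfc65c8706568116"
#   actual="$(docker images --digests | image_digest debian latest)"
#   [ "$actual" = "$expected" ] && echo "digest matches" || echo "digest differs or image missing: ${actual:-<none>}"
```

An empty result means there is no local image with that repository and tag.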
@yuey002 When I executed the command, it gave no output. It seems there was no image named debian. I pulled the debian image and got the result below. What should I do to fix it?
debian latest sha256:2906804d2a64e8a13a434a1a127fe3f6a28bf7cf3696be4223b06276f32f1f2d 6f4986d78878 2 years ago 124MB
@katoomegumi I see, it's most likely due to the older debian image then. One thing you could do is delete your local debian:latest image and then pull that image again. For how to delete the image, see https://docs.docker.com/reference/cli/docker/image/rm/
To make things easier, I modified my Dockerfile to pin a specific version of debian (debian:bookworm). If you pull the dev/yuey002/fix-dockerfile branch again, you should be able to see the change. Run make local-up again to see if the three godel pods are up and running this time.
Below is a screenshot of my output. You can see debian:bookworm as well as the digest SHA of the image.
Thanks.
@yuey002 Thanks very much, it works.
@NickrenREN Could you help take a look at https://github.com/kubewharf/godel-scheduler/pull/39 ? It contains some improvements to the local env setup, so that we can prevent similar issues in the future. After it's merged, I think we can close this issue. Thanks.
@yuey002 Sure, get it merged, thanks for the fix.
After executing
$ kubectl apply -f manifests/quickstart-feature-examples/basic-pod-scheduling/deployment.yaml
the pods stay Pending and never become Running. There is no log output and there are no pod events.
I edited the godel-demo-default.yaml file, changing 'image: kindest/node:v1.21.1' to 'v1.29.2'.
Environment
kubectl version: 1.29.2
docker version: 26.0.0
kind version: v0.22.0
go version: 1.22.0
kustomize version: v5.3.0