kahkhang / kube-linode

:whale: Provision a Kubernetes/CoreOS cluster on Linode

Storage:ceph:rook. No nodes are available that match all of the following predicates:: MatchNodeSelector (1), PodToleratesNodeTaints (1). #42

Closed c835722 closed 7 years ago

c835722 commented 7 years ago

After my 2-node cluster has been running for a day or so, I've noticed that a number of the pods in the rook namespace have the following status:

No nodes are available that match all of the following predicates:: MatchNodeSelector (1), PodToleratesNodeTaints (1).

The pods are stuck in the Pending state.

What debugging information can I supply that would be most pertinent to resolving this issue? And what is the next move to get back to a healthy state?
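
For reference, a minimal starting point for gathering that information might look like the following (the pod name is a placeholder):

```sh
# Show which rook pods are stuck and where they were (not) scheduled
kubectl get pods -n rook -o wide

# Inspect the scheduler's events for a specific pending pod
kubectl describe pod <pod-name> -n rook

# Recent events in the namespace, including scheduling failures
kubectl get events -n rook
```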

kahkhang commented 7 years ago

It means that there are not enough resources to schedule those pods. You can either increase the size of your node or create another node in order to schedule them.
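
To confirm it is a capacity problem, one way is to compare each node's allocatable resources against what is already requested on it (the node name below is a placeholder):

```sh
# The "Allocatable" and "Allocated resources" sections show whether
# the node has room left for the pending pods' resource requests
kubectl describe node <node-name>

# Node labels and taints, which the MatchNodeSelector and
# PodToleratesNodeTaints predicates evaluate
kubectl get nodes --show-labels
```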

c835722 commented 7 years ago

OK. I did resize the worker node from 1GB to 2GB after the build. I'm trying to get the balance right between cost and Kubernetes functionality. I'll report back with my actions and outcomes.
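
For what it's worth, one way to check that the cluster registered the new memory after the resize (the worker node name is a placeholder):

```sh
# Capacity and Allocatable should now reflect the 2GB resize
kubectl describe node <worker-node-name> | grep -A 6 Capacity
```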

c835722 commented 7 years ago

Perhaps a minimum configuration should be documented to save users from having to experiment with this capacity issue. Or perhaps, as part of the testing that is planned, a validation of minimum capacity should be done. (As the stack gets taller, the capacity required per release may go up; as tuning goes on, it may come down.)

c835722 commented 7 years ago

Going to try a reboot on the worker and just monitor that the machine's resources are stable before I move up to a larger capacity configuration.
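
A simple way to watch the recovery after the reboot might be:

```sh
# Watch the rook pods transition back to Running after the reboot
kubectl get pods -n rook -w

# If a metrics source (e.g. heapster) is deployed, spot-check node usage
kubectl top node
```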

c835722 commented 7 years ago

After the worker node restart, only 1 pod failed to recover:

rook-ceph-mgr0-2487684371-3hvs5 (namespace rook, node 172.104.175.126): Waiting: CrashLoopBackOff, 47 restarts, 7 hours. Error syncing pod: Back-off restarting failed container.
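
Digging into why that container keeps crashing usually starts with its logs and events, e.g.:

```sh
# Logs from the current and the previously crashed container instance
kubectl logs rook-ceph-mgr0-2487684371-3hvs5 -n rook
kubectl logs rook-ceph-mgr0-2487684371-3hvs5 -n rook --previous

# The Events section at the bottom often shows the crash reason
kubectl describe pod rook-ceph-mgr0-2487684371-3hvs5 -n rook
```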

kahkhang commented 7 years ago

You should delete the pod, and the replication controller will start a new one.
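
Concretely, that would be something like:

```sh
# Delete the crash-looping pod; its controller should create a replacement
kubectl delete pod rook-ceph-mgr0-2487684371-3hvs5 -n rook
```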

kahkhang commented 7 years ago

Please feel free to reopen this issue if the pod still doesn't come back up.