cloudfoundry-incubator / kubo-deployment

Contains manifests used to deploy Cloud Foundry Container Runtime
https://www.cloudfoundry.org/container-runtime/
Apache License 2.0

Kubernetes nodes restarting continuously #333

Closed ravichandra22 closed 6 years ago

ravichandra22 commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug /kind feature

/kind "help wanted" /kind bug

What happened:

I deployed a Kubernetes cluster with 5 nodes on a vSphere cluster (DRS enabled). Within 2 days, the nodes started restarting automatically.

Once a node starts rebooting, all Pods on it migrate to the other available nodes, which causes those nodes to crash as well.

A cluster with 6 nodes ended up with only 1 node in the Ready state.

ubuntu@ubuntuguest:$ kubectl get nodes
NAME                                   STATUS     ROLES     AGE       VERSION
3a8b6912-7500-40b8-9f70-4b08303f8439   NotReady   <none>    2h        v1.10.2
5e299d22-292c-4310-b215-c82c6783b080   Ready      <none>    5m        v1.10.2

Snippet from kubectl describe node:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests       Limits
  --------  --------       ------
  cpu       3470m (86%)    4510m (112%)
  memory    7086Mi (89%)   10686Mi (135%)
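
For readers unfamiliar with this output: each percentage is the total divided by the node's allocatable capacity, and the scheduler places Pods by their requests, not their limits, which is why the Limits column can exceed 100%. The sketch below is a simplified model of that arithmetic, not the real scheduler. The allocatable CPU value of 4000m is an assumption (the issue does not state it); it was chosen because it reproduces the 86% and 112% shown above. The function names are hypothetical.

```python
# Simplified model of the "Allocated resources" arithmetic (not the real scheduler).

# ASSUMPTION: allocatable CPU in millicores. Not stated in the issue; picked
# because it is consistent with the percentages in the describe output.
ALLOCATABLE_CPU_M = 4000


def percent_of_allocatable(total_m: int, allocatable_m: int) -> int:
    """Return total as a whole-number percentage of allocatable, truncated."""
    return total_m * 100 // allocatable_m


def fits(new_request_m: int, requested_m: int, allocatable_m: int) -> bool:
    """A Pod is placed only if the summed requests stay within allocatable."""
    return requested_m + new_request_m <= allocatable_m


if __name__ == "__main__":
    print(percent_of_allocatable(3470, ALLOCATABLE_CPU_M))  # CPU requests
    print(percent_of_allocatable(4510, ALLOCATABLE_CPU_M))  # CPU limits
    # Would a further Pod requesting 600m fit on this node?
    print(fits(600, 3470, ALLOCATABLE_CPU_M))
```

In this model a Pod requesting 600m more CPU would not fit and would stay Pending, while the limits already add up to more than the node can deliver if every container uses its full limit at once.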

What you expected to happen:

Can I prevent Pods from being overcommitted on a node, so that they wait in the Pending state until a node returns to the Ready state?

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

cf-gitbot commented 6 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/159324737

The labels on this github issue will be updated when the story is started.

seanos11 commented 6 years ago

Hi @ravichandra22,

It is possible to target pods to specific nodes. The relevant Kubernetes features are described here: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ and https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
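
A minimal sketch of how those features might look in a Pod spec, written in Python so it can be generated and piped to kubectl as JSON. The Pod name, image, node label (pool=general), and taint key (dedicated=workloads) are all hypothetical placeholders, and build_pod is an illustrative helper, not part of any library. The sketch also sets each container's requests equal to its limits, on the assumption that the goal is to avoid overcommit: when every container does this, the limits total on a node cannot exceed the requests total the scheduler already accounted for, and Pods that do not fit stay Pending.

```python
import json


def build_pod(name: str, image: str, cpu: str, memory: str) -> dict:
    """Build a Pod manifest whose resource requests equal its limits."""
    resources = {"cpu": cpu, "memory": memory}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Hypothetical label: schedule only onto nodes labeled pool=general.
            "nodeSelector": {"pool": "general"},
            # Hypothetical taint: tolerate nodes tainted dedicated=workloads:NoSchedule.
            "tolerations": [
                {
                    "key": "dedicated",
                    "operator": "Equal",
                    "value": "workloads",
                    "effect": "NoSchedule",
                }
            ],
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # requests == limits, so this Pod adds no overcommit.
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(build_pod("demo", "nginx", "500m", "512Mi"), indent=2))
```

If saved as pod.py (a hypothetical filename), the output could be applied with: python pod.py | kubectl apply -f -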

(Closing as answered. Please reopen if necessary)