scylladb / scylla-operator

The Kubernetes Operator for ScyllaDB
https://operator.docs.scylladb.com/
Apache License 2.0

Unable to schedule cluster pods to specific instancegroup #170

Closed: Darwiner closed this issue 4 years ago

Darwiner commented 4 years ago

Describe the bug
After the cluster is defined, the pods do not seem to match the specified instancegroup and never start; the scheduler apparently cannot find target nodes for them.

Each of the nodes created in the scylla instancegroup in question is running nothing else and has enough resources (CPU/memory) available.

To Reproduce
kubectl create -f examples/generic/test-cluster.yaml

Expected behavior
A pod should be running on 3 different nodes, each node in the same region but in a different AZ.

Config Files
test-cluster.yaml: https://paste.centos.org/view/ff600bf8
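
For readers without access to the paste, the cluster is shaped roughly like the sketch below: one rack per AZ, three members total, pinned to the kops instancegroup. The apiVersion/kind follow the operator's v1alpha1 Cluster CRD; all concrete values are illustrative stand-ins, not the exact contents of the linked file.

# Rough sketch only - values assumed, not taken from the paste
apiVersion: scylla.scylladb.com/v1alpha1
kind: Cluster
metadata:
  name: test-cluster
  namespace: scylla
spec:
  datacenter:
    name: us-east-1
    racks:
      - name: us-east-1a
        members: 1
        storage:
          capacity: 100Gi            # assumed
        resources:
          requests:
            cpu: 2                   # per the discussion below
            memory: 2Gi              # assumed
        placement:
          nodeAffinity:              # pin the rack to the kops instancegroup
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: kops.k8s.io/instancegroup
                      operator: In
                      values:
                        - scylla
      # ...two more analogous racks for us-east-1b and us-east-1f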

Logs

# kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
test-cluster-us-east-1-us-east-1a-0   0/2     Pending   0          10m
# kubectl describe pod test-cluster-us-east-1-us-east-1a-0

https://paste.centos.org/view/1964aebd

kubectl get pod test-cluster-us-east-1-us-east-1a-0 -o yaml

https://paste.centos.org/view/5a53c45e

# kops get ig scylla
Using cluster from kubectl context: use1-test.k8s-dev.example.com

NAME    ROLE    MACHINETYPE     MIN     MAX     ZONES
scylla  Node    c5.large        3       5       us-east-1a,us-east-1b,us-east-1f

Environment:

zimnx commented 4 years ago

Your paste.centos.org links are deleted 24 hours after posting, so I moved them to unlisted pastebins:

test-cluster.yaml - https://pastebin.com/9B4aherN
kubectl describe pod test-cluster-us-east-1-us-east-1a-0 - https://pastebin.com/TPVsRBUR
kubectl get pod test-cluster-us-east-1-us-east-1a-0 -o yaml - https://pastebin.com/fEvf1sWf

Can you also provide details of a us-east-1a node? Maybe there are taints there, or not enough free resources.
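
A quick way to check both on those nodes (a sketch, assuming kops applied its usual kops.k8s.io/instancegroup label):

# list any taints on the scylla instancegroup nodes
kubectl get nodes -l kops.k8s.io/instancegroup=scylla \
  -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# compare allocatable CPU/memory against what the pod requests
kubectl describe nodes -l kops.k8s.io/instancegroup=scylla | grep -A 8 'Allocatable'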

mmatczuk commented 4 years ago

Please keep them in the GH issue, not in external links.

Darwiner commented 4 years ago

Here's a kubectl describe node on one of the 3 nodes that were brought up in the scylla instancegroup.

Name:               ip-10-10-0-48.ec2.internal
Roles:              node
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c5.large
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1a
                    kops.k8s.io/instancegroup=scylla
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-10-0-48.ec2.internal
                    kubernetes.io/os=linux
                    kubernetes.io/role=node
                    node-role.kubernetes.io/node=
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.10.0.48/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 100.122.47.0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 04 Aug 2020 14:37:53 -0400
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Tue, 04 Aug 2020 14:38:04 -0400   Tue, 04 Aug 2020 14:38:04 -0400   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Wed, 05 Aug 2020 08:50:05 -0400   Tue, 04 Aug 2020 14:37:53 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 05 Aug 2020 08:50:05 -0400   Tue, 04 Aug 2020 14:37:53 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 05 Aug 2020 08:50:05 -0400   Tue, 04 Aug 2020 14:37:53 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 05 Aug 2020 08:50:05 -0400   Tue, 04 Aug 2020 14:38:03 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.10.0.48
  ExternalIP:   54.159.75.66
  Hostname:     ip-10-10-0-48.ec2.internal
  InternalDNS:  ip-10-10-0-48.ec2.internal
  ExternalDNS:  ec2-54-159-75-66.compute-1.amazonaws.com
Capacity:
 attachable-volumes-aws-ebs:  25
 cpu:                         2
 ephemeral-storage:           125753328Ki
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      3805088Ki
 pods:                        110
Allocatable:
 attachable-volumes-aws-ebs:  25
 cpu:                         2
 ephemeral-storage:           115894266893
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      3702688Ki
 pods:                        110
System Info:
 Machine ID:                 ec2c5ed4075aecd4fc44ed2dcd0fcce2
 System UUID:                EC2C5ED4-075A-ECD4-FC44-ED2DCD0FCCE2
 Boot ID:                    772a4ef2-4c94-438a-90ef-e555df733abe
 Kernel Version:             4.9.0-13-amd64
 OS Image:                   Debian GNU/Linux 9 (stretch)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.6.3
 Kubelet Version:            v1.15.9
 Kube-Proxy Version:         v1.15.9
PodCIDR:                     100.96.28.0/24
ProviderID:                  aws:///us-east-1a/i-0c76c46574370be51
Non-terminated Pods:         (5 in total)
  Namespace                  Name                                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                                  ------------  ----------  ---------------  -------------  ---
  kube-system                calico-node-fnqfr                                     100m (5%)     0 (0%)      0 (0%)           0 (0%)         18h
  kube-system                k8s-base-kube2iam-n5tgx                               50m (2%)      50m (2%)    50Mi (1%)        100Mi (2%)     18h
  kube-system                kube-proxy-ip-10-10-0-48.ec2.internal                 100m (5%)     0 (0%)      0 (0%)           0 (0%)         18h
  sre-monitoring             prometheus-operator-prometheus-node-exporter-6zj55    0 (0%)        0 (0%)      0 (0%)           0 (0%)         18h
  vlad-rundeck               rundeck-6467fccd9d-8rhwd                              500m (25%)    1 (50%)     512Mi (14%)      1Gi (28%)      25m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         750m (37%)   1050m (52%)
  memory                      562Mi (15%)  1124Mi (31%)
  ephemeral-storage           0 (0%)       0 (0%)
  attachable-volumes-aws-ebs  0            0
Events:                       <none>

PS. Someone might want to change the bug report template, as it mentions using a pastebin for config files.

mmatczuk commented 4 years ago

PS. Someone might want to change the bug report template, as it mentions using a pastebin for config files.

Done

zimnx commented 4 years ago

@Darwiner Your Scylla Cluster requests at least 2 CPUs (resources.requests), and the scylla instancegroup uses c5.large instances, which have only 2 CPUs total. The nodes are already running other pods too:

  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         750m (37%)   1050m (52%)
  memory                      562Mi (15%)  1124Mi (31%)
  ephemeral-storage           0 (0%)       0 (0%)
  attachable-volumes-aws-ebs  0            0

You can either upgrade the nodes to a bigger instance type, or lower the resource requests/limits of the Scylla Cluster.
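
For the second option, a minimal sketch of the relevant fields in each rack of the Cluster spec (values illustrative; they just need to fit within the roughly 1250m CPU and 3Gi memory still allocatable on the node above):

resources:
  requests:
    cpu: 1          # node has 2 CPUs allocatable, 750m already requested
    memory: 2Gi
  limits:
    cpu: 1
    memory: 2Gi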

Darwiner commented 4 years ago

@zimnx Ugh, thanks. That seems to have been the issue: lack of allocatable resources...

I've now made the 3 nodes created via the "scylla" instancegroup c5.xlarge instead, and also added a dedicated=scylla:NoSchedule taint to the nodes plus a matching toleration in the cluster definition (sketched below). That should keep the scheduler from placing pods onto these nodes unless they have the toleration in place.
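
Roughly, assuming the usual kops label and the v1alpha1 placement fields (a sketch of what's described above, not the actual manifest):

# taint the instancegroup's nodes so only tolerating pods schedule there
kubectl taint nodes -l kops.k8s.io/instancegroup=scylla dedicated=scylla:NoSchedule

# matching toleration under each rack's placement in the cluster definition
placement:
  tolerations:
    - key: dedicated
      operator: Equal
      value: scylla
      effect: NoSchedule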

Cluster is up and running now.

Thanks!