hjacobs / kube-aws-autoscaler

Simple, elastic Kubernetes cluster autoscaler for AWS Auto Scaling Groups
GNU General Public License v3.0

Autoscaler fails with a KeyError #37

Open sgmiller opened 7 years ago

sgmiller commented 7 years ago

Kubernetes 1.7.4

2017-10-13 14:57:32,835 ERROR: Failed to autoscale
Traceback (most recent call last):
  File "/kube_aws_autoscaler/main.py", line 387, in main
    include_master_nodes=args.include_master_nodes, dry_run=args.dry_run)
  File "/kube_aws_autoscaler/main.py", line 334, in autoscale
    all_nodes = get_nodes(api, include_master_nodes)
  File "/kube_aws_autoscaler/main.py", line 89, in get_nodes
    region = node.labels['failure-domain.beta.kubernetes.io/region']
KeyError: 'failure-domain.beta.kubernetes.io/region'
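The KeyError above is raised because the labels mapping is indexed directly with a key that one node does not carry. A minimal sketch of a tolerant lookup follows, assuming the labels behave like a plain dict; `REGION_LABEL` and `region_of` are hypothetical names for illustration and are not taken from the project's code.

```python
# Hypothetical helper: look up a node's region without raising KeyError.
REGION_LABEL = 'failure-domain.beta.kubernetes.io/region'


def region_of(labels, default=None):
    """Return the region label value, or `default` when the label is absent."""
    return labels.get(REGION_LABEL, default)


# Label sets modelled on a worker and the master from this report.
worker_labels = {
    'beta.kubernetes.io/os': 'linux',
    REGION_LABEL: 'us-east-2',
}
master_labels = {
    'beta.kubernetes.io/os': 'linux',
    'node-role.kubernetes.io/master': '',
}

print(region_of(worker_labels))
print(region_of(master_labels))
```

Whether a node without a region should be skipped or treated as an error is a design decision for the autoscaler; the sketch only shows how to avoid the crash.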

sgmiller commented 7 years ago

Here is the output of kubectl describe nodes, with the pods we run redacted:

Name:               ip-10-0-1-161.us-east-2.compute.internal
Roles:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c4.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2a
                    kubernetes.io/hostname=ip-10-0-1-161.us-east-2.compute.internal
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:
CreationTimestamp:  Thu, 12 Oct 2017 19:57:10 -0500
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 19:57:10 -0500  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 19:57:10 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 19:57:10 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 19:58:00 -0500  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:   10.0.1.161
  InternalDNS:  ip-10-0-1-161.us-east-2.compute.internal
  Hostname:     ip-10-0-1-161.us-east-2.compute.internal
Capacity:
  cpu:     4
  memory:  7657020Ki
  pods:    110
Allocatable:
  cpu:     4
  memory:  7554620Ki
  pods:    110
System Info:
  Machine ID:                 ab38f3ee81af4d80a39f295ef8b6f0bc
  System UUID:                EC20538B-0767-592E-3059-42D35EDCF31D
  Boot ID:                    1a0358f6-bd36-48b1-96ae-6342fdbc6fe6
  Kernel Version:             4.8.0-59-generic
  OS Image:                   Ubuntu 16.10
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://Unknown
  Kubelet Version:            v1.7.4
  Kube-Proxy Version:         v1.7.4
ExternalID:                   i-09c40ddb5d1e77f9e
Non-terminated Pods:          (26 in total)
  Namespace    Name                 CPU Requests  CPU Limits  Memory Requests  Memory Limits
  kube-system  kube-proxy-stdgh     0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  logging-agent-3mg9j  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  weave-net-7mr6g      20m (0%)      0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  2520m (63%)   7 (175%)    1725Mi (23%)     4934Mi (66%)
Events:

Name:               ip-10-0-1-4.us-east-2.compute.internal
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=ip-10-0-1-4.us-east-2.compute.internal
                    node-role.kubernetes.io/master=
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:  Thu, 12 Oct 2017 18:30:20 -0500
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Fri, 13 Oct 2017 09:58:21 -0500  Thu, 12 Oct 2017 18:30:16 -0500  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Fri, 13 Oct 2017 09:58:21 -0500  Thu, 12 Oct 2017 18:30:16 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Fri, 13 Oct 2017 09:58:21 -0500  Thu, 12 Oct 2017 18:30:16 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Fri, 13 Oct 2017 09:58:21 -0500  Thu, 12 Oct 2017 18:31:21 -0500  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.0.1.4
  Hostname:    ip-10-0-1-4.us-east-2.compute.internal
Capacity:
  cpu:     2
  memory:  4044280Ki
  pods:    110
Allocatable:
  cpu:     2
  memory:  3941880Ki
  pods:    110
System Info:
  Machine ID:                 6cd3b28d4a1a4d7bbb40a4af3321c603
  System UUID:                EC2200EF-2BEF-75D2-D5C0-9A2C28BDCF2A
  Boot ID:                    d6c494be-c372-4e1d-82e5-cf776277a518
  Kernel Version:             4.8.0-26-generic
  OS Image:                   Ubuntu 16.10
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://Unknown
  Kubelet Version:            v1.7.4
  Kube-Proxy Version:         v1.7.4
ExternalID:                   ip-10-0-1-4.us-east-2.compute.internal
Non-terminated Pods:          (6 in total)
  Namespace    Name                                                            CPU Requests  CPU Limits  Memory Requests  Memory Limits
  kube-system  kube-apiserver-ip-10-0-1-4.us-east-2.compute.internal           250m (12%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-controller-manager-ip-10-0-1-4.us-east-2.compute.internal  200m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-dns-2425271678-g7w06                                       260m (13%)    0 (0%)      110Mi (2%)       170Mi (4%)
  kube-system  kube-proxy-lzjgf                                                0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-scheduler-ip-10-0-1-4.us-east-2.compute.internal           100m (5%)     0 (0%)      0 (0%)           0 (0%)
  kube-system  weave-net-h3lhv                                                 20m (1%)      0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  830m (41%)    0 (0%)      110Mi (2%)       170Mi (4%)
Events:

Name:               ip-10-0-1-63.us-east-2.compute.internal
Roles:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c4.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2a
                    kubernetes.io/hostname=ip-10-0-1-63.us-east-2.compute.internal
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:
CreationTimestamp:  Thu, 12 Oct 2017 18:49:06 -0500
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 18:49:06 -0500  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 18:49:06 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 18:49:06 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Fri, 13 Oct 2017 09:58:24 -0500  Thu, 12 Oct 2017 18:49:56 -0500  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:   10.0.1.63
  InternalDNS:  ip-10-0-1-63.us-east-2.compute.internal
  Hostname:     ip-10-0-1-63.us-east-2.compute.internal
Capacity:
  cpu:     4
  memory:  7657020Ki
  pods:    110
Allocatable:
  cpu:     4
  memory:  7554620Ki
  pods:    110
System Info:
  Machine ID:                 ab38f3ee81af4d80a39f295ef8b6f0bc
  System UUID:                EC2E6FDF-C27F-EFB1-D21D-4BEA62EBB585
  Boot ID:                    54bd8207-997a-4224-8f4b-576f1ed7010b
  Kernel Version:             4.8.0-59-generic
  OS Image:                   Ubuntu 16.10
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://Unknown
  Kubelet Version:            v1.7.4
  Kube-Proxy Version:         v1.7.4
ExternalID:                   i-0d41ca9ba70817b30
Non-terminated Pods:          (20 in total)
  Namespace    Name                                CPU Requests  CPU Limits  Memory Requests  Memory Limits
  kube-system  kube-proxy-684wk                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-state-metrics-764787211-xcf4m  203m (5%)     203m (5%)   136Mi (1%)       136Mi (1%)
  kube-system  logging-agent-3j84r                 0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  stats-babel-2546172365-z2p3p        20m (0%)      100m (2%)   25Mi (0%)        75Mi (1%)
  kube-system  weave-net-9kbkn                     20m (0%)      0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits    Memory Requests  Memory Limits
  3193m (79%)   7803m (195%)  4832489Ki (63%)  9139247Ki (120%)
Events:

Name:               ip-10-0-3-128.us-east-2.compute.internal
Roles:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c4.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2c
                    kubernetes.io/hostname=ip-10-0-3-128.us-east-2.compute.internal
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:
CreationTimestamp:  Thu, 12 Oct 2017 19:57:10 -0500
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Fri, 13 Oct 2017 09:58:23 -0500  Thu, 12 Oct 2017 19:57:10 -0500  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Fri, 13 Oct 2017 09:58:23 -0500  Thu, 12 Oct 2017 19:57:10 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Fri, 13 Oct 2017 09:58:23 -0500  Thu, 12 Oct 2017 19:57:10 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Fri, 13 Oct 2017 09:58:23 -0500  Thu, 12 Oct 2017 19:58:00 -0500  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:   10.0.3.128
  InternalDNS:  ip-10-0-3-128.us-east-2.compute.internal
  Hostname:     ip-10-0-3-128.us-east-2.compute.internal
Capacity:
  cpu:     4
  memory:  7657024Ki
  pods:    110
Allocatable:
  cpu:     4
  memory:  7554624Ki
  pods:    110
System Info:
  Machine ID:                 ab38f3ee81af4d80a39f295ef8b6f0bc
  System UUID:                EC25932E-4503-4AF5-51D7-329538B75525
  Boot ID:                    d799aed6-a622-4361-90d3-98b5c00b3515
  Kernel Version:             4.8.0-59-generic
  OS Image:                   Ubuntu 16.10
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://Unknown
  Kubelet Version:            v1.7.4
  Kube-Proxy Version:         v1.7.4
ExternalID:                   i-03b4aab0184ca9976
Non-terminated Pods:          (25 in total)
  Namespace    Name                 CPU Requests  CPU Limits  Memory Requests  Memory Limits
  kube-system  kube-proxy-0z58f     0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  logging-agent-9n24j  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  weave-net-7glcr      20m (0%)      0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  2320m (57%)   7 (175%)    1650Mi (22%)     4784Mi (64%)
Events:

Name:               ip-10-0-3-87.us-east-2.compute.internal
Roles:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c4.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2c
                    kubernetes.io/hostname=ip-10-0-3-87.us-east-2.compute.internal
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:
CreationTimestamp:  Thu, 12 Oct 2017 18:49:06 -0500
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Fri, 13 Oct 2017 09:58:25 -0500  Thu, 12 Oct 2017 18:49:06 -0500  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Fri, 13 Oct 2017 09:58:25 -0500  Thu, 12 Oct 2017 18:49:06 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Fri, 13 Oct 2017 09:58:25 -0500  Thu, 12 Oct 2017 18:49:06 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Fri, 13 Oct 2017 09:58:25 -0500  Thu, 12 Oct 2017 18:49:56 -0500  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:   10.0.3.87
  InternalDNS:  ip-10-0-3-87.us-east-2.compute.internal
  Hostname:     ip-10-0-3-87.us-east-2.compute.internal
Capacity:
  cpu:     4
  memory:  7657020Ki
  pods:    110
Allocatable:
  cpu:     4
  memory:  7554620Ki
  pods:    110
System Info:
  Machine ID:                 ab38f3ee81af4d80a39f295ef8b6f0bc
  System UUID:                EC209443-6549-2B14-8757-ED2D87CDEF4B
  Boot ID:                    95fb661e-2267-4f64-9b14-c4eb7cd0a73d
  Kernel Version:             4.8.0-59-generic
  OS Image:                   Ubuntu 16.10
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://Unknown
  Kubelet Version:            v1.7.4
  Kube-Proxy Version:         v1.7.4
ExternalID:                   i-027b59c12cc36d317
Non-terminated Pods:          (19 in total)
  Namespace    Name                 CPU Requests  CPU Limits  Memory Requests  Memory Limits
  kube-system  kube-proxy-kz09b     0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  logging-agent-r72w8  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  weave-net-xw7x0      20m (0%)      0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits    Memory Requests  Memory Limits
  3170m (79%)   6750m (168%)  4734185Ki (62%)  8747055Ki (115%)
Events:

hjacobs commented 7 years ago

Hmm, it looks like the label "failure-domain.beta.kubernetes.io/region" is missing on your master node ip-10-0-1-4.us-east-2.compute.internal. Maybe Kubernetes failed to connect to the AWS API? How do you provision your cluster? Maybe the master node does not have appropriate access to AWS. Can you check the kubelet logs?
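To spot such nodes without reading the full describe output, the check can be sketched roughly as below. It assumes a node list parsed from JSON in the usual Kubernetes shape (`items[].metadata.name` and `items[].metadata.labels`, as produced by `kubectl get nodes -o json`); `nodes_missing_label` is a hypothetical helper, and the sample data is a trimmed stand-in built from two nodes in this thread.

```python
# Hypothetical diagnostic: list nodes that lack a given label.
REGION_LABEL = 'failure-domain.beta.kubernetes.io/region'


def nodes_missing_label(node_list, label=REGION_LABEL):
    """Return the names of all nodes in `node_list` that do not carry `label`."""
    missing = []
    for item in node_list.get('items', []):
        metadata = item.get('metadata', {})
        labels = metadata.get('labels') or {}  # labels may be absent or null
        if label not in labels:
            missing.append(metadata.get('name'))
    return missing


# Trimmed sample: one worker with the label, the master without it.
sample = {
    'items': [
        {'metadata': {
            'name': 'ip-10-0-1-161.us-east-2.compute.internal',
            'labels': {REGION_LABEL: 'us-east-2'},
        }},
        {'metadata': {
            'name': 'ip-10-0-1-4.us-east-2.compute.internal',
            'labels': {'node-role.kubernetes.io/master': ''},
        }},
    ]
}

print(nodes_missing_label(sample))
```

An empty result would mean every node carries the label, which matches the state reported once the problem resolved itself.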

sgmiller commented 7 years ago

Strangely enough, it started working properly overnight. When I check the node list, all of them contain that label.
