aws-samples / eks-workshop

AWS Workshop for Learning EKS
https://eksworkshop.com
MIT No Attribution

K8S Security Group tagging issue - cannot provision Amazon Elastic Load Balancer #534

Closed blackdog0403 closed 1 year ago

blackdog0403 commented 4 years ago

Region: oregon us-west-2

I created a cluster with eksctl around 1 PM 11/19/2019 (GMT+9, Seoul) using the commands below, following the instructions at https://eksworkshop.com/eksctl/launcheks/

$ eksctl version
[i] version.Info{BuiltAt:"", GitCommit:"", GitTag:"0.10.1"}

$ eksctl create cluster --name=eksworkshop-eksctl --nodes=3 --alb-ingress-access --region=${AWS_REGION}

Everything looked fine, but when I tried to create a 'LoadBalancer'-type Service with the command below, I ran into a problem.

$ kubectl create -f kubernetes/service.yaml

Then I got this error below

service-controller Error creating load balancer (will retry): failed to ensure load balancer for service default/ecsdemo-frontend: Multiple tagged security groups found for instance i-06feb3a23b2c4f021; ensure only the k8s security group is tagged; the tagged groups were sg-062acb768a48b2f5f(eksctl-eksworkshop-eksctl-nodegroup-ng-1123f59d-SG-16Q4workshop: sg-0a7f0388ff0382c1b(eks-cluster-sg-eksworkshop-eksctl-1769733817)

The Service could not create an ELB; its external IP stayed in pending status.
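For context, kubernetes/service.yaml here is a LoadBalancer-type Service. A minimal sketch of what such a manifest might look like (the real workshop manifest may differ; the port and selector values are assumptions):

```shell
# Write a minimal LoadBalancer-type Service manifest; port and selector
# values are illustrative, not the workshop's exact manifest.
cat > service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: ecsdemo-frontend
spec:
  type: LoadBalancer      # asks the cloud provider to provision an ELB
  selector:
    app: ecsdemo-frontend
  ports:
    - port: 80
      targetPort: 3000
EOF
# Then: kubectl create -f service.yaml
```

With `type: LoadBalancer`, the in-tree cloud provider is responsible for provisioning the ELB, which is the step that fails here.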

So I deleted the tag on sg-0a7f0388ff0382c1b (eks-cluster-sg-eksworkshop-eksctl-1769733817) myself, and after recreating the Service it created the ELB.
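For anyone else hitting this, the manual workaround above can also be done with the AWS CLI instead of the console. A sketch, using the SG ID and cluster name from this report (shown as a dry run via a leading echo; remove it to actually delete the tag):

```shell
# Remove the extra kubernetes.io/cluster ownership tag from the cluster
# security group; SG ID and cluster name are taken from the report above.
SG_ID="sg-0a7f0388ff0382c1b"
TAG_KEY="kubernetes.io/cluster/eksworkshop-eksctl"

# Dry run: drop the leading `echo` to actually delete the tag.
echo aws ec2 delete-tags --resources "$SG_ID" --tags "Key=$TAG_KEY"
```

Deleting by key removes the tag regardless of its value; note that eksctl/CloudFormation may reapply it on a later stack update.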

The tags were like this:

Name: eks-cluster-sg-eksworkshop-eksctl-1769733817 
kubernetes.io/cluster/eksworkshop-eksctl : owned

It looks like some default setup has changed in eksctl. I built another cluster with eksctl version.Info{BuiltAt:"", GitCommit:"", GitTag:"0.8.0"} and the tag looks like this:

kubernetes.io/cluster/eksworkshop-eksctl : owned

Do I have to report this issue to eksctl?

carlosafonso commented 4 years ago

Can confirm, I came across this issue as well.

Edit: as @blackdog0403 mentioned, this might be related to the recent eksctl release 0.10.0.

matwerber1 commented 4 years ago

+1, same issue. Given that one of the security groups has "nodegroup" in the name, I believe this is related to yesterday's launch of EKS managed node groups: https://aws.amazon.com/blogs/containers/eks-managed-node-groups/

I'm new to K8s / EKS, where / how can we edit the security group assignments?

brentley commented 4 years ago

Thanks. I've pushed a change to pin eksctl to 0.9.0 until the problem is fixed upstream.

rtripat commented 4 years ago

@brentley Looks like this is broken with the latest version of eksctl and "unmanaged" nodes. It works fine with EKS managed nodes, since eksctl associates exactly one security group (clusterSecurityGroupId) with each instance. I created an eksctl bug to get this fixed upstream.

$ eksctl create cluster --name=eksworkshop-eksctl --nodes=3 --alb-ingress-access --managed=true

$ k get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-30-64.us-west-2.compute.internal   Ready    <none>   67m   v1.14.7-eks-1861c5
ip-192-168-48-5.us-west-2.compute.internal    Ready    <none>   67m   v1.14.7-eks-1861c5
ip-192-168-68-40.us-west-2.compute.internal   Ready    <none>   67m   v1.14.7-eks-1861c5

$ aws ec2 describe-instances --filters Name=private-dns-name,Values=ip-192-168-30-64.us-west-2.compute.internal --query 'Reservations[*].Instances[*].[SecurityGroups]' --output text
sg-021c6b52ad0ec8f75    eks-cluster-sg-eksworkshop-eksctl-1769733817

$ eksctl create cluster --name=eksworkshop-eksctl-unmanaged --nodes=3 --alb-ingress-access

$ k get nodes
NAME                                           STATUS   ROLES    AGE   VERSION
ip-192-168-3-116.us-west-2.compute.internal    Ready    <none>   16m   v1.14.7-eks-1861c5
ip-192-168-55-142.us-west-2.compute.internal   Ready    <none>   16m   v1.14.7-eks-1861c5
ip-192-168-66-144.us-west-2.compute.internal   Ready    <none>   16m   v1.14.7-eks-1861c5

$ aws ec2 describe-instances --filters Name=private-dns-name,Values=ip-192-168-3-116.us-west-2.compute.internal --query 'Reservations[*].Instances[*].[SecurityGroups]' --output text
sg-040c1508847c1bc69    eksctl-eksworkshop-eksctl-unmanaged-nodegroup-ng-08b217d0-SG-1W88EBXXP7DY9
sg-02e5792d2a2b628a9    eksctl-eksworkshop-eksctl-unmanaged-cluster-ClusterSharedNodeSecurityGroup-47J7MCHQUO89
sg-0a1244396e483e71f    eks-cluster-sg-eksworkshop-eksctl-unmanaged-94484000
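A quick way to audit this state is to list every security group that carries the cluster ownership tag; for a LoadBalancer Service to work, each node should have exactly one such group attached. A sketch (cluster name from this thread; the command is echoed as a dry run):

```shell
# List all security groups tagged as owned by the cluster; more than one
# attached to the same instance triggers the "Multiple tagged security
# groups" error above. Dry run: drop the `echo` to execute.
CLUSTER="eksworkshop-eksctl"
echo aws ec2 describe-security-groups \
  --filters "Name=tag:kubernetes.io/cluster/$CLUSTER,Values=owned" \
  --query "SecurityGroups[].[GroupId,GroupName]" \
  --output text
```
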

montanaflynn commented 4 years ago

I'm hitting this as well.

77s         Warning   CreatingLoadBalancerFailed   Service               Error creating load balancer (will retry): failed to ensure load balancer for service default/REDACTED: Multiple tagged security groups found for instance i-REDACTED; ensure only the k8s security group is tagged; the tagged groups were sg-REDACTED(instance-REDACTED-default-REDACTED-REDACTED) sg-REDACTED(instance-REDACTED-default-REDACTED-REDACTED) sg-REDACTED(eksctl-production-nodegroup-REDACTED-m5a-xlarge-ng-SG-REDACTED) sg-REDACTED(instance-REDACTED-default-ingress-REDACTED)

Any way to fix this with an existing cluster?

fmedery commented 4 years ago

Hello @montanaflynn, do you still have the problem with the latest version of eksctl?

montanaflynn commented 4 years ago

@fmedery how do you mean? I have an existing cluster, created with an older version of eksctl, that is facing the issue. Is there any way to fix the existing cluster with the new eksctl version? I was thinking of creating a new nodegroup and removing the old one, but I'm not sure whether that would fix it. I would really rather not create an entirely new cluster.

joshkurz commented 4 years ago

Running into this same issue with:

eksctl version
[ℹ]  version.Info{BuiltAt:"", GitCommit:"", GitTag:"0.13.0"}

The security group created for the unmanaged node group has the owned tag on it, which causes k8s to refuse to create an LB.
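The refusal comes from a sanity check in the in-tree AWS cloud provider: among a node's security groups, exactly one may carry the cluster ownership tag. A simplified sketch of that check (not the actual controller code; the SG IDs are the ones from the original error message):

```shell
# Simplified sketch of the provider's check (not the real controller code):
# a node must have exactly one security group with the ownership tag.
TAG="kubernetes.io/cluster/eksworkshop-eksctl"
# Security groups and tags as reported in the original error message:
node_sgs="sg-062acb768a48b2f5f ${TAG}=owned
sg-0a7f0388ff0382c1b ${TAG}=owned"

tagged=$(printf '%s\n' "$node_sgs" | grep -c "${TAG}=owned")
if [ "$tagged" -ne 1 ]; then
  echo "error: $tagged security groups tagged ${TAG}=owned; expected exactly 1"
fi
```

This is why removing the tag from all but one group (as done earlier in the thread) unblocks ELB creation.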

Also, I can't seem to edit the tags via the console: AWS won't let me save them because of the CloudFormation-managed tags that already exist.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 14 days since being marked as stale.