eksctl-io / eksctl

The official CLI for Amazon EKS
https://eksctl.io

[Bug] vpc.securityGroup validation issue while creating nodegroup #7176

Open hans72118 opened 11 months ago

hans72118 commented 11 months ago

What were you trying to accomplish?

  1. Have eksctl correctly validate the default AWS security group egress rules for both IPv4 and IPv6.
  2. Have an option to allow restricted cluster outbound rules, for EKS clusters that have to meet specific security policies/requirements.

https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html#security-group-restricting-cluster-traffic

Rule type | Protocol | Port | Destination
-- | -- | -- | --
Outbound | TCP | 443 | Cluster security group
Outbound | TCP | 10250 | Cluster security group
Outbound (DNS) | TCP and UDP | 53 | Cluster security group
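
For reference, the three outbound rows above can be written as EC2 `IpPermissions` entries. This is a minimal sketch, not something eksctl does: the helper name `restricted_egress_permissions` and the placeholder security group ID are made up for illustration, and only the outbound rows quoted in this issue are covered.

```python
def restricted_egress_permissions(cluster_sg_id):
    """Build EC2 IpPermissions entries for the outbound rows in the table above.

    Hypothetical helper for illustration only; not part of eksctl.
    """
    def to_cluster_sg(protocol, port):
        # One rule: a single port, destination is the cluster security group.
        return {
            "IpProtocol": protocol,
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": cluster_sg_id}],
        }

    return [
        to_cluster_sg("tcp", 443),    # Kubernetes API server
        to_cluster_sg("tcp", 10250),  # kubelet
        to_cluster_sg("tcp", 53),     # DNS over TCP
        to_cluster_sg("udp", 53),     # DNS over UDP
    ]
```

The resulting list has the shape expected by the `IpPermissions` parameter of boto3's `authorize_security_group_egress`, so it could be applied to a security group that has had its default allow-all egress rules removed.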

What happened?

Related to: https://github.com/eksctl-io/eksctl/issues/6455 https://github.com/eksctl-io/eksctl/pull/7030

Since eksctl version 0.157.0, the security group given in vpc.securityGroup seems to be validated against a single default IPv4 egress rule (All traffic to 0.0.0.0/0). Since a security group created in AWS has, by default, egress rules for both IPv6 (::/0) and IPv4 (0.0.0.0/0), we experienced the following error:

❯ eksctl create nodegroup -f Nodegroup.yaml --dry-run
Error: vpc.securityGroup (sg-009c6a55c3937abcd) has egress rules that were not attached by eksctl; vpc.securityGroup should not contain any non-default external egress rules on a cluster not created by eksctl (rule ID: sgr-02524e9e33210abcd)

Where the egress rules are:

sg-009c6a55c3937abcd - Outbound rules (2)
---------------------------------------------------------
– sgr-02524e9e33210abcd IPv6    All traffic All All ::/0    –
– sgr-043a6fe0e104aabcd IPv4    All traffic All All 0.0.0.0/0   –
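
To make the suspected behaviour concrete, here is a rough sketch of the check as I understand it from the error message. It is not eksctl's actual code (eksctl is written in Go); the function names and the `accept_ipv6` switch are hypothetical. The rule dictionaries use field names from the EC2 `DescribeSecurityGroupRules` response.

```python
def is_default_egress(rule, accept_ipv6=True):
    """Return True if the rule is an allow-all egress rule to 0.0.0.0/0 (or ::/0)."""
    if not rule.get("IsEgress") or rule.get("IpProtocol") != "-1":
        return False
    if rule.get("CidrIpv4") == "0.0.0.0/0":
        return True
    # An IPv4-only check stops here and treats ::/0 as a non-default rule.
    return accept_ipv6 and rule.get("CidrIpv6") == "::/0"


def non_default_egress_rule_ids(rules, accept_ipv6=True):
    """List the IDs of egress rules that a validator would reject."""
    return [
        r["SecurityGroupRuleId"]
        for r in rules
        if r.get("IsEgress") and not is_default_egress(r, accept_ipv6)
    ]


# The two outbound rules of sg-009c6a55c3937abcd listed above.
RULES = [
    {"SecurityGroupRuleId": "sgr-02524e9e33210abcd", "IsEgress": True,
     "IpProtocol": "-1", "FromPort": -1, "ToPort": -1, "CidrIpv6": "::/0"},
    {"SecurityGroupRuleId": "sgr-043a6fe0e104aabcd", "IsEgress": True,
     "IpProtocol": "-1", "FromPort": -1, "ToPort": -1, "CidrIpv4": "0.0.0.0/0"},
]

ipv4_only = non_default_egress_rule_ids(RULES, accept_ipv6=False)
both = non_default_egress_rule_ids(RULES, accept_ipv6=True)
```

With `accept_ipv6=False` the only rejected rule is the IPv6 one, sgr-02524e9e33210abcd, which is the rule ID reported in the error. Treating ::/0 as default as well leaves nothing to reject.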

How to reproduce it?

Set vpc.securityGroup to a security group that has the default AWS egress rules shown below, then create a nodegroup.

sg-009c6a55c3937abcd - Outbound rules (2)
---------------------------------------------------------
– sgr-02524e9e33210abcd IPv6    All traffic All All ::/0    –
– sgr-043a6fe0e104aabcd IPv4    All traffic All All 0.0.0.0/0   –

Nodegroup.yaml

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: LAB-EKS-28
  region: ap-northeast-1
  version: "1.28"

vpc:
  id: "vpc-eaabcdef"
  cidr: "172.31.0.0/16"
  securityGroup: "sg-009c6a55c3937abcd"  ## Additional SG
  subnets:
    public:
      public1:
          id: "subnet-12abcdef"
          az: ap-northeast-1c
      public2:
          id: "subnet-a7abcdef"
          az: ap-northeast-1a
    private:
      private1:
          id: "subnet-00b83dc8b30abcdef"
          az: ap-northeast-1c
      private2:
          id: "subnet-0dd2b34ddd1abcdef"
          az: ap-northeast-1a

managedNodeGroups:
  - name: TEST-28
    instanceType: c6a.large
    desiredCapacity: 2
    minSize: 0
    maxSize: 10
    securityGroups:
      withLocal: false
    ssh:
      allow: true
      publicKeyName: Testing

Logs

Anything else we need to know?

Versions

❯ eksctl version
0.161.0

matthenry87 commented 11 months ago

I'm trying to troubleshoot a squid proxy that I'm using for my Terraform-created EKS cluster. I want to try and add a nodegroup using eksctl because that's worked for me in other VPCs to get a node group that uses squid, but this validation is blocking me.

We should be able to have whatever rules we want in the outbound rules. This validation is a bit of an over-reach imo.

cPu1 commented 11 months ago

@matthenry87, we are planning to relax the validation but need some time to give it more thought. The team is occupied with other major deliverables at the moment. If this is a blocker for you, I'd recommend downgrading to an older version in the meantime.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 9 months ago

This issue was closed because it has been stalled for 5 days with no activity.

yws-ss commented 5 months ago

Hello Team,

Is this improvement on this year's roadmap?

sschamp commented 4 months ago

Please fix this. Somehow it deleted the outbound rules from my SG!! It took some time to figure out why my entire Dev cluster was dead. I had to manually re-add the outbound rules for All Traffic on IPv4 and IPv6.

ok512 commented 4 months ago

Looks like it's NOT fixed in 0.179.0. The workaround is:

  1. drop outbound IPv6 rule
  2. create node group
  3. add the rule back
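
The three steps could be scripted roughly as below. This is a sketch under assumptions: `ec2` is expected to be a boto3 EC2 client, the function name and the injectable `run` parameter are hypothetical, and the security group loses its IPv6 egress while eksctl runs, so anything depending on it is affected for that window.

```python
import subprocess

# Allow-all IPv6 egress, i.e. the "All traffic ::/0" outbound rule.
IPV6_ALL_EGRESS = [{"IpProtocol": "-1", "Ipv6Ranges": [{"CidrIpv6": "::/0"}]}]


def create_nodegroup_without_ipv6_egress(ec2, group_id, config_file,
                                         run=subprocess.run):
    """Drop the IPv6 egress rule, run eksctl, then add the rule back.

    ec2 is assumed to be a boto3 EC2 client; run is injectable for testing.
    """
    # 1. drop outbound IPv6 rule
    ec2.revoke_security_group_egress(GroupId=group_id,
                                     IpPermissions=IPV6_ALL_EGRESS)
    try:
        # 2. create node group
        run(["eksctl", "create", "nodegroup", "-f", config_file], check=True)
    finally:
        # 3. add the rule back, even if eksctl failed
        ec2.authorize_security_group_egress(GroupId=group_id,
                                            IpPermissions=IPV6_ALL_EGRESS)
```

With boto3 installed, a call might look like `create_nodegroup_without_ipv6_egress(boto3.client("ec2", region_name="ap-northeast-1"), "sg-009c6a55c3937abcd", "Nodegroup.yaml")`. The `finally` block is there so a failed eksctl run does not leave the rule missing.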

TiberiuGC commented 2 months ago

This issue has been scoped down and only applies to self-managed nodegroups now. The long term plan might involve adding the SG rules directly via API, instead of using CFN. More context - https://github.com/eksctl-io/eksctl/issues/6455#issuecomment-1697275161

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.