bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

Bottlerocket fails to find DNS name in private subnet #3064

Open robd003 opened 1 year ago

robd003 commented 1 year ago

Image I'm using: Bottlerocket OS 1.13.4 (aws-k8s-1.26)

What I expected to happen: The node should boot and join the cluster. The AWS instance is attached to two subnets, a public subnet and a private subnet (without a NAT gateway)

What actually happened:

[   36.677435] sundog[1152]: Setting generator 'pluto private-dns-name' failed with exit code 1 - stderr: Error describing instance 'i-0c59419f023b394ba': dispatch failure: timeout: error trying to connect: HTTP connect timeout occurred after 3.1s: HTTP connect timeout occurred after 3.1s: timed out (DispatchFailure(DispatchFailure { source: ConnectorError { kind: Timeout, source: hyper::Error(Connect, HttpTimeoutError { kind: "HTTP connect", duration: 3.1s }) } }))
[FAILED] Failed to start User-specified setting generators.
See 'systemctl status sundog.service' for details.

How to reproduce the problem: Try to launch an EKS cluster with Bottlerocket for EKS 1.26 with two subnets, one public and one private.

DHCP options for the subnets:

(screenshot: dhcp-options)

Example eksctl config:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: why-wont-this-work
  region: us-east-2
  version: '1.26'

vpc:
  clusterEndpoints:
    publicAccess:  true
    privateAccess: true
  id: "vpc-caf3bab3"
  subnets:
    public:
      pub-us-east-2a:
        id: "subnet-0000111"
    private:
      priv-us-east-2a:
        id: "subnet-0000222"

iam:
  withOIDC: true

nodeGroups:
  - name: east2a
    instanceType: m7g.xlarge
    amiFamily: Bottlerocket
    privateNetworking: true
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
      withAddonPolicies:
        autoScaler: true
        cloudWatch: false
        externalDNS: true
    maxSize: 4
    subnets:
      - pub-us-east-2a
      - priv-us-east-2a
    labels:
      env: prod
    desiredCapacity: 1

Full boot log: https://gist.github.com/robd003/7f05a5f76bf241f047a99ab3135f6a03
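For context on the failure above: the 'pluto private-dns-name' setting generator appears to be describing the instance through the regional EC2 API, which is unreachable from a private subnet that has neither a NAT gateway nor an EC2 interface endpoint. A rough manual equivalent of that call, sketched with the AWS CLI and a placeholder instance ID:

# Roughly the call that pluto needs to succeed: DescribeInstances against
# the regional EC2 API. <instance-id> is a placeholder. From a subnet with
# no route to the EC2 API (no NAT gateway and no com.amazonaws.<region>.ec2
# interface endpoint) this times out, matching the sundog error above.
aws ec2 describe-instances \
  --instance-ids <instance-id> \
  --region us-east-2 \
  --query 'Reservations[0].Instances[0].PrivateDnsName' \
  --output text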

jpmcb commented 1 year ago

Hi @robd003 - thanks for opening this issue. I'm creating a cluster now and attempting to reproduce your issue.

Can you share which version of eksctl you have?

I'm using:

❯ eksctl version
0.138.0-rc.0

I notice in your config that your nodegroup has privateNetworking enabled (set to true), but your subnet configuration is:

    subnets:
      - pub-us-east-2a
      - priv-us-east-2a

which makes me think that your Bottlerocket node is failing to reach that public endpoint since you have only private networking enabled.

From the eksctl docs:

When placing nodegroups inside a private subnet, privateNetworking must be set to true on the nodegroup

So I'm unsure whether that is supported. This example makes me think that you might need a separate nodegroup to enable the public access.

I'll be back in a few minutes with some results!

jpmcb commented 1 year ago

I'm getting the following error when attempting to use 1 public and 1 private subnet from your example:

❯ eksctl create cluster -f 3064-repro.yaml
2023-05-01 16:44:43 [ℹ]  eksctl version 0.138.0-rc.0
2023-05-01 16:44:43 [ℹ]  using region us-west-2
2023-05-01 16:44:43 [✖]  unable to use given VPC (vpc-xxx) and subnets (private:map[private-subnet:{subnet-xxx us-west-2a 192.168.128.0/19 0 }] public:map[public-subnet:{subnet-xxx us-west-2a 192.168.0.0/19 0 }])
Error: insufficient number of subnets, at least 2x public and/or 2x private subnets are required

Have you been able to reproduce this with the given cluster config? I'll keep trying with another private subnet in the vpc field

robd003 commented 1 year ago

I'm using eksctl 0.139.0

I was unable to add the private subnet unless I had privateNetworking set to true for the nodegroup.

The part that confused me is that the EKS cluster has both private and public access, so you would think that the nodes would be able to connect regardless.

The main issue I was seeing was that Bottlerocket was unable to get its private DNS name via pluto autodiscovery. Did you also see that error in the "Get Console Log" on the node instances?

robd003 commented 1 year ago

I'm getting the following error when attempting to use 1 public and 1 private subnet from your example:

❯ eksctl create cluster -f 3064-repro.yaml
2023-05-01 16:44:43 [ℹ]  eksctl version 0.138.0-rc.0
2023-05-01 16:44:43 [ℹ]  using region us-west-2
2023-05-01 16:44:43 [✖]  unable to use given VPC (vpc-xxx) and subnets (private:map[private-subnet:{subnet-xxx us-west-2a 192.168.128.0/19 0 }] public:map[public-subnet:{subnet-xxx us-west-2a 192.168.0.0/19 0 }])
Error: insufficient number of subnets, at least 2x public and/or 2x private subnets are required

Have you been able to reproduce this with the given cluster config? I'll keep trying with another private subnet in the vpc field

I'm using an existing VPC and subnets, so I get past that part of eksctl.

In the example I pasted I just cut it down to a single AZ for the sake of brevity. Try defining 2+ AZs and it should work.

jpmcb commented 1 year ago

I was able to reproduce the issue:

Here's my eksctl cluster config:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: 3064-repro
  region: us-west-2
  version: '1.26'

vpc:
  clusterEndpoints:
    publicAccess:  true
    privateAccess: true
  id: "vpc-xxx"
  subnets:
    public:
      public-subnet:
        id: "subnet-xxx"   # In AZ us-west-2a with CIDR - 192.168.0.0/19
                           # and has internet gateway / route table attached
      another-public-subnet:
        id: "subnet-xxx"   # In AZ us-west-2b with CIRD - 192.168.64.0/19
                           # and has internet gateway / route table attached
    private:
      private-subnet:
        id: "subnet-xxx"   # In AZ us-west-2a with CIDR - 192.168.128.0/19
      another-private-subnet:
        id: "subnet-xxx"   # In AZ us-west-2b with CIDR - 192.168.64.0/19

iam:
  withOIDC: true

nodeGroups:
  - name: test-nodegroup-3064
    instanceType: m7g.xlarge
    amiFamily: Bottlerocket
    privateNetworking: true
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
      withAddonPolicies:
        autoScaler: true
        cloudWatch: false
        externalDNS: true
    maxSize: 4
    subnets:
      - public-subnet
      - private-subnet
    labels:
      env: prod
    desiredCapacity: 1

I had to provision and specify 2 existing private and 2 existing public subnets to get eksctl to use my custom, existing VPC.

Here's the run of eksctl:

❯ eksctl create cluster -f 3064-repro.yaml
2023-05-01 17:00:06 [ℹ]  eksctl version 0.138.0-rc.0
2023-05-01 17:00:06 [ℹ]  using region us-west-2
2023-05-01 17:00:07 [✔]  using existing VPC (vpc-xxx) and subnets (private:map[another-private-subnet:{subnet-xxx us-west-2b 192.168.192.0/19 0 } private-subnet:{subnet-xxx us-west-2a 192.168.128.0/19 0 }] public:map[another-public-subnet:{subnet-xxx us-west-2b 192.168.64.0/19 0 } public-subnet:{subnet-xxx us-west-2a 192.168.0.0/19 0 }])
2023-05-01 17:00:07 [!]  custom VPC/subnets will be used; if resulting cluster doesn't function as expected, make sure to review the configuration of VPC/subnets
2023-05-01 17:00:07 [ℹ]  nodegroup "test-nodegroup-3064" will use "ami-03afaac8605e281d8" [Bottlerocket/1.26]
2023-05-01 17:00:07 [ℹ]  using Kubernetes version 1.26
2023-05-01 17:00:07 [ℹ]  creating EKS cluster "3064-repro" in "us-west-2" region with un-managed nodes
2023-05-01 17:00:07 [ℹ]  1 nodegroup (test-nodegroup-3064) was included (based on the include/exclude rules)
2023-05-01 17:00:07 [ℹ]  will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
2023-05-01 17:00:07 [ℹ]  will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
2023-05-01 17:00:07 [ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=3064-repro'
2023-05-01 17:00:07 [ℹ]  Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "3064-repro" in "us-west-2"
2023-05-01 17:00:07 [ℹ]  CloudWatch logging will not be enabled for cluster "3064-repro" in "us-west-2"
2023-05-01 17:00:07 [ℹ]  you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=us-west-2 --cluster=3064-repro'
2023-05-01 17:00:07 [ℹ]
2 sequential tasks: { create cluster control plane "3064-repro",
    2 sequential sub-tasks: {
        4 sequential sub-tasks: {
            wait for control plane to become ready,
            associate IAM OIDC provider,
            2 sequential sub-tasks: {
                create IAM role for serviceaccount "kube-system/aws-node",
                create serviceaccount "kube-system/aws-node",
            },
            restart daemonset "kube-system/aws-node",
        },
        create nodegroup "test-nodegroup-3064",
    }
}
2023-05-01 17:00:07 [ℹ]  building cluster stack "eksctl-3064-repro-cluster"
2023-05-01 17:00:08 [ℹ]  deploying stack "eksctl-3064-repro-cluster"
2023-05-01 17:00:38 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:01:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:02:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:03:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:04:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:05:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:06:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:07:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:08:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:09:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:10:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:11:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:12:08 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-cluster"
2023-05-01 17:14:09 [ℹ]  building iamserviceaccount stack "eksctl-3064-repro-addon-iamserviceaccount-kube-system-aws-node"
2023-05-01 17:14:09 [ℹ]  deploying stack "eksctl-3064-repro-addon-iamserviceaccount-kube-system-aws-node"
2023-05-01 17:14:09 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-addon-iamserviceaccount-kube-system-aws-node"
2023-05-01 17:14:39 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-addon-iamserviceaccount-kube-system-aws-node"
2023-05-01 17:14:39 [ℹ]  serviceaccount "kube-system/aws-node" already exists
2023-05-01 17:14:39 [ℹ]  updated serviceaccount "kube-system/aws-node"
2023-05-01 17:14:40 [ℹ]  daemonset "kube-system/aws-node" restarted
2023-05-01 17:14:40 [ℹ]  building nodegroup stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:14:40 [ℹ]  --nodes-min=1 was set automatically for nodegroup test-nodegroup-3064
2023-05-01 17:14:40 [!]  public subnet public-subnet is being used with `privateNetworking` enabled, please ensure this is the desired behaviour
2023-05-01 17:14:40 [ℹ]  deploying stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:14:40 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:15:10 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:15:48 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:17:40 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:19:16 [ℹ]  waiting for CloudFormation stack "eksctl-3064-repro-nodegroup-test-nodegroup-3064"
2023-05-01 17:19:16 [ℹ]  waiting for the control plane to become ready
2023-05-01 17:19:16 [!]  failed to determine authenticator version, leaving API version as default v1alpha1: failed to parse versions: unable to parse first version "unversioned": Invalid character(s) found in major number "unversioned"
2023-05-01 17:19:16 [✔]  saved kubeconfig as "/home/ubuntu/.kube/config"
2023-05-01 17:19:16 [ℹ]  no tasks
2023-05-01 17:19:16 [✔]  all EKS cluster resources for "3064-repro" have been created
2023-05-01 17:19:16 [ℹ]  adding identity "arn:aws:iam::994959692891:role/eksctl-3064-repro-nodegroup-test-NodeInstanceRole-3QFDL9P4KWS" to auth ConfigMap
2023-05-01 17:19:16 [ℹ]  nodegroup "test-nodegroup-3064" has 0 node(s)
2023-05-01 17:19:16 [ℹ]  waiting for at least 1 node(s) to become ready in "test-nodegroup-3064"

It just hangs waiting for the node to come up.

And I see the following failure in the bootlog:

[   27.562077] sundog[1145]: Setting generator 'pluto private-dns-name' failed with exit code 1 - stderr: Error describing instance 'i-0b01925b70255a154': dispatch failure: timeout: error trying to connect: HTTP connect timeout occurred after 3.1s: HTTP connect timeout occurred after 3.1s: timed out (DispatchFailure(DispatchFailure { source: ConnectorError { kind: Timeout, source: hyper::Error(Connect, HttpTimeoutError { kind: "HTTP connect", duration: 3.1s }) } }))
[FAILED] Failed to start User-specified setting generators.
See 'systemctl status sundog.service' for details.
[DEPEND] Dependency failed for Applies settings to create config files.
[DEPEND] Dependency failed for Send signal to CloudFormation Stack.
[DEPEND] Dependency failed for Bottlerocket initial configuration complete.
[DEPEND] Dependency failed for Isolates configured.target.
[DEPEND] Dependency failed for Sets the hostname.

But this warning log from eksctl gives me pause:

2023-05-01 17:14:40 [!]  public subnet public-subnet is being used with `privateNetworking` enabled, please ensure this is the desired behaviour

I also see you opened https://github.com/weaveworks/eksctl/issues/6563. I'm not sure what this privateNetworking setting is doing, so we should also confirm any assumptions about that setting with the eksctl team.

jpmcb commented 1 year ago

Also going to attempt to reproduce this with a 1.25 cluster since we moved to an in-tree cloud provider for 1.26 and this may be related to that.

jpmcb commented 1 year ago

1.25 with this configuration fails to bring up the kubelet:

         Starting Kubelet...
[  OK  ] Finished Isolates multi-user.target.
[  OK  ] Finished Send boot success.
[FAILED] Failed to start Kubelet.
See 'systemctl status kubelet.service' for details.
[  OK  ] Reached target Multi-User System.
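
If the node can be reached at all (for example through the admin container over SSM or SSH), the kubelet failure can be inspected from a root shell on the host. A sketch, assuming the admin container actually started, which later comments show may itself be blocked in this setup:

# From the admin container, drop into a root shell on the host, then
# inspect the kubelet unit. These are interactive steps, not a script;
# they assume the admin container image could be pulled.
sudo sheltie
systemctl status kubelet.service
journalctl -u kubelet.service --no-pager | tail -n 50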
robd003 commented 1 year ago

@jpmcb Are you able to get the logs to see why it failed?

jpmcb commented 1 year ago

I'm having trouble getting the ssm agent to connect - I've added the necessary endpoints in the private subnet but I'm wondering if the bottlerocket network configuration selects the wrong interface and gets a DHCP lease that can't hit those endpoints.

Any thoughts @zmrow or @yeazelm ?

svyatoslavmo commented 1 year ago

@jpmcb Hi! Just found this issue from AWS support case. Is there any ETA for resolving this?

jpmcb commented 1 year ago

Hi @svyatoslavmo - thanks for the update. Can you provide some more detail on what you're attempting to do with a private and public subnet?

This use case is somewhat unusual, since a private subnet with no gateway will never be able to pull down images from ECR (or another image registry). This is especially relevant for the admin and control containers, which are required to perform debugging operations (and this is what makes retrieving the kubelet failure logs difficult).

For example, with both the private and public subnets attached to the node group (and privateNetworking set), I was able to pull these logs:

[  776.392160] host-ctr[1277]: time="2023-05-02T16:23:56Z" level=error msg="retries exhausted: failed to resolve reference \"ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.10.0\": RequestError: send request failed\ncaused by: Post \"https://api.ecr.us-west-2.amazonaws.com/\": dial tcp 52.119.173.252:443: i/o timeout" ref="ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.10.0"
[  776.403281] host-ctr[1277]: time="2023-05-02T16:23:56Z" level=fatal msg="retries exhausted: failed to resolve reference \"ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.10.0\": RequestError: send request failed\ncaused by: Post \"https://api.ecr.us-west-2.amazonaws.com/\": dial tcp 52.119.173.252:443: i/o timeout"
[  779.398958] host-ctr[1278]: time="2023-05-02T16:23:59Z" level=error msg="retries exhausted: failed to resolve reference \"ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-control:v0.7.1\": RequestError: send request failed\ncaused by: Post \"https://api.ecr.us-west-2.amazonaws.com/\": dial tcp 52.119.173.252:443: i/o timeout" ref="ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-control:v0.7.1"

Does this work on different node operating systems?

The more typical use case I've seen is one where a Kubernetes cluster has separate node groups: a subset is attached to the wider internet, and other groups are segmented away. I'm not sure about the case where a single group has both private and public subnets.

jpmcb commented 1 year ago

I thought maybe there's something eksctl is doing with the privateNetworking = true key. So I used the following:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: private-networking
  region: us-west-2
  version: '1.25'

iam:
  withOIDC: true

nodeGroups:
  - name: test-nodegroup
    instanceType: m7g.xlarge
    amiFamily: Bottlerocket
    privateNetworking: true
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
        - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
    desiredCapacity: 1
    ssh:
      allow: true
      publicKeyName: abc123

which only attaches to the created private subnets: everything comes up fine, including when pulling container images and starting the kubelet. So it must be a symptom of attaching both the private and public subnets.
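
A quick way to see what a given subnet can actually reach is to inspect its route table. A sketch with a placeholder subnet ID; a route to a NAT gateway or internet gateway, or the lack of one, explains whether the EC2 and ECR APIs are reachable without VPC endpoints:

# List the routes of the route table associated with a subnet.
# subnet-xxx is a placeholder. A route with a NatGatewayId or an igw-
# GatewayId means the public AWS APIs are reachable; if neither is
# present, only VPC endpoints can provide that access. (Subnets using
# the main route table need the filter association.main=true instead.)
aws ec2 describe-route-tables \
  --filters "Name=association.subnet-id,Values=subnet-xxx" \
  --query 'RouteTables[].Routes[].[DestinationCidrBlock,GatewayId,NatGatewayId]' \
  --output table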

TiberiuGC commented 1 year ago

privateNetworking usually behaves in the following manner:

(1) - when the subnets are not pre-existing and user-defined, but rather left unspecified in the config file and hence created by eksctl. Here, for privateNetworking: true, eksctl will only assign private subnets to the nodegroup (and the reverse also applies).

(2) - when the subnets already exist and are defined by the user, privateNetworking is only used for some validation purposes; it falls under the user's responsibility to assign only private subnets to that nodegroup in order to achieve the desired behaviour (hence the warning: 2023-05-01 17:14:40 [!] public subnet public-subnet is being used with privateNetworking enabled, please ensure this is the desired behaviour).

@jpmcb your use case falls under scenario (1). The important aspect here is not that you didn't specify public subnets, but rather that the private subnets are eksctl-created, which means a NAT gateway is also created and attached to those subnets.

@robd003 your use case falls under scenario (2). I think the issue was highlighted in one of the above messages - a private subnet with no gateway will never be able to pull down images from ECR (or another image registry). If one of your worker nodes is deployed within the private subnet, you'll run into this problem. I think it's possible to continue specifying both public and private subnets for your nodegroup, as long as the private one has a gateway configured. If you do continue specifying both, you can drop the privateNetworking: true flag, since you are not really achieving private networking for the entire nodegroup anyway.

jpmcb commented 1 year ago

@robd003 your use case falls under scenario (2). I think the issue was highlighted in one of the above messages - a private subnet with no gateway will never be able to pull down images from ECR (or another image registry). If one of your worker nodes is deployed within the private subnet, you'll run into this problem.

Thanks @TiberiuGC so much for the insight! I thought the solution here should be to specify VPC endpoints, which keep us from having to give the subnet a NAT or internet gateway. In my test environment, I gave my private subnets VPC endpoints to ECR and SSM, made the security groups allow all traffic, but still was not able to get my nodes to hit ECR to pull down the images.

I thought it might be something weird with my subnet, but even after attaching an internet gateway to the subnet, a test Ubuntu image deploys and works fine while the Bottlerocket node still can't come up. Any thoughts?
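
For what it's worth, AWS documents that pulling images from ECR through endpoints needs com.amazonaws.<region>.ecr.api, com.amazonaws.<region>.ecr.dkr, and an S3 gateway endpoint (image layers are served from S3), and the interface endpoints need private DNS enabled so the standard ECR hostnames resolve to them. A quick check from a host in the subnet, as a sketch assuming the us-west-2 region:

# With private DNS enabled on the interface endpoints, the standard ECR
# hostnames should resolve to private addresses inside the VPC CIDR
# rather than to public IPs.
dig +short api.ecr.us-west-2.amazonaws.com
dig +short dkr.ecr.us-west-2.amazonaws.com
# Image layers are fetched from S3, so the subnet's route table also
# needs an S3 gateway endpoint (com.amazonaws.us-west-2.s3).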

robd003 commented 1 year ago

Thanks for the help guys. I'll try just sticking with a public only cluster for now. Bottlerocket has been great during my testing so far!

TiberiuGC commented 1 year ago

@robd003 your use case falls under scenario (2). I think the issue was highlighted in one of the above messages - a private subnet with no gateway will never be able to pull down images from ECR (or another image registry). If one of your worker nodes is deployed within the private subnet, you'll run into this problem.

Thanks @TiberiuGC so much for the insight! I thought the solution here should be to specify VPC endpoints, which keep us from having to give the subnet a NAT or internet gateway. In my test environment, I gave my private subnets VPC endpoints to ECR and SSM, made the security groups allow all traffic, but still was not able to get my nodes to hit ECR to pull down the images.

I thought it might be something weird with my subnet, but even after attaching an internet gateway to the subnet, a test Ubuntu image deploys and works fine while the Bottlerocket node still can't come up. Any thoughts?

Unfortunately, nothing comes to mind immediately ... this may require further investigation

awoimbee commented 1 year ago

1.25 with this configuration fails to bring up the kubelet:

Starting Kubelet...
[  OK  ] Finished Isolates multi-user.target.
[  OK  ] Finished Send boot success.
[FAILED] Failed to start Kubelet.
See 'systemctl status kubelet.service' for details.
[  OK  ] Reached target Multi-User System.

I have the exact same problem (EKS 1.25, amazon/bottlerocket-aws-k8s-1.25-x86_64-v1.13.5-33225cc9), except that I create everything using Terraform. I would like to remove the NAT gateway (and egress-only internet gateway) from my private subnets, making them fully offline, for security reasons, but I get the same error as above. SSH connection is impossible (port 22: Connection refused). My VPC endpoints are defined as follows:

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = aws_vpc.default.id
  service_name = "com.amazonaws.${data.aws_region.current.name}.s3"
  route_table_ids = [
    aws_route_table.internet_private.id,
    aws_route_table.internet_public.id
  ]
}
resource "aws_vpc_endpoint" "offline" {
  for_each = toset(["sts", "ecr.dkr", "ec2", "autoscaling", "eks", "ssm"])

  vpc_id            = aws_vpc.default.id
  service_name      = "com.amazonaws.${data.aws_region.current.name}.${each.key}"
  subnet_ids        = local.subnets_private_ids
  vpc_endpoint_type = "Interface"
}

Once the node has joined the cluster, I can remove the NAT (& EIGW) and everything seems to work.

jpmcb commented 1 year ago

@awoimbee - thanks for surfacing this.

That does look similar to the above. Do those subnets have VPC endpoints to ECR to pull down the container images and start the admin/control containers? You won't be able to get SSH access unless the admin container can be pulled down and started so it can serve SSH clients.
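
One way to answer that from the AWS side is to list the endpoints attached to the VPC and whether private DNS is enabled on them. A sketch with a placeholder VPC ID:

# List every VPC endpoint in the VPC along with its type, private DNS
# setting, and state. vpc-xxx is a placeholder.
aws ec2 describe-vpc-endpoints \
  --filters "Name=vpc-id,Values=vpc-xxx" \
  --query 'VpcEndpoints[].[ServiceName,VpcEndpointType,PrivateDnsEnabled,State]' \
  --output table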

awoimbee commented 1 year ago

Hi, I had multiple mistakes in the above snippet. Here is my final definition to get truly offline EKS nodes (it has been working for me for a week):

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = aws_vpc.default.id
  service_name = "com.amazonaws.${data.aws_region.current.name}.s3"
  route_table_ids = [
    aws_route_table.internet_private.id,
    aws_route_table.internet_public.id
  ]
  tags = {
    Name   = "gw-${var.name}-s3-vpc-endpoint"
    module = local.module
  }
}
resource "aws_security_group" "vpc_endpoint" {
  description = "Security group for VPC endpoints"
  name        = "vpc-endpoint-${var.name}"
  vpc_id      = aws_vpc.default.id
  tags = {
    Name   = "VPCEndpoint"
    module = local.module
  }
  ingress {
    cidr_blocks = ["0.0.0.0/0"]
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
  }
}
resource "aws_vpc_endpoint" "offline" {
  for_each = toset(["sts", "ecr.dkr", "ecr.api", "ec2", "autoscaling", "eks", "ssm"])

  vpc_id              = aws_vpc.default.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.${each.key}"
  subnet_ids          = [local.subnets_private_ids[0]]
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  security_group_ids  = [aws_security_group.vpc_endpoint.id]
  tags = {
    Name   = "iedp-${var.name}-${each.key}-vpc-endpoint"
    module = local.module
  }
}
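
Comparing this with the earlier snippet, the visible changes are the added ecr.api endpoint, private_dns_enabled = true, a dedicated security group on the interface endpoints, and the narrowed subnet_ids. The ec2 endpoint is presumably also what lets pluto's instance lookup (the original sundog failure) succeed without a NAT gateway.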
patkinson01 commented 1 year ago

Hi All,

We've run into an issue which isn't the exact scenario but has similarities. We have all private subnets, but use a secondary CIDR block for pod IPs.

A cluster with a managed node group was running happily on 1.25. The upgrade to 1.26 ran successfully, but when upgrading the nodegroup to the new AMI, the nodes won't join the cluster and the instance log shows these errors:

[ 304.743884] sundog[1345]: Setting generator 'pluto private-dns-name' failed with exit code 1 - stderr: Timed out retrieving private DNS name from EC2: deadline has elapsed
[FAILED] Failed to start User-specified setting generators.
See 'systemctl status sundog.service' for details.
[DEPEND] Dependency failed for Applies settings to create config files.
[DEPEND] Dependency failed for Send signal to CloudFormation Stack.
[DEPEND] Dependency failed for Bottlerocket initial configuration complete.
[DEPEND] Dependency failed for Isolates configured.target.
[DEPEND] Dependency failed for Sets the hostname.

We do some CIS hardening using a bootstrap container. Could that potentially be causing an issue or is this error happening before it runs?

dekelummanu commented 1 year ago

+1

hajdukda commented 4 months ago

I've experienced the same issue when my security group was missing the 0.0.0.0/0 egress rule (a misconfiguration) needed to reach the AWS APIs. For purely private subnets, @awoimbee's solution seems to be the best.
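
A quick way to spot that kind of security-group misconfiguration, sketched with a placeholder group ID:

# Show the egress rules of the node (or endpoint) security group.
# sg-xxx is a placeholder; an empty IpPermissionsEgress list means the
# instances cannot reach the AWS APIs at all.
aws ec2 describe-security-groups \
  --group-ids sg-xxx \
  --query 'SecurityGroups[].IpPermissionsEgress' \
  --output json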