coreos / coreos-kubernetes

CoreOS Container Linux+Kubernetes documentation & Vagrant installers
https://coreos.com/kubernetes/docs/latest/
Apache License 2.0

Better EtcdSecurityGroup configuration #805

Open Overbryd opened 7 years ago

Overbryd commented 7 years ago

Hi everyone, I really enjoy this project. It is great to work on and so far reliable. Today I found an issue that could be fixed quickly by making the SecurityGroup configuration for the etcd instances a bit more intuitive.

By default they look like this (taken from stack-template.json generated on Dec 28 by v0.9.3-rc.2):

    ...
    "SecurityGroupEtcdPeerIngress": {
      "Properties": {
        "FromPort": 2380,
        "GroupId": {
          "Ref": "SecurityGroupEtcd"
        },
        "IpProtocol": "tcp",
        "SourceSecurityGroupId": {
          "Ref": "SecurityGroupEtcd"
        },
        "ToPort": 2380
      },
      "Type": "AWS::EC2::SecurityGroupIngress"
    },
    ...
    "SecurityGroupEtcd": {
      "Properties": {
        "GroupDescription": {
          "Ref": "AWS::StackName"
        },
        "SecurityGroupEgress": [
          {
            "CidrIp": "0.0.0.0/0",
            "FromPort": 0,
            "IpProtocol": "tcp",
            "ToPort": 65535
          },
          {
            "CidrIp": "0.0.0.0/0",
            "FromPort": 0,
            "IpProtocol": "udp",
            "ToPort": 65535
          }
        ],
        "SecurityGroupIngress": [
          {
            "CidrIp": "0.0.0.0/0",
            "FromPort": 3,
            "IpProtocol": "icmp",
            "ToPort": -1
          },
          {
            "CidrIp": "0.0.0.0/0",
            "FromPort": 22,
            "IpProtocol": "tcp",
            "ToPort": 22
          }
        ],
        ...

Because the egress rules allow only TCP and UDP, this completely blocks all outgoing ICMP packets from the machine.
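The egress observation can be checked mechanically against the template. The following is a minimal sketch in Python, assuming the template has been reduced to the egress rules of SecurityGroupEtcd quoted above; the helper name egress_allows is illustrative and not part of kube-aws.

```python
import json

# Egress rules of SecurityGroupEtcd, copied from the template excerpt above.
TEMPLATE_EXCERPT = """
{
  "SecurityGroupEtcd": {
    "Properties": {
      "SecurityGroupEgress": [
        {"CidrIp": "0.0.0.0/0", "FromPort": 0, "IpProtocol": "tcp", "ToPort": 65535},
        {"CidrIp": "0.0.0.0/0", "FromPort": 0, "IpProtocol": "udp", "ToPort": 65535}
      ]
    }
  }
}
"""


def egress_allows(resource, protocol):
    """Return True if any inline egress rule permits the given protocol.

    Hypothetical helper for illustration. "-1" is the CloudFormation
    value meaning "all protocols".
    """
    rules = resource.get("Properties", {}).get("SecurityGroupEgress", [])
    return any(str(rule.get("IpProtocol")) in (protocol, "-1") for rule in rules)


etcd_group = json.loads(TEMPLATE_EXCERPT)["SecurityGroupEtcd"]
for proto in ("tcp", "udp", "icmp"):
    print(proto, egress_allows(etcd_group, proto))
```

The sketch only inspects rules declared inline on the group. Standalone AWS::EC2::SecurityGroupEgress resources elsewhere in a full template would need to be checked as well.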

Also, a simple $ etcdctl cluster-health reports an unhealthy cluster even though the cluster is in fact healthy.
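One plausible explanation, which the post does not state explicitly, is that etcdctl cluster-health contacts each member on its client URL (port 2379 by default), while the excerpt only opens the peer port 2380 between etcd instances. The sketch below checks which ports the quoted rule opens from the group to itself. The function name member_to_member_open is hypothetical, and since the template excerpt is elided, other rules may exist in the full file.

```python
# The self-referencing ingress rule, copied from the template excerpt above.
RESOURCES = {
    "SecurityGroupEtcdPeerIngress": {
        "Type": "AWS::EC2::SecurityGroupIngress",
        "Properties": {
            "FromPort": 2380,
            "GroupId": {"Ref": "SecurityGroupEtcd"},
            "IpProtocol": "tcp",
            "SourceSecurityGroupId": {"Ref": "SecurityGroupEtcd"},
            "ToPort": 2380,
        },
    },
}


def member_to_member_open(resources, group, port):
    """Return True if a standalone ingress rule lets members of `group`
    reach each other on TCP `port`. Hypothetical helper for illustration."""
    for res in resources.values():
        if res.get("Type") != "AWS::EC2::SecurityGroupIngress":
            continue
        props = res.get("Properties", {})
        # Only consider rules where the group is both target and source.
        if props.get("GroupId") != {"Ref": group}:
            continue
        if props.get("SourceSecurityGroupId") != {"Ref": group}:
            continue
        proto = str(props.get("IpProtocol"))
        if proto == "-1":  # all protocols, port range is ignored
            return True
        if proto == "tcp" and props["FromPort"] <= port <= props["ToPort"]:
            return True
    return False


# 2380 is the peer port from the excerpt; 2379 is etcd's default client
# port (an assumption about this cluster, not stated in the post).
for port in (2379, 2380):
    print(port, member_to_member_open(RESOURCES, "SecurityGroupEtcd", port))
```

If the full template shows the same result, widening the self-referencing rule to cover the client port would be one candidate fix to evaluate.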

I would propose adding the following rules for better handling:

From an end user's perspective, I just spent a good hour debugging something that was configured correctly from the start. So adding some safe but expected rules to the EtcdSecurityGroup might help. I have yet to get used to the kube-aws source, but I might submit a Pull Request if I am able to change it. If someone has an opinion on this or is quicker than me, I am happy to hear from you.

Kind regards,

Lukas Rieder