nwcdheap / kops-cn

Quickly deploy K8s clusters with Kops in the AWS China Ningxia and Beijing regions
Apache License 2.0

Put the master and nodes in private subnets with a NAT gateway: validate-cluster failed #94

Closed liangruibupt closed 5 years ago

liangruibupt commented 5 years ago

I put the master and nodes in private subnets behind a NAT gateway.

Makefile snippet: it uses an existing VPC and existing subnets, and the AMI is Amazon Linux 2. Following https://github.com/kubernetes/kops/issues/4548, I added --utility-subnets=$(SUBNET_IDS) --api-loadbalancer-type=internal --topology=private to create-cluster.

# customize the values below
TARGET_REGION ?= cn-north-1
AWS_PROFILE ?= default
KOPS_STATE_STORE ?= s3://ray-kops-state-store-bjs
VPCID ?= vpc-0259f0601bb042d53
MASTER_COUNT ?= 3
MASTER_SIZE ?= m5.large
NODE_SIZE ?= c5.large
NODE_COUNT ?= 2
SSH_PUBLIC_KEY ?= ~/.ssh/kops-cn-bjs.pub
KUBERNETES_VERSION ?= v1.12.8
KOPS_VERSION ?= 1.12.1
SUBNET_IDS ?=subnet-01f7afb87dd26bc9c,subnet-0be82b17cbe107b98
# do not modify following values
AWS_DEFAULT_REGION ?= $(TARGET_REGION)
AWS_REGION ?= $(AWS_DEFAULT_REGION)
ifeq ($(TARGET_REGION) ,cn-north-1)
    CLUSTER_NAME ?= cluster.ray.bjs.k8s.local
    AMI ?= ami-08b835182371dee58
    ZONES ?= cn-north-1a,cn-north-1b
endif

ifeq ($(TARGET_REGION) ,cn-northwest-1)
    CLUSTER_NAME ?= cluster.zhy.k8s.local
    AMI ?= ami-006bc343e8c9c9b22
    ZONES ?= cn-northwest-1a,cn-northwest-1b,cn-northwest-1c
endif

ifdef CUSTOM_CLUSTER_NAME
    CLUSTER_NAME = $(CUSTOM_CLUSTER_NAME)
endif

KUBERNETES_VERSION_URI ?= "https://s3.cn-north-1.amazonaws.com.cn/kubernetes-release/release/$(KUBERNETES_VERSION)"

.PHONY: create-cluster
create-cluster:
    @KOPS_STATE_STORE=$(KOPS_STATE_STORE) \
    AWS_PROFILE=$(AWS_PROFILE) \
    AWS_REGION=$(AWS_REGION) \
    AWS_DEFAULT_REGION=$(AWS_DEFAULT_REGION) \
    kops create cluster \
     --cloud=aws \
     --name=$(CLUSTER_NAME) \
     --image=$(AMI) \
     --zones=$(ZONES) \
     --subnets=$(SUBNET_IDS) \
     --master-count=$(MASTER_COUNT) \
     --master-size=$(MASTER_SIZE) \
     --node-count=$(NODE_COUNT) \
     --node-size=$(NODE_SIZE)  \
     --vpc=$(VPCID) \
     --kubernetes-version=$(KUBERNETES_VERSION_URI) \
     --networking=amazon-vpc-routed-eni \
     --ssh-public-key=$(SSH_PUBLIC_KEY) \
     --utility-subnets=$(SUBNET_IDS) \
     --api-loadbalancer-type=internal \
     --topology=private

The rest of the Makefile is unchanged.

validate-cluster failed with the error below:

make validate-cluster
Using cluster from kubectl context: cluster.ray.bjs.k8s.local

Validating cluster cluster.ray.bjs.k8s.local

unexpected error during validation: error listing nodes: Get https://internal-api-cluster-ray-bjs-k8s-l-hihnvi-1334758224.cn-north-1.elb.amazonaws.com.cn/api/v1/nodes: dial tcp 172.16.130.208:443: i/o timeout
make: *** [validate-cluster] Error 1

I have already checked the related issue, https://github.com/nwcdlabs/kops-cn/issues/5

liangruibupt commented 5 years ago

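Update: the root cause of 'dial tcp 172.16.101.115:443: i/o timeout' is that my bastion server (where I run make validate-cluster) is in a different VPC from the k8s VPC. Because the API load balancer is internal, it is only reachable from inside the k8s VPC or from a network connected to it, so I need to set up VPC peering or create a new bastion server in the k8s VPC.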
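
To confirm this kind of problem before running make validate-cluster, a quick reachability check from the bastion can help. The following is only a minimal sketch: the function names can_connect and is_private_address are hypothetical helpers, not part of kops or kops-cn. It tries a plain TCP connection with a timeout and reports whether an address is in a private range.

```python
import ipaddress
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False


def is_private_address(ip: str) -> bool:
    """Return True if ip is in a private (non-public) address range."""
    return ipaddress.ip_address(ip).is_private
```

For example, calling can_connect with the internal API load balancer hostname from the error message and port 443 should return False from a bastion in an unpeered VPC, and is_private_address("172.16.101.115") returns True, which is consistent with the endpoint only being reachable over private networking.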

BTW, sometimes some of the instances behind the ELB are still not InService after more than 15 minutes. In that case you can run make delete-cluster and create the cluster again.
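
Before deleting the cluster, it may be worth seeing which instances the ELB considers unhealthy. The sketch below assumes the API load balancer is a Classic ELB and that you saved the JSON output of aws elb describe-instance-health to a string; the function name out_of_service and the instance IDs in the sample are hypothetical.

```python
import json
from typing import List


def out_of_service(health_json: str) -> List[str]:
    """Return the IDs of instances whose State is not InService."""
    states = json.loads(health_json).get("InstanceStates", [])
    return [s["InstanceId"] for s in states if s.get("State") != "InService"]


# Hypothetical sample shaped like `aws elb describe-instance-health` output.
SAMPLE = """
{
  "InstanceStates": [
    {"InstanceId": "i-0aaaaaaaaaaaaaaaa", "State": "InService"},
    {"InstanceId": "i-0bbbbbbbbbbbbbbbb", "State": "OutOfService"}
  ]
}
"""
```

With the sample above, out_of_service(SAMPLE) lists only the second instance. An empty list means every registered instance has passed its health check.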

liangruibupt commented 5 years ago

For the Makefile, you can refer to the sample in this issue. It really just adds 3 lines: --utility-subnets=$(SUBNET_IDS) \ --api-loadbalancer-type=internal \ --topology=private

ghost commented 5 years ago

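@liangruibupt I see SUBNET_IDS ?=subnet-01f7afb87dd26bc9c,subnet-0be82b17cbe107b98 in the code. Does that mean we need to create the subnets ourselves first and then pass them to kops, rather than letting kops create the subnets for us as before?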

pahud commented 5 years ago

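Kops probably does not create the private subnets and NAT on its own, so the example @liangruibupt provided most likely uses private subnets that he created himself and then passed to Kops as arguments.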