oracle / terraform-kubernetes-installer

Terraform Installer for Kubernetes on Oracle Cloud Infrastructure

pod cluster network does not work when number of worker nodes > 1 #194

Open doschkinow opened 6 years ago

doschkinow commented 6 years ago

Terraform Version

[pdos@ol7 terraform-kubernetes-installer]$ terraform -v
Terraform v0.11.3

OCI Provider Version

[pdos@ol7 terraform-kubernetes-installer]$ ls -l terraform.d/plugins/linux_amd64/terraform-provider-oci_v2.1.4
-rwxr-xr-x. 1 pdos pdos 28835846 Apr 10 09:33 terraform.d/plugins/linux_amd64/terraform-provider-oci_v2.1.4

Terraform Installer for Kubernetes Version

v.1.3.0

Input Variables

[pdos@ol7 terraform-kubernetes-installer]$ cat terraform.tfvars

# OCI authentication
region = "us-ashburn-1"
tenancy_ocid = "ocid1.tenancy.oc1..aaaaaaaa4jaw55rds22u6yaiy5fxt5qxjr2ja4l5fzkv4hci7kwmexv3hpqq"
compartment_ocid = "ocid1.compartment.oc1..aaaaaaaakvhehb5u7nrupwuunhefoedpbegvbnysvz5pdfluxt5wxl5aquwa"
fingerprint = "15:fd:5a:0f:7b:f7:c8:d0:82:f5:20:f8:97:07:42:02"
private_key_path = "/home/pdos/.oci/oci_api_key.pem"
user_ocid = "ocid1.user.oc1..aaaaaaaai3a6zzhjw23wncjhk5ogvjmk4x22zsws6xn4ydmzzlxoo6rthxya"

tenancy_ocid = "ocid1.tenancy.oc1..aaaaaaaa763cu5f3m7qpzwnvr2shs3o26ftrn7fkgz55cpzgxmglgtui3v7q"
compartment_ocid = "ocid1.compartment.oc1..aaaaaaaaidy3jl7bdmiwfryo6myhdnujcuug5zxzoclsz7vpfzw4bggng7iq"
fingerprint = "ed:51:83:3b:d2:04:f4:af:9d:7b:17:96:dd:8a:99:bc"
private_key_path = "/tmp/oci_api_key.pem"
user_ocid = "ocid1.user.oc1..aaaaaaaa5fy2l5aki6z2bzff5yrrmlahiif44vzodeetygxmpulq3mbnckya"

# CCM user
cloud_controller_user_ocid = "ocid1.tenancy.oc1..aaaaaaaa763cu5f3m7qpzwnvr2shs3o26ftrn7fkgz55cpzgxmglgtui3v7q"
cloud_controller_user_fingerprint = "ed:51:83:3b:d2:04:f4:af:9d:7b:17:96:dd:8a:99:bc"
cloud_controller_user_private_key_path = "/tmp/oci_api_key.pem"

etcdShape = "VM.Standard1.1"
k8sMasterShape = "VM.Standard1.1"
k8sWorkerShape = "VM.Standard2.1"

etcdAd1Count = "0"
etcdAd2Count = "0"
etcdAd3Count = "1"

k8sMasterAd1Count = "0"
k8sMasterAd2Count = "0"
k8sMasterAd3Count = "1"

k8sWorkerAd1Count = "0"
k8sWorkerAd2Count = "1"
k8sWorkerAd3Count = "1"

etcdLBShape = "400Mbps"
k8sMasterLBShape = "400Mbps"

etcd_ssh_ingress = "10.0.0.0/16"
etcd_ssh_ingress = "0.0.0.0/0"
etcd_cluster_ingress = "10.0.0.0/16"
master_ssh_ingress = "0.0.0.0/0"
worker_ssh_ingress = "0.0.0.0/0"
master_https_ingress = "0.0.0.0/0"
worker_nodeport_ingress = "0.0.0.0/0"
worker_nodeport_ingress = "10.0.0.0/16"

control_plane_subnet_access = "public"
k8s_master_lb_access = "public"

natInstanceShape = "VM.Standard1.2"
nat_instance_ad1_enabled = "true"
nat_instance_ad2_enabled = "false"
nat_instance_ad3_enabled = "true"
nat_ssh_ingress = "0.0.0.0/0"

public_subnet_http_ingress = "0.0.0.0/0"
public_subnet_https_ingress = "0.0.0.0/0"

# worker_iscsi_volume_create is a bool not a string
worker_iscsi_volume_create = true
worker_iscsi_volume_size = 100

etcd_iscsi_volume_create = true
etcd_iscsi_volume_size = 50
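Note that several keys above are assigned twice (the OCI credentials, etcd_ssh_ingress, and worker_nodeport_ingress). In a tfvars file the later assignment should take precedence, so the effective values can be checked with terraform console before applying (a hedged sketch; the output shown assumes last-assignment-wins semantics):

[pdos@ol7 terraform-kubernetes-installer]$ terraform console
> var.etcd_ssh_ingress
0.0.0.0/0
> var.worker_nodeport_ingress
10.0.0.0/16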

Description of issue:

Pods deployed to a worker node different from the one where the kube-dns pod runs are unable to resolve kubernetes.default. This is true even when the worker nodes are in the same availability domain.
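For context on why this implicates the pod overlay network rather than kube-dns itself: pods send lookups to the kube-dns Service ClusterIP (10.21.21.21 in the log below), and kube-proxy forwards that VIP to the kube-dns pod IP, so a query from a pod on another worker succeeds only if pod-to-pod traffic can cross nodes. Two quick checks, a sketch using the pod names from the reproduction log that follows:

# The failing pod's resolver should still point at the kube-dns ClusterIP;
# if it does, the problem is reaching the backend pod on the other node.
kubectl exec -it busybox-56b5f5cd9d-lvz4z -- cat /etc/resolv.conf

# Shows the kube-dns Service ClusterIP that the pods resolve through.
kubectl -n kube-system get svc kube-dns -o wide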

Steps to reproduce:

Below is a log of kubectl commands that illustrates this:

[opc@k8s-master-ad3-0 ~]$ kubectl -n kube-system get pod -o wide
NAME                                                                          READY  STATUS   RESTARTS  AGE  IP          NODE
kube-apiserver-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com            1/1    Running  0         4m   10.0.32.2   k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kube-controller-manager-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com   1/1    Running  0         4m   10.0.32.2   k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kube-dns-596797cd48-lghdb                                                     3/3    Running  0         5m   10.99.78.2  k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
kube-proxy-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com                1/1    Running  0         4m   10.0.32.2   k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kube-proxy-k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com                1/1    Running  0         3m   10.0.42.3   k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
kube-proxy-k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com                1/1    Running  0         3m   10.0.42.2   k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
kube-scheduler-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com            1/1    Running  0         4m   10.0.32.2   k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kubernetes-dashboard-796487df76-d8q7f                                         1/1    Running  0         5m   10.99.69.2  k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
oci-cloud-controller-manager-sdqv5                                            1/1    Running  0         5m   10.0.32.2   k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
oci-volume-provisioner-66f47d7fcf-ks6pk                                       1/1    Running  0         5m   10.99.17.2  k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com

[opc@k8s-master-ad3-0 ~]$ kubectl scale deployment busybox --replicas=2
deployment "busybox" scaled

[opc@k8s-master-ad3-0 ~]$ kubectl get pod -o wide
NAME                       READY  STATUS   RESTARTS  AGE  IP          NODE
busybox-56b5f5cd9d-6brvj   1/1    Running  0         6s   10.99.78.4  k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
busybox-56b5f5cd9d-lvz4z   1/1    Running  1         13m  10.99.17.5  k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
nginx-7cbc4b4d9c-7z772     1/1    Running  0         15m  10.99.17.3  k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
nginx-7cbc4b4d9c-c6lrj     1/1    Running  0         15m  10.99.78.3  k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
nginx-7cbc4b4d9c-k2kjr     1/1    Running  0         15m  10.99.17.4  k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com

[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-6brvj nslookup kubernetes.default
Server:    10.21.21.21
Address 1: 10.21.21.21 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.21.0.1 kubernetes.default.svc.cluster.local

[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-lvz4z nslookup kubernetes.default
Server:    10.21.21.21
Address 1: 10.21.21.21

nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1
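To confirm that the overlay itself is at fault rather than DNS, a hedged diagnostic assuming the flannel network this installer sets up: ping the kube-dns pod IP (10.99.78.2) directly from each busybox replica, and compare the pod-subnet routes on the two workers.

# Same node as kube-dns, expected to succeed:
[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-6brvj -- ping -c 3 10.99.78.2

# Other worker; a failure here means cross-node pod traffic is broken, not DNS:
[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-lvz4z -- ping -c 3 10.99.78.2

# Each worker should hold a flannel route to every other node's pod subnet,
# e.g. on k8s-worker-ad3-1 a route for 10.99.78.0/24 via the flannel.1 interface.
# A missing route, or a VCN security list blocking traffic between the worker
# subnets, would produce exactly this one-node-works, other-node-fails pattern.
[opc@k8s-worker-ad3-1 ~]$ ip route | grep 10.99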