brant4test opened this issue 7 years ago
Then I tried to install tectonic-1.7.1-tectonic.2 on AWS with the Tectonic Installer, and it got stuck at the last step, "Starting Tectonic console", for more than 4 hours, so I decided to destroy the cluster.
Before destroying the stack, I also logged in and took a quick look at a master node. Port 32002 is still not listening. Same ELB issue.
So my question is: is tectonic-1.7.1-tectonic.2 deployable at all?
I'll try evaluating the earlier version, 1.6.8, one last time.
ip-10-0-22-13 ~ # ps -ef|grep api
root 1742 1726 0 05:47 ? 00:00:00 /usr/bin/flock /var/lock/api-server.lock /hyperkube apiserver --bind-address=0.0.0.0 --secure-port=443 --insecure-port=0 --advertise-address=0.0.0.0 --etcd-servers=https://stack-etcd-0.company.com:2379,https://stack-etcd-1.company.com:2379,https://stack-etcd-2.company.com:2379 --etcd-cafile=/etc/kubernetes/secrets/etcd-client-ca.crt --etcd-certfile=/etc/kubernetes/secrets/etcd-client.crt --etcd-keyfile=/etc/kubernetes/secrets/etcd-client.key --etcd-quorum-read=true --storage-backend=etcd3 --allow-privileged=true --service-cluster-ip-range=10.3.0.0/16 --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota --tls-ca-file=/etc/kubernetes/secrets/ca.crt --tls-cert-file=/etc/kubernetes/secrets/apiserver.crt --tls-private-key-file=/etc/kubernetes/secrets/apiserver.key --kubelet-client-certificate=/etc/kubernetes/secrets/apiserver.crt --kubelet-client-key=/etc/kubernetes/secrets/apiserver.key --service-account-key-file=/etc/kubernetes/secrets/service-account.pub --client-ca-file=/etc/kubernetes/secrets/ca.crt --authorization-mode=RBAC --anonymous-auth=false --oidc-issuer-url=https://stack.company.com/identity --oidc-client-id=tectonic-kubectl --oidc-username-claim=email --oidc-groups-claim=groups --oidc-ca-file=/etc/kubernetes/secrets/ca.crt --cloud-provider=aws --audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100
root 3419 1742 2 05:50 ? 00:01:41 /hyperkube apiserver --bind-address=0.0.0.0 --secure-port=443 --insecure-port=0 --advertise-address=0.0.0.0 --etcd-servers=https://stack-etcd-0.company.com:2379,https://stack-etcd-1.company.com:2379,https://stack-etcd-2.company.com:2379 --etcd-cafile=/etc/kubernetes/secrets/etcd-client-ca.crt --etcd-certfile=/etc/kubernetes/secrets/etcd-client.crt --etcd-keyfile=/etc/kubernetes/secrets/etcd-client.key --etcd-quorum-read=true --storage-backend=etcd3 --allow-privileged=true --service-cluster-ip-range=10.3.0.0/16 --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota --tls-ca-file=/etc/kubernetes/secrets/ca.crt --tls-cert-file=/etc/kubernetes/secrets/apiserver.crt --tls-private-key-file=/etc/kubernetes/secrets/apiserver.key --kubelet-client-certificate=/etc/kubernetes/secrets/apiserver.crt --kubelet-client-key=/etc/kubernetes/secrets/apiserver.key --service-account-key-file=/etc/kubernetes/secrets/service-account.pub --client-ca-file=/etc/kubernetes/secrets/ca.crt --authorization-mode=RBAC --anonymous-auth=false --oidc-issuer-url=https://stack.company.com/identity --oidc-client-id=tectonic-kubectl --oidc-username-claim=email --oidc-groups-claim=groups --oidc-ca-file=/etc/kubernetes/secrets/ca.crt --cloud-provider=aws --audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100
root 5203 5088 0 07:14 pts/1 00:00:00 grep --colour=auto api
ip-10-0-22-13 ~ # ps -ef|grep kube
root 1273 1 3 05:45 ? 00:02:56 /kubelet --kubeconfig=/etc/kubernetes/kubeconfig --require-kubeconfig --cni-conf-dir=/etc/kubernetes/cni/net.d --network-plugin=cni --lock-file=/var/run/lock/kubelet.lock --exit-on-lock-contention --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged --node-labels=node-role.kubernetes.io/master --register-with-taints=node-role.kubernetes.io/master=:NoSchedule --minimum-container-ttl-duration=6m0s --cluster-dns=10.3.0.10 --cluster-domain=cluster.local --client-ca-file=/etc/kubernetes/ca.crt --anonymous-auth=false --cloud-provider=aws
root 1742 1726 0 05:47 ? 00:00:00 /usr/bin/flock /var/lock/api-server.lock /hyperkube apiserver --bind-address=0.0.0.0 --secure-port=443 --insecure-port=0 --advertise-address=0.0.0.0 --etcd-servers=https://stack-etcd-0.company.com:2379,https://stack-etcd-1.company.com:2379,https://stack-etcd-2.company.com:2379 --etcd-cafile=/etc/kubernetes/secrets/etcd-client-ca.crt --etcd-certfile=/etc/kubernetes/secrets/etcd-client.crt --etcd-keyfile=/etc/kubernetes/secrets/etcd-client.key --etcd-quorum-read=true --storage-backend=etcd3 --allow-privileged=true --service-cluster-ip-range=10.3.0.0/16 --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota --tls-ca-file=/etc/kubernetes/secrets/ca.crt --tls-cert-file=/etc/kubernetes/secrets/apiserver.crt --tls-private-key-file=/etc/kubernetes/secrets/apiserver.key --kubelet-client-certificate=/etc/kubernetes/secrets/apiserver.crt --kubelet-client-key=/etc/kubernetes/secrets/apiserver.key --service-account-key-file=/etc/kubernetes/secrets/service-account.pub --client-ca-file=/etc/kubernetes/secrets/ca.crt --authorization-mode=RBAC --anonymous-auth=false --oidc-issuer-url=https://stack.company.com/identity --oidc-client-id=tectonic-kubectl --oidc-username-claim=email --oidc-groups-claim=groups --oidc-ca-file=/etc/kubernetes/secrets/ca.crt --cloud-provider=aws --audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100
root 1836 1817 0 05:47 ? 00:00:16 ./hyperkube proxy --kubeconfig=/etc/kubernetes/kubeconfig --proxy-mode=iptables --hostname-override=ip-10-0-22-13.ec2.internal --cluster-cidr=10.2.0.0/16
root 2053 2037 0 05:47 ? 00:00:02 /opt/bin/flanneld --ip-masq --kube-subnet-mgr --iface=10.0.22.13
nobody 2403 2386 0 05:48 ? 00:00:26 ./hyperkube scheduler --leader-elect=true
nobody 2573 2549 0 05:48 ? 00:00:01 ./hyperkube controller-manager --allocate-node-cidrs=true --configure-cloud-routes=false --cluster-cidr=10.2.0.0/16 --root-ca-file=/etc/kubernetes/secrets/ca.crt --service-account-private-key-file=/etc/kubernetes/secrets/service-account.key --leader-elect=true --node-monitor-grace-period=2m --pod-eviction-timeout=220s --cloud-provider=aws
root 2709 2693 0 05:48 ? 00:00:02 /kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
nobody 2890 2874 0 05:48 ? 00:00:04 /sidecar --v=2 --logtostderr --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
root 3201 3199 0 05:49 ? 00:00:01 stage1/rootfs/usr/lib/ld-linux-x86-64.so.2 stage1/rootfs/usr/bin/systemd-nspawn --boot --notify-ready=yes -Zsystem_u:system_r:svirt_lxc_net_t:s0:c720,c764 -Lsystem_u:object_r:svirt_lxc_file_t:s0:c720,c764 --register=true --link-journal=try-guest --quiet --uuid=7bd366cc-507c-4a1c-961a-dc2151770cc2 --machine=rkt-7bd366cc-507c-4a1c-961a-dc2151770cc2 --directory=stage1/rootfs --bind=/opt/tectonic:/opt/stage2/hyperkube/rootfs/assets:rbind --capability=CAP_AUDIT_WRITE,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FSETID,CAP_FOWNER,CAP_KILL,CAP_MKNOD,CAP_NET_RAW,CAP_NET_BIND_SERVICE,CAP_SETUID,CAP_SETGID,CAP_SETPCAP,CAP_SETFCAP,CAP_SYS_CHROOT -- --default-standard-output=tty --log-target=null --show-status=0
root 3220 3213 0 05:49 ? 00:00:01 /bin/bash /assets/tectonic.sh /assets/auth/kubeconfig /assets false
root 3419 1742 2 05:50 ? 00:01:42 /hyperkube apiserver --bind-address=0.0.0.0 --secure-port=443 --insecure-port=0 --advertise-address=0.0.0.0 --etcd-servers=https://stack-etcd-0.company.com:2379,https://stack-etcd-1.company.com:2379,https://stack-etcd-2.company.com:2379 --etcd-cafile=/etc/kubernetes/secrets/etcd-client-ca.crt --etcd-certfile=/etc/kubernetes/secrets/etcd-client.crt --etcd-keyfile=/etc/kubernetes/secrets/etcd-client.key --etcd-quorum-read=true --storage-backend=etcd3 --allow-privileged=true --service-cluster-ip-range=10.3.0.0/16 --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota --tls-ca-file=/etc/kubernetes/secrets/ca.crt --tls-cert-file=/etc/kubernetes/secrets/apiserver.crt --tls-private-key-file=/etc/kubernetes/secrets/apiserver.key --kubelet-client-certificate=/etc/kubernetes/secrets/apiserver.crt --kubelet-client-key=/etc/kubernetes/secrets/apiserver.key --service-account-key-file=/etc/kubernetes/secrets/service-account.pub --client-ca-file=/etc/kubernetes/secrets/ca.crt --authorization-mode=RBAC --anonymous-auth=false --oidc-issuer-url=https://stack.company.com/identity --oidc-client-id=tectonic-kubectl --oidc-username-claim=email --oidc-groups-claim=groups --oidc-ca-file=/etc/kubernetes/secrets/ca.crt --cloud-provider=aws --audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100
root 5345 5088 0 07:14 pts/1 00:00:00 grep --colour=auto kube
ip-10-0-22-13 ~ #
ip-10-0-22-13 ~ # ps -ef|grep 32002
root 5730 5088 0 07:15 pts/1 00:00:00 grep --colour=auto 32002
ip-10-0-22-13 ~ # netstat -anp|grep 32002
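For anyone reproducing this, the manual netstat check above can be wrapped in a small helper. This is a minimal sketch assuming a Linux host with `ss` (iproute2) available; the port number is the Tectonic console NodePort from this report:

```shell
#!/bin/bash
# Report whether a given TCP port is in LISTEN state on this host.
# Prints "not listening" if `ss` is unavailable or shows no match.
check_port_listening() {
  local port="$1"
  if ss -tln 2>/dev/null | grep -Eq "[:.]${port}([^0-9]|$)"; then
    echo "port ${port} is listening"
  else
    echo "port ${port} is not listening"
  fi
}

check_port_listening 32002
```

On the master node above this reports "port 32002 is not listening", consistent with the empty netstat output.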
I believe this might be related to this issue: https://github.com/coreos/tectonic-installer/issues/1786
I have the same issue (ELB not passing health checks on 32002) with both the 1.6.10 and 1.7.3 installers. In fact, I SSHed into the master node and port 32002 is not listening for any requests. The Kubernetes master services (API server, controller manager, proxy, and scheduler) were running via a hyperkube container, but I was not able to make a connection to 32002. My installation also uses a new VPC.
I am experiencing the same with 1.7.5, also with a new VPC. Given this case and @erie149's, it might not be related to the tagging of existing subnets from #1786.
In my case the variable that I can control is the base domain name:
domain name | gui | tf |
---|---|---|
domain.com | works | works |
sub.domain.com | works | broken |
The R53 hosted zone domain.com has a record set 'sub' of NS records delegating to the sub.domain.com zone's name servers.
The public hosted zone, sub.domain.com, has four records: the SOA and NS, as well as:
name | type | value |
---|---|---|
dev-api | A | ELB |
dev | A | ELB |
The private hosted zone, sub.domain.com, has seven records: the SOA and NS (different from the above), plus:
name | type | value | |
---|---|---|---|
_etcd-client-ssl._tcp | SRV | 0 0 2379 | dev-etcd-0.sub.domain.com |
_etcd-server-ssl._tcp | SRV | 0 0 2380 | dev-etcd-0.sub.domain.com |
dev-api | A | ELB | |
dev | A | ELB | |
dev-etcd-0 | A | 10.0.x.x | |
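The SRV rows above follow the standard `priority weight port target` layout, which is also what `dig +short SRV` prints. A small sketch for pulling the port out of such an answer line (the sample lines are hardcoded from the table above, not live DNS output):

```shell
#!/bin/bash
# Extract the port field (3rd column) from an SRV answer of the form
# "<priority> <weight> <port> <target>".
srv_port() {
  echo "$1" | awk '{print $3}'
}

# Sample answers mirroring the private-zone records above.
srv_port "0 0 2379 dev-etcd-0.sub.domain.com."
srv_port "0 0 2380 dev-etcd-0.sub.domain.com."
```

Against the live zone, one would feed this from `dig +short SRV _etcd-client-ssl._tcp.sub.domain.com`, expecting 2379 for etcd clients and 2380 for peers.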
All three load balancers have 0 instances in service, and the instances are 'OutOfService', just as described by @brant4test:
```
$ kubectl cluster-info
Kubernetes master is running at https://dev-api.sub.domain.com:443

$ kubectl describe svc
Unable to connect to the server: EOF

$ kubectl cluster-info dump
Unable to connect to the server: EOF
```
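When kubectl only reports `EOF`, it helps to separate DNS/TCP reachability from TLS/API-server problems. A minimal raw-TCP probe (bash-only, using its `/dev/tcp` redirection; the hostname is the API endpoint from the output above):

```shell
#!/bin/bash
# Try a plain TCP connect, bounded by a 3-second timeout, and report the result.
probe_tcp() {
  local host="$1" port="$2"
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "tcp connect to ${host}:${port} ok"
  else
    echo "tcp connect to ${host}:${port} failed"
  fi
}

probe_tcp dev-api.sub.domain.com 443
```

If the connect itself fails, the problem sits in front of the API server (DNS, or an ELB with no healthy instances, as seen here); if it succeeds but kubectl still gets EOF, look at the apiserver/TLS side instead.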
Hi, team. I installed tectonic-1.7.1-tectonic.2.tar.gz on AWS with Terraform without any errors, but I cannot access the console:

```
$ kubectl cluster-info
Unable to connect to the server: EOF
```
After checking the AWS ELB, I found this: all 3 masters in the Tectonic console ELB are OutOfService.
After logging in to the master nodes, it seems no kube* processes are running, and there is one failed unit:

```
$ ssh core@master
Last login: Thu Aug 24 03:53:05 UTC 2017 from master on pts/0
Container Linux by CoreOS stable (1465.6.0)
Update Strategy: No Reboots
Failed Units: 1
  init-assets.service

core@ip-10-0-42-30 ~ $ journalctl -u init-assets.service
-- Logs begin at Thu 2017-08-24 02:46:58 UTC, end at Thu 2017-08-24 03:54:45 UTC. --
Aug 24 02:47:06 localhost systemd[1]: Starting Download Tectonic Assets...
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: pubkey: prefix: "quay.io/coreos/awscli"
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: key: "https://quay.io/aci-signing-key"
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: gpg key fingerprint is: BFF3 13CD AA56 0B16 A898 7B8F 72AB F5F6 799D 33BC
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: Quay.io ACI Converter (ACI conversion signing key) support@quay.io
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: Trusting "https://quay.io/aci-signing-key" for prefix "quay.io/coreos/awscli" without fingerprint review.
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: Added key for prefix "quay.io/coreos/awscli" at "/etc/rkt/trustedkeys/prefix.d/quay.io/coreos/awscli/bff313cdaa560b16a8987b8f72abf5f
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: Downloading signature: 0 B/473 B
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: Downloading signature: 473 B/473 B
Aug 24 02:47:12 ip-10-0-42-30 bash[743]: Downloading signature: 473 B/473 B
Aug 24 02:47:22 ip-10-0-42-30 bash[743]: run: Get https://quay-registry.s3.amazonaws.com/sharedimages/3d9c65f1-d97d-4a81-8318-226dd41b9a75/layer?Signature=IwzGK5LfPeMm8BzmnJ
Aug 24 02:47:22 ip-10-0-42-30 systemd[1]: init-assets.service: Main process exited, code=exited, status=254/n/a
Aug 24 02:47:22 ip-10-0-42-30 systemd[1]: Failed to start Download Tectonic Assets.
Aug 24 02:47:22 ip-10-0-42-30 systemd[1]: init-assets.service: Unit entered failed state.
Aug 24 02:47:22 ip-10-0-42-30 systemd[1]: init-assets.service: Failed with result 'exit-code'.
```
```
ip-10-0-42-30 ~ # netstat -anp|grep 32002
ip-10-0-42-30 ~ # ps -ef|grep api
root 1347 1321 0 03:38 pts/0 00:00:00 grep --colour=auto api
ip-10-0-42-30 ~ # ps -ef|grep kube
root 1484 1473 0 03:53 pts/0 00:00:00 grep --colour=auto kube
```
What did I miss? Any tips? Thanks! The cluster is 3 x etcd, 3 x master, 4 x workers.
FYI, here is what I've done:

```
$ terraform plan -var-file=build/${CLUSTER}/terraform.tfvars platforms/aws
```