Closed: dskatz closed this issue 3 years ago
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
/kind bug
1. What kops version are you running? The command kops version will display this information.
Version 1.20.0 (git-8ea83c6d233a15dacfcc769d4d82bea3f530cf72)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running, or provide the Kubernetes version specified as a kops flag.
v1.18.18
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
The commands were executed as described in the ACM-NLB documentation to prepare a kops cluster for the Kubernetes 1.19 removal of basic auth credentials (https://github.com/kubernetes/kops/blob/master/permalinks/acm_nlb.md).
For brevity, after the cluster spec was updated with spec.api.loadBalancer.class: Network (see the sketch after this list), the following were run:
kops update cluster --yes
kops delete secret master (this command does not exist; after failing to find documentation on kops secrets, I deleted the master keypair with kops delete keypair master instead)
kops rolling-update cluster --instance-group-roles Master --cloudonly
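For reference, a minimal sketch of how that spec change can be applied; the manifest nesting shown here is my reading of the spec.api.loadBalancer.class field path, and my.example.com is a placeholder cluster name:

kops edit cluster --name my.example.com
# then set, under spec:
#   api:
#     loadBalancer:
#       class: Network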
5. What happened after the commands executed?
The control plane came back up. However, since deleting the master keypair essentially destroys the certificate authority, all service-account tokens needed to be re-created. Deleting all of the tokens and rolling the pods recovered the cluster.
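A minimal sketch of that recovery, assuming the standard token-secret setup (the namespace loop and workload kinds are illustrative, not verbatim from what was run):

# Re-mint service-account tokens against the new CA, then restart workloads.
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n "$ns" delete secret --field-selector type=kubernetes.io/service-account-token
  kubectl -n "$ns" rollout restart deployment,daemonset,statefulset
done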
6. What did you expect to happen?
I did not expect that I would need to delete the CA in order to get kops to re-issue the Kubernetes API server certificates simply to update the SAN with the new NLB domain name.
I expected this to be mostly transparent to a kops administrator and to require only a normal rolling-update of the control plane.
I also inspected the Kubernetes API server certificate and reviewed the SANs it included.
As far as I can tell, nothing in our cluster spec ever references the API server by the ELB name; the kops kubecfg uses the DNS CNAME.
What is the purpose of including the ELB name as a SAN?
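For anyone reproducing this, one way to check the served certificate's SANs (the API endpoint below is a placeholder):

openssl s_client -connect api.my.example.com:443 </dev/null 2>/dev/null \
  | openssl x509 -noout -text \
  | grep -A1 'Subject Alternative Name'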
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
N/A - logs can be provided if the discussion warrants it. There were no command failures.
9. Anything else do we need to know?
The purpose of this issue is to seek clarification on how to migrate gossip clusters to NLB with the least impact on cluster operations.
Based on the above, I re-ran the migration steps using a modified approach that more closely aligned with what I expected to happen: given my assumption that kops never actually connects to the API server via the ELB name, rotating the Kubernetes API server certificate should not be necessary.
Updated the cluster spec with spec.api.loadBalancer.class: Network
kops update cluster --yes
kops export kubecfg --admin
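Both kops update cluster and kops rolling-update cluster are dry runs without --yes, so the pending changes and instance-group status can be previewed before rolling anything (cluster name is again a placeholder):

kops update cluster --name my.example.com
kops rolling-update cluster --name my.example.com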
On a hunch, I ran kops rolling-update cluster and discovered that the control-plane instance groups were all tagged as NeedsUpdate. I decided to roll them, and after inspecting the Kubernetes API certificate, the SAN had been replaced with the NLB's Amazon DNS name.
Questions
Based on the feedback this issue gets, I am willing to contribute updated documentation.