kubernetes-retired / cluster-api-bootstrap-provider-kubeadm

LEGACY REPO. NEW CODE IS https://github.com/kubernetes-sigs/cluster-api/tree/master/bootstrap/kubeadm
Apache License 2.0
62 stars 67 forks source link

Deleting first controlplane Machine deletes cert Secrets #274

Closed rudoi closed 4 years ago

rudoi commented 4 years ago

/kind bug

What steps did you take and what happened: [A clear and concise description of what the bug is.]

What did you expect to happen: A flawless upgrade, obviously!

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.] I'm fairly certain this is because the cert Secrets have the first controlplane's KubeadmConfig as an owner reference. Deleting the first controlplane Machine deleted the KubeadmConfig, which in turn deleted the Secrets, which in turn made it so the 2nd new controlplane Machine had nothing to look up. This is a guess though.

/assign /lifecycle active

rudoi commented 4 years ago

cc @amy

detiber commented 4 years ago

I'm fairly certain this is because the cert Secrets have the first controlplane's KubeadmConfig as an owner reference. Deleting the first controlplane Machine deleted the KubeadmConfig, which in turn deleted the Secrets, which in turn made it so the 2nd new controlplane Machine had nothing to look up. This is a guess though.

Yes, this would likely be the issue... The Secrets should likely be owned by the Cluster rather than the KubeadmConfig.

detiber commented 4 years ago

/priority critical-urgent /milestone v0.1.x

rudoi commented 4 years ago

@detiber excellent, that's what I was thinking. I'll PR it.

chuckha commented 4 years ago

@rudoi @amy and I worked on this issue on friday. Please use her new branch and this will not happen. The owner refs get modified to be owned by the new kubeconfig.

chuckha commented 4 years ago

the ownership gets complicated if CABPK is creating stuff for a machine for a cluster that is then owned by the cluster. The certs are generated by data from the config not from the cluster. This means if you want to keep a cluster around but scale down to 0 control planes without deleting the cluster, you can end up with stale certificates that require manual intervention. If it's owned by the kubeadmconfig then when scaled to 0 the secrets go away.

edit: that should read the certs are generated from the config and the cluster, not just the cluster

ncdc commented 4 years ago

What part can become stale? Aren't we generating certs with 10 year lifespans?

chuckha commented 4 years ago

the serving cert SANs

vincepri commented 4 years ago

Given that the linked PR has been closed, is there anyone working on a fix for this issue?

ncdc commented 4 years ago

I think we should close this. If we need to, we can open a separate issue to debate if the KubeadmConfig or the Cluster should own the Secrets. I think, however, that I agree with Chuck that the way it's implemented with KubeadmConfig as the owner is correct.

chuckha commented 4 years ago

/close

k8s-ci-robot commented 4 years ago

@chuckha: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-bootstrap-provider-kubeadm/issues/274#issuecomment-539631736): >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.