kubernetes-retired / cluster-api-bootstrap-provider-kubeadm

LEGACY REPO. NEW CODE IS https://github.com/kubernetes-sigs/cluster-api/tree/master/bootstrap/kubeadm
Apache License 2.0

🐛 Fix init lock #232

Closed · accepting closed this pull request 4 years ago

accepting commented 4 years ago

Signed-off-by: JunLi lijun.git@gmail.com

What this PR does / why we need it: Fix a bug in the init lock.

Consider the following case: two control plane machines (call them C1 and C2) are being created concurrently. Assume C1 acquires the init lock first. Since C2 fails to get the lock, it exits the reconcile loop and releases the init lock. Before C1 finishes initializing itself, we create another control plane machine (call it C3). C3 will then acquire the init lock successfully and also generate a control-plane init configuration, which is exactly what we do not want to see.
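For illustration, here is a minimal sketch of the flow described above. The `InitLocker` interface and the controller shape are simplified stand-ins rather than the actual CABPK code; the point is that the exit path releases the lock without checking which machine holds it.

```go
package sketch

import (
	"context"
	"errors"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha2"
)

// InitLocker is a simplified stand-in for CABPK's init-lock helper.
type InitLocker interface {
	Lock(ctx context.Context, cluster *clusterv1.Cluster, machine *clusterv1.Machine) bool
	Unlock(ctx context.Context, cluster *clusterv1.Cluster) bool
}

// reconcileControlPlaneConfig sketches the buggy flow: the deferred release
// fires on any error, even when the erroring machine (C2) is not the holder.
func reconcileControlPlaneConfig(ctx context.Context, lock InitLocker,
	cluster *clusterv1.Cluster, machine *clusterv1.Machine) (rerr error) {

	defer func() {
		// Exit path: releases the init lock on error without checking
		// whether this machine is the one that acquired it.
		if rerr != nil {
			lock.Unlock(ctx, cluster)
		}
	}()

	if !lock.Lock(ctx, cluster, machine) {
		// C2 ends up here while C1 holds the lock; the deferred Unlock above
		// then deletes C1's lock, so a later machine (C3) can acquire it
		// while C1 is still initializing.
		return errors.New("control plane init already in progress")
	}

	// ... generate the kubeadm init configuration / bootstrap data for C1 ...
	return nil
}
```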

k8s-ci-robot commented 4 years ago

Welcome @accepting!

It looks like this is your first PR to kubernetes-sigs/cluster-api-bootstrap-provider-kubeadm 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-bootstrap-provider-kubeadm has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot commented 4 years ago

Hi @accepting. Thanks for your PR.

I'm waiting for a kubernetes-sigs or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

chuckha commented 4 years ago

The second control plane can't release the lock that the first control plane is holding. If it does release that lock, that is a bug.

vincepri commented 4 years ago

Reading this code https://github.com/kubernetes-sigs/cluster-api-bootstrap-provider-kubeadm/blob/master/internal/locking/control_plane_init_mutex.go#L100-L126 it seems like we release the lock without checking if the Machine that has the lock is the same as the current one?
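For reference, an ownership check on release could look roughly like the sketch below. The ConfigMap name ({cluster-name}-lock), the lock-information key, and the lockInfo shape are taken from this thread and are assumptions, not the verbatim control_plane_init_mutex.go code.

```go
package sketch

import (
	"context"
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha2"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// lockInfo mirrors what the thread describes: the lock ConfigMap records the
// name of the machine that acquired the init lock.
type lockInfo struct {
	MachineName string `json:"machineName"`
}

// unlockIfOwner deletes the init-lock ConfigMap only when the requesting
// machine is the recorded holder, so a machine that failed to acquire the
// lock can no longer delete someone else's lock.
func unlockIfOwner(ctx context.Context, c client.Client,
	cluster *clusterv1.Cluster, machine *clusterv1.Machine) (bool, error) {

	key := client.ObjectKey{Namespace: cluster.Namespace, Name: cluster.Name + "-lock"}
	cm := &corev1.ConfigMap{}
	if err := c.Get(ctx, key, cm); err != nil {
		if apierrors.IsNotFound(err) {
			return true, nil // nothing to release
		}
		return false, err
	}

	info := lockInfo{}
	if err := json.Unmarshal([]byte(cm.Data["lock-information"]), &info); err != nil {
		return false, err
	}
	if info.MachineName != machine.Name {
		return false, nil // another machine holds the lock; leave it alone
	}
	return true, c.Delete(ctx, cm)
}
```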

vincepri commented 4 years ago

Which is something we do here https://github.com/kubernetes-sigs/cluster-api-bootstrap-provider-kubeadm/blob/master/internal/locking/control_plane_init_mutex.go#L65-L73

chuckha commented 4 years ago

@vincepri !!! yeah that would be a bug 😞

accepting commented 4 years ago

@vincepri @chuckha I don't think it's a good idea that only the owner machine can release the lock. As far as I know, the lock should be held from the time it is first acquired until cluster.Status.ControlPlaneInitialized turns true. Therefore, the lock owner may never get a chance to release it.

chuckha commented 4 years ago

In the easy case when there is simply one control plane machine:

  1. The control plane machine acquires the lock.
  2. The control plane machine does not error out and sets its bootstrap data; the lock is still held.
  3. Some time passes and the control plane becomes initialized, at which point the KubeadmConfig will be reconciled again (due to the cluster watch).
  4. The control plane machine returns early because it already has bootstrap data set.

Now a new bootstrap config is created for a worker machine.

  1. The control plane is initialized, so the lock is released. Not that it matters at this point.

Let's say there are two control planes and two managers. This situation should never happen since the deployment uses leader election to ensure only a single manager is running. But let's say this does happen.

The case you originally outlined indeed holds true. C1 acquires the lock. C2 attempts to acquire the lock, cannot, errors out, and thus removes the lock. C3 is created and acquires the lock before C1 finishes initializing. Now you have two clusters. Very bad.

What if we never released the lock in the successful case?

I think one clean way to do this is to acquire the lock and never unlock it. The only time we want to unlock the lock is if we hit an error condition after the lock is created but before the bootstrap data is generated.

This simplifies the code a bit. This change is an improvement but we should still delete the Unlock code on line 265.
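A rough sketch of that shape, reusing the hypothetical `InitLocker` and imports from the earlier sketch (illustrative only, not the actual controller code):

```go
// initControlPlaneConfig acquires the init lock and never releases it on
// success; the only release is the error path between acquiring the lock and
// generating the bootstrap data.
func initControlPlaneConfig(ctx context.Context, lock InitLocker,
	cluster *clusterv1.Cluster, machine *clusterv1.Machine) (rerr error) {

	if !lock.Lock(ctx, cluster, machine) {
		// Another machine is already initializing: exit without ever touching
		// the lock (the real controller would requeue here).
		return nil
	}

	defer func() {
		// This defer is registered only after a successful Lock, so only the
		// actual holder can hit this path, and only when it errors out before
		// its bootstrap data exists.
		if rerr != nil {
			lock.Unlock(ctx, cluster)
		}
	}()

	// ... generate the kubeadm init configuration / bootstrap data ...
	return nil
}
```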

Interested to hear your thoughts @accepting

accepting commented 4 years ago

@chuckha In my opinion, the right place to release the lock is where cluster.Status.ControlPlaneInitialized is set to true in the cluster controller. If we cannot make the cluster controller do that work, the Unlock code on line 265 may be the most appropriate place in CABPK.

The only downside I can see is that the Unlock code on line 265 will be invoked every time a new bootstrap config is created afterwards, which is really unnecessary.

Never releasing the lock in the successful case would avoid the downside above, but it can lead to other problems. For example (see the sketch after this list):

  1. Create a cluster with only one control plane machine; call it C1.
  2. C1 acquires the init lock successfully and holds it forever. That means the ConfigMap {cluster-namespace}/{cluster-name}-lock always exists, and it contains C1's machine name.
  3. Some days later, we discover problems in this cluster and want to rebuild it. We delete the cluster and all of its machines.
  4. We recreate the cluster with the same cluster name but different control plane machines.
  5. Create control plane machine NC1 and its bootstrap config. NC1 tries to acquire the init lock. Since the ConfigMap {cluster-namespace}/{cluster-name}-lock still exists and records C1's machine name, which does not equal NC1's machine name, NC1 fails to acquire the init lock. Hence, NC1 can never generate bootstrap data.
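
For context, the acquire side that this scenario trips over might look roughly like the following sketch, reusing the hypothetical lockInfo type and imports from the earlier release sketch (the ConfigMap name and lock-information key are assumptions from this thread):

```go
// tryAcquireInitLock sketches why NC1 stays blocked: the lock ConfigMap
// survives with C1's machine name in it, and a different machine name never
// matches, so the acquire always fails.
func tryAcquireInitLock(ctx context.Context, c client.Client,
	cluster *clusterv1.Cluster, machine *clusterv1.Machine) (bool, error) {

	key := client.ObjectKey{Namespace: cluster.Namespace, Name: cluster.Name + "-lock"}
	cm := &corev1.ConfigMap{}
	err := c.Get(ctx, key, cm)
	if apierrors.IsNotFound(err) {
		// No lock yet: create the ConfigMap recording machine.Name (omitted).
		return true, nil
	}
	if err != nil {
		return false, err
	}

	info := lockInfo{}
	if err := json.Unmarshal([]byte(cm.Data["lock-information"]), &info); err != nil {
		return false, err
	}
	// NC1 fails here: the stale ConfigMap still records C1's machine name.
	return info.MachineName == machine.Name, nil
}
```
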
chuckha commented 4 years ago

@accepting the config map has an owner reference so it will be deleted when we delete the owning machine

chuckha commented 4 years ago

Another way this could work:

If the bootstrap config already has bootstrap data, the bootstrap config is holding the lock, and cluster.Status.ControlPlaneInitialized is true, then we can delete the lock at that point.
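A minimal sketch of that condition, reusing the hypothetical `InitLocker` from the earlier sketch; parameter names are assumptions rather than CABPK's actual API, while ControlPlaneInitialized is the cluster status field mentioned in this thread:

```go
// releaseAfterInit deletes the init lock only once the config holding it has
// produced bootstrap data and the cluster reports an initialized control plane.
func releaseAfterInit(ctx context.Context, lock InitLocker,
	cluster *clusterv1.Cluster, hasBootstrapData, holdsLock bool) {

	if hasBootstrapData && holdsLock && cluster.Status.ControlPlaneInitialized {
		lock.Unlock(ctx, cluster)
	}
}
```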

accepting commented 4 years ago

@chuckha You are right, I missed that. But the owner reference is currently set to the cluster; we may need to change it to the related machine.
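Something like the following sketch would do that (field values and the group/version are assumptions; this is not the actual CABPK change):

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha2"
)

// setMachineOwner points the lock ConfigMap's owner reference at the Machine
// that acquired it (instead of the Cluster), so deleting that Machine
// garbage-collects a stale lock as well.
func setMachineOwner(cm *corev1.ConfigMap, machine *clusterv1.Machine) {
	cm.OwnerReferences = []metav1.OwnerReference{{
		APIVersion: clusterv1.GroupVersion.String(),
		Kind:       "Machine",
		Name:       machine.Name,
		UID:        machine.UID,
	}}
}
```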

k8s-ci-robot commented 4 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: accepting, chuckha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes-sigs/cluster-api-bootstrap-provider-kubeadm/blob/master/OWNERS)~~ [chuckha]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.