kubernetes-retired / cluster-api-bootstrap-provider-kubeadm

LEGACY REPO. NEW CODE IS https://github.com/kubernetes-sigs/cluster-api/tree/master/bootstrap/kubeadm
Apache License 2.0

bootstrap token can be expired even before infra exists #248

Closed rudoi closed 5 years ago

rudoi commented 5 years ago

/kind bug

What steps did you take and what happened:

What did you expect to happen:

Successful kubeadm join.

Anything else you would like to add:

This problem is mitigated when the concurrency flags are set high, but we may want to have a mechanism for extending the life of a token while machines are still provisioning.

When the MachineSet controller duplicates the templates, all of the KubeadmConfigs are created at once (meaning each token is already created and ticking away), but if Machine reconciliation isn't fast enough (slow EC2, concurrency set to 1, etc.), many of those tokens can be expired by the time the nodes attempt to join.

cc @sethp-nr

Environment:

CAPI / CAPA / CABPK

detiber commented 5 years ago

/priority important-soon
/milestone v0.4.x

k8s-ci-robot commented 5 years ago

@detiber: The provided milestone is not valid for this repository. Milestones in this repository: [Next, v0.1.x, v0.2.x]

Use /milestone clear to clear the milestone.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/1433#issuecomment-534213042):

> /priority important-soon
> /milestone v0.4.x

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

detiber commented 5 years ago

/milestone v0.2.x

ncdc commented 5 years ago

Some thoughts on options:

  1. Delay cloning the bootstrap and infra templates until the machine reconciler is ready to work on that machine (this would mean moving the cloning from machineset reconciler to machine reconciler)
  2. Increase the lifespan of the bootstrap token (by default and/or make it configurable)
  3. Increase parallelism, as you indicated
sethp-nr commented 5 years ago

In CAPA it seems that deletes are blocking (it looked like it didn't reconcile any other AWSMachines while the delete was ongoing), which meant we ran into this often on `replicas: 12`, `maxSurge: 100%`-style rolling deployments. Turning up the concurrency has largely worked around this case, though.

What's proving a little stickier is AWS-side limits, etc., that have indefinite resolution timelines: if I ask for more m5.4xlarges than my quota says I can have, there's currently a 10-minute clock for me to 1) notice, 2) file the limit increase request, and 3) have the limit increase approved and applied to my account. When that window closes, I manually "retry" the machine by deleting it and letting the deployment re-create it.

sethp-nr commented 5 years ago

@ncdc IMO it would be swell if capbk could notice the machine was still provisioning and keep pushing out the token expiry timestamp; I think that's the only option where I'm not racing AWS quotas against a clock somewhere.

ncdc commented 5 years ago

@sethp-nr that's definitely an option too

detiber commented 5 years ago

> @ncdc IMO it would be swell if capbk could notice the machine was still provisioning and keep pushing out the token expiry timestamp; I think that's the only option where I'm not racing AWS quotas against a clock somewhere.

I definitely like the idea of this approach as well, since it helps reduce the edge cases.

But I definitely think we could benefit from:

  • Delay cloning the bootstrap and infra templates until the machine reconciler is ready to work on that machine (this would mean moving the cloning from machineset reconciler to machine reconciler)
vincepri commented 5 years ago

Pushing out the token expiration in CABPK seems like a good approach to me; it sounds like a less invasive change than moving the cloning logic from one controller to another.

detiber commented 5 years ago

I think the combination approach of both is probably best. From a security standpoint we want to limit the lifetime of the bootstrap tokens as much as possible.

sethp-nr commented 5 years ago

I'm not sure I understand the "cloning delay" option – in this case, the long pole is the infrastructure provider. How would the Machine controller be signaled by the AWSMachine controller that it was ready to work?

ncdc commented 5 years ago

@sethp-nr it's less that the capi machine controller gets signaled by the infra machine controller and more that the capi machine controller clones from the bootstrap and infra templates when it sees a machine "in this state". The state being "machine came from a machine set and the templates haven't been cloned yet". We'd need to work out the exact details, but this is the hand-waving version 😄
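A minimal sketch of that hand-waving version, with hypothetical names (the real controllers work with object references, not booleans):

```go
package main

import "fmt"

// shouldCloneTemplates captures the state check described above: the Machine
// controller clones the bootstrap and infra templates only for a
// MachineSet-owned Machine whose references have not been filled in yet.
func shouldCloneTemplates(ownedByMachineSet, bootstrapRefSet, infraRefSet bool) bool {
	return ownedByMachineSet && (!bootstrapRefSet || !infraRefSet)
}

func main() {
	fmt.Println(shouldCloneTemplates(true, false, false))  // fresh MachineSet Machine: clone now
	fmt.Println(shouldCloneTemplates(true, true, true))    // templates already cloned: nothing to do
	fmt.Println(shouldCloneTemplates(false, false, false)) // standalone Machine: user supplies the configs
}
```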

sethp-nr commented 5 years ago

Oh, I think I understand. Right now the MachineSet controller is generating the bootstrap configs "all at once" even if the Machine controller is bogged down somewhere and can't get to them.

That said, I'm not sure if it helps my case very much – the Machine controller is fine, I think, so I would expect it to reconcile the bootstrap config at about the same time the MachineSet currently does. I'm pretty sure I saw the Machine go through lots of reconcile cycles looking to set a NodeRef before the AWSMachine got its "quota exceeded" error.

We're getting bit by this with some frequency ("rapid iteration" yay) so I was planning to pick up attempting to refresh the token TTL – it sounds like that's at least a part of the desired state.

sethp-nr commented 5 years ago

/assign
/lifecycle active

ncdc commented 5 years ago

Should I transfer this to the CABPK repo in that case?

sethp-nr commented 5 years ago

Oh, sure – or I could file an issue over there to assign/mark as active if we want to leave the discussion on template cloning open.

ncdc commented 5 years ago

I'll move this one and we can open a new one here if we want to discuss moving cloning around.

chuckha commented 5 years ago

/priority important-soon
/milestone v0.1.x

chuckha commented 5 years ago

/lifecycle active