kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

KCP reporting stale information #2771

Closed yastij closed 4 years ago

yastij commented 4 years ago

What steps did you take and what happened:

when deploying a KCP with 3 replicas, the controller does reconcile everything and generates the machines, etc., but after running

kubectl get kcp

the output is the following

NAME            READY   INITIALIZED   REPLICAS   READY REPLICAS   UPDATED REPLICAS   UNAVAILABLE REPLICAS 
fabio-scale-1   true    true          3          2                3                  1 
fabio-scale-2   true    true          3          2                3                  1 
fabio-scale-3   true    true          3          3                3                   
fabio-scale-4   true    true          3          2                3                  1 

and it gets stuck in this state indefinitely, or for a very long time
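To make the staleness easier to spot, the reported status can be cross-checked against the underlying objects with something like the following (object and secret names are taken from or inferred from the output above, and the labels are the standard Cluster API ones as I understand them; treat this as a sketch and adjust for your environment):

    # Raw KCP status as stored in the management cluster
    kubectl get kcp fabio-scale-1 -o jsonpath='{.status}'

    # Control plane Machines belonging to that cluster
    kubectl get machines -l cluster.x-k8s.io/cluster-name=fabio-scale-1,cluster.x-k8s.io/control-plane

    # Node readiness as seen from the workload cluster itself
    kubectl get secret fabio-scale-1-kubeconfig -o jsonpath='{.data.value}' | base64 -d > workload.kubeconfig
    kubectl --kubeconfig workload.kubeconfig get nodes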

Another problem is that when creating a KCP that fails for some reason, we get the following output:

NAMESPACE    NAME                              READY   INITIALIZED   REPLICAS   READY REPLICAS   UPDATED REPLICAS   UNAVAILABLE REPLICAS
default      test-1                            true    true          2          1                2                  1

There are a couple of issues with this:

What did you expect to happen:

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

/kind bug

detiber commented 4 years ago

> KCP is marked as ready; this leads me to think that my control plane did finish bootstrapping and is ready, as it's marked.

Ready is meant to indicate that the control plane API server is available and ready to receive requests. It is meant to be a higher-fidelity version of what we currently use Cluster.Status.ControlPlaneInitialized for: unblocking actions against the API server (i.e. allowing worker Machines to be created).

If we change the behavior of Ready, we will need to consider what to use to unblock actions against the API server once we stop supporting individual Machine-based management of control plane instances without a control plane provider.
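To make the distinction concrete, the two signals can be read directly from the objects; the field paths below are my reading of the v1alpha3 types, and the object names are placeholders:

    # Cluster-level flag currently used to unblock worker Machine creation
    kubectl get cluster my-cluster -o jsonpath='{.status.controlPlaneInitialized}'

    # KCP-level flag indicating the control plane API server is available and ready
    kubectl get kcp my-cluster-control-plane -o jsonpath='{.status.ready}'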

wfernandes commented 4 years ago

I noticed the stale information as well.

  1. I created a single master/single worker cluster on AWS.
  2. In the KubeadmControlPlane, I edited etcd.Local.ImageTag and noticed the cluster upgrading (an equivalent patch is sketched at the end of this comment).
  3. After the upgrade was finished, I saw the following.
    $ kubectl get machines -A
    NAMESPACE   NAME                            PROVIDERID                    PHASE
    default     wff-test-control-plane-nzvvz    aws:////i-stuff4b2   Running
    default     wff-test-md-0-f5b46f98b-wfpqt   aws:////i-stuffa88   Running
    $ kubectl get kubeadmcontrolplanes
    NAME                     READY   INITIALIZED   REPLICAS   READY REPLICAS   UPDATED REPLICAS   UNAVAILABLE REPLICAS
    wff-test-control-plane           true          1                           1                  1

I was expecting to see nothing in UNAVAILABLE REPLICAS, but even after a while it remains the same without being updated.
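For reference, the edit in step 2 maps to spec.kubeadmConfigSpec.clusterConfiguration.etcd.local.imageTag on the KubeadmControlPlane. A rough patch equivalent (the image tag here is only an example value, not necessarily the one used above):

    kubectl patch kubeadmcontrolplane wff-test-control-plane --type=merge \
      -p '{"spec":{"kubeadmConfigSpec":{"clusterConfiguration":{"etcd":{"local":{"imageTag":"3.4.3-0"}}}}}}'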

vincepri commented 4 years ago

We're revisiting these concepts with conditions. While the current ready behavior won't change, I think this can probably be closed.

What do you think?

vincepri commented 4 years ago

Closing this for now given that we're now subscribing to the workload cluster's node events. There are some improvements that can be done separately, although they're going to be tracked in a new issue @detiber is going to open.

/close

k8s-ci-robot commented 4 years ago

@vincepri: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/2771#issuecomment-667188641):

> Closing this for now given that we're now subscribing to the workload cluster's node events. There are some improvements that can be done separately, although they're going to be tracked in a new issue @detiber is going to open.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.