replicatedhq / kots

KOTS provides the framework, tools and integrations that enable the delivery and management of 3rd-party Kubernetes applications, a.k.a. Kubernetes Off-The-Shelf (KOTS) Software.
https://kots.io
Apache License 2.0

Application Status shows Ready while upgrade hasn't completed #885

Open MikaelSmith opened 4 years ago

MikaelSmith commented 4 years ago

When I upgrade an application, the Status never changes from Ready. Since there's always a valid running instance, this is kind of true. But when an upgrade fails, for example because the new pod keeps erroring:

NAME                         READY   STATUS    RESTARTS   AGE
pod/cd4pe-57856bd58c-dbfdn   0/1     Running   308        22h
pod/cd4pe-7859f9d75d-n272d   1/1     Running   2          43h

there's no sign in the console that the upgrade hasn't completed.

I know there's a Degraded status, and I'm confused why that isn't shown during an upgrade. Ideally, I'd like some representation that an upgrade is in progress if it hasn't finished rolling out changes.

emosbaugh commented 4 years ago

I created #886 to highlight the issue. We may want to think about whether this is the desired status since the app should still be functional or if we should add another state.

marccampbell commented 4 years ago

I think we need an upgrading or other state to indicate this. When an application is being upgraded and there are some pods updated and some not, the application is not necessarily degraded and this feels like a state that could be confusing to end customers.

In this case, the failed upgrade was not surfaced at all in the UI. I think we need to do this, but I wonder if it's really degraded, or normally a transient state. In your example here, it's definitely a deeper problem because the upgrade appears to have failed.

What if we set the state to a new string, such as "updating", when upgradedreplicas < readyreplicas, but with a timeout (probably configurable) after which the application falls back to an error state if not all ready replicas are on the correct version?

emosbaugh commented 4 years ago

@MikaelSmith are you using a pod status informer or something controlling the pod, like a deployment? If the latter, what is shown if you get the deployment? I would assume it's just not up-to-date, correct?

$ kubectl -n namespace2 t deploy nginx-test
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
nginx-test   3/2     1            3           26m

MikaelSmith commented 4 years ago

The same upgrading vs. degraded question comes up during the initial rollout. I've had some feedback that "degraded" seems misleading when the application is just doing its initial startup.

MikaelSmith commented 4 years ago

I'm not sure what that command is; kubectl t deploy <deployment> doesn't work for me. I guess it's kubectl get deployment <deployment>. I'll have to set up something with an update, but it probably is what you showed.

emosbaugh commented 4 years ago

Sorry, that was supposed to be kubectl get deploy.

MikaelSmith commented 4 years ago

Yeah

NAME    READY   UP-TO-DATE   AVAILABLE   AGE
cd4pe   3/3     1            3           18d

shows 1 up-to-date rather than 3. It actually shows 1 up-to-date even though it's not ready, which is interesting.