Closed kow3ns closed 5 years ago
@kow3ns similarly to https://github.com/kubernetes-sigs/application/issues/76#issuecomment-435105857, could you capture this in a PR/.md so it's easier to comment on?
Hey @kow3ns, should we abandon #76 and focus on this one? This is a more fleshed-out version of #76 (meaning #76 could be enhanced to become a version of this).
Please consider specifying the behavior for nested Apps: either explicitly say that a nested App will not have a status as a component, or handle the (nasty) implications in the doc.
Admittedly, it is not a core component, but one that the App CRD itself could not pretend not to be aware of :)
I want to propose changes to this issue. They are longer than can reasonably fit in a comment. What is the suggested way to do that? New issue, Google Doc, or PR?
I think a Google Doc is easier to iterate on and comment on.
@fejta-bot: Closing this issue.
Application Status
Objective
The objective of this proposal is to provide a mechanism to aggregate the status of an Application. We propose a mechanism to compute the readiness, availability, errors, and disruptions associated with an Application. As black-box and white-box health monitoring is a complicated topic that deserves its own treatment, we do not address it in this proposal.
Background
.status of a resource to communicate information about its readiness. This allows existing CRDs to opt into the scheme without breaking compatibility with existing tooling and to evolve to use fields. Additionally, it provides a mechanism to provide additional, human-readable information with respect to the status of an Application's components.
Conditions
Conditions are used across the Kubernetes API surface to indicate the condition of a resource as its controller seeks to realize the declared intent in its specification. They are described by the Go struct below. Throughout this proposal we use conditions in conjunction with fields.
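The JSON condition examples used throughout this proposal (type, status, reason, message) suggest a shape along the following lines. This is only a sketch inferred from those examples: the `Condition` type and the `FindCondition` helper are hypothetical names, not the proposal's own struct definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Condition is a hypothetical shape for one entry in status.conditions.
// The field names mirror the JSON keys used in this proposal's examples;
// they are assumptions, not the proposal's struct.
type Condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason,omitempty"`
	Message string `json:"message,omitempty"`
}

// FindCondition returns the first condition of the given type, if any.
func FindCondition(conds []Condition, condType string) (Condition, bool) {
	for _, c := range conds {
		if c.Type == condType {
			return c, true
		}
	}
	return Condition{}, false
}

func main() {
	raw := `[{"type":"Ready","status":"true"},
	         {"type":"Error","status":"true","reason":"Controller Wedged"}]`

	var conds []Condition
	if err := json.Unmarshal([]byte(raw), &conds); err != nil {
		panic(err)
	}
	if c, ok := FindCondition(conds, "Error"); ok {
		fmt.Println(c.Type, c.Status, c.Reason)
	}
}
```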
Readiness
From the perspective of Pods, readiness indicates the ability of the Pod to receive network traffic. It also indicates that the resource, from the perspective of the control loops that act on it, is ready for use. We use the same semantics for the components of an Application. Readiness for an Application implies that all of its components are ready. That is, an Application is ready if and only if all of its components are ready. Components that contain no user-declared desired state (i.e. have no spec, e.g. ConfigMaps and Secrets) are always ready post creation.
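The aggregation rule in the paragraph above can be sketched in a few lines of Go. The `Component` type and its fields are hypothetical simplifications introduced for illustration; treating an Application with no components as ready is an assumption of this sketch, not something the proposal states.

```go
package main

import "fmt"

// Component is a hypothetical, simplified view of one Application component.
type Component struct {
	Name    string
	HasSpec bool // false for kinds with no declared desired state, e.g. ConfigMap
	Ready   bool // readiness as reported by the component itself
}

// componentReady applies the rule that spec-less components are always ready.
func componentReady(c Component) bool {
	return !c.HasSpec || c.Ready
}

// ApplicationReady is true if and only if every component is ready.
// An empty component list yields true; that choice is an assumption.
func ApplicationReady(components []Component) bool {
	for _, c := range components {
		if !componentReady(c) {
			return false
		}
	}
	return true
}

func main() {
	components := []Component{
		{Name: "config", HasSpec: false},
		{Name: "web", HasSpec: true, Ready: true},
		{Name: "db", HasSpec: true, Ready: false},
	}
	// The db component is not ready, so the Application is not ready.
	fmt.Println(ApplicationReady(components))
}
```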
A resource indicates that it is ready by setting status.ready=true or by including a condition like {"type":"Ready","status":"true"} in the resource's status.conditions.
A resource indicates that it is not ready by setting status.ready=false or by including a condition like {"type":"Ready","status":"false"} in the resource's status.conditions.
If the field and the condition disagree (e.g. setting status.ready=true and including a condition like {"type":"Ready","status":"false"}) the value of the field takes precedence.
Setting the field and also including a condition (e.g. setting status.ready=true and including a condition like {"type":"Ready","status":"false","message":"RDBMs is ready for use."}) provides a user-readable message about the readiness of the resource.
Availability
From the perspective of Deployments, and many out-of-tree resources, availability indicates that all Pods related to the resource remain ready after some configurable duration. This notion is useful for application components in general as an indication that the resource is unlikely to fall victim to infant mortality after creation or mutation. We use availability for an Application's components in this context. As availability is not applicable to all components, it is not aggregated for an Application.
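Readiness and availability can each be signalled either by a status field or by a condition of the matching type, with the field taking precedence when both are present. A minimal sketch of that resolution follows; the `Resolve` helper, its pointer-for-unset convention, and the case-insensitive comparison of the condition status are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Condition is a hypothetical, trimmed-down entry of status.conditions.
type Condition struct {
	Type   string
	Status string
}

// Resolve returns the effective value of a signal such as readiness or
// availability, and whether the resource reported it at all.
// A non-nil field always wins over a condition of the same type.
func Resolve(field *bool, conds []Condition, condType string) (value, reported bool) {
	if field != nil {
		return *field, true
	}
	for _, c := range conds {
		if c.Type == condType {
			return strings.EqualFold(c.Status, "true"), true
		}
	}
	return false, false
}

func main() {
	ready := true // status.ready=true
	conds := []Condition{
		{Type: "Ready", Status: "false"},    // contradicts the field; the field wins
		{Type: "Available", Status: "true"}, // no status.available field is set
	}

	r, _ := Resolve(&ready, conds, "Ready")
	a, _ := Resolve(nil, conds, "Available")
	fmt.Println(r, a)
}
```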
A resource indicates that it is available by setting status.available=true or by including a condition like {"type":"Available","status":"true"} in the resource's status.conditions.
A resource indicates that it is not available by setting status.available=false or by including a condition like {"type":"Available","status":"false"} in the resource's status.conditions.
If the field and the condition disagree (e.g. setting status.available=true and including a condition like {"type":"Available","status":"false"}) the value of the field takes precedence.
Observation
Kubernetes control loops communicate that they have observed modifications of the declared desired state contained in a resource by setting its status.observedGeneration to its generation. A resource for which this is true is said to be observed.
Controllers set status.observedGeneration to the value of metadata.generation to communicate that they have observed the creation of, or a mutation to, a resource they control.
Progress
Kubernetes control loops use various methods to communicate that reconciliation between a resource's specification and the observed state of the system is progressing. Deployments, and many non-core resources, communicate this using the Progressing condition. For an Application, progressing components indicate that the application is updating.
A resource indicates progress by setting status.progress to true or to a non-negative 32-bit floating point number in [0,100] (e.g. status.progress=true or status.progress=99.9).
A resource may also indicate progress by including a condition like {"type":"Progressing","status":"true"} in the resource's status.conditions.
If the field and the condition disagree (e.g. setting status.progress=true and including a condition like {"type":"Progressing","status":"false"}) the value of the field takes precedence.
Disruptions
Application components may be affected by planned or unplanned disruptions. For instance, the destruction of a Node may disrupt many replicated Pod sets. The application controller MAY use other resources, e.g. PodDisruptionBudgets, to add a Disruption condition to a resource's status, and the controller for a resource MAY communicate this directly by adding such a condition.
A disruption is communicated by a condition like {"type":"Disruption","status":"true","reason":"Node unavailable","message":"Auto-scaling in progress"}.
Errors
At any point in their lifetime controllers may encounter errors when realizing the declared intent of the user. These are communicated using the status.conditions of the resource.
An error is communicated by a condition like {"type":"Error","status":"true","reason":"Controller Wedged","message":"SharedInformer sync failing."}.
Compatibility Requirements
Resources and controllers that wish to be compatible with the Application Controller status computation need only implement the following.
A resource without a .spec contains no declarative intent. It is always ready.
.status field.
The .status field MUST indicate readiness.
The .status field MAY indicate availability.
The .status field SHOULD indicate observation by the controller.
The .status field MAY contain conditions.
status.conditions field.
status.conditions field.
Core Resource Adaptations
The core resources do not all conform to the schema above. In the future, we may modify them to do so. For the time being, the following describes how the Application controller will compute the status of these resources.
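Several of the rules in this section share two shapes: a set of replica counts that must all be equal and greater than zero, and an observed generation that must match the resource's generation. The following sketch shows both checks. `ReplicasReady` and `Observed` are hypothetical helper names, and the plain integer parameters stand in for the corresponding fields of the real resource types.

```go
package main

import "fmt"

// ReplicasReady reports whether the desired, current, and ready replica
// counts are all equal and greater than zero.
func ReplicasReady(specReplicas, statusReplicas, readyReplicas int32) bool {
	return specReplicas > 0 &&
		specReplicas == statusReplicas &&
		statusReplicas == readyReplicas
}

// Observed reports whether the controller has seen the latest generation
// of the resource.
func Observed(generation, observedGeneration int64) bool {
	return generation == observedGeneration
}

func main() {
	fmt.Println(ReplicasReady(3, 3, 3)) // all counts match
	fmt.Println(ReplicasReady(3, 3, 2)) // one replica is not ready yet
	fmt.Println(ReplicasReady(0, 0, 0)) // scaled to zero is not ready
	fmt.Println(Observed(5, 4))         // controller is one generation behind
}
```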
.spec.replicas is equal to .status.readyReplicas is equal to .status.replicas and all are greater than zero.
.status.conditions contains an Available condition.
.status.conditions contains a Progressing condition.
.status.observedGeneration is equal to metadata.generation.
Failure conditions are converted to Error conditions.

.spec.replicas is equal to .status.replicas and .status.readyReplicas and all are greater than zero.
.spec.replicas is equal to .status.replicas and .status.availableReplicas and all are greater than zero.
.spec.replicas is not equal to .status.replicas.
.status.observedGeneration is equal to metadata.generation.
ReplicaFailure conditions will be converted to Error conditions.

.spec.replicas is equal to .status.replicas and .status.readyReplicas and all are greater than zero.
.status.currentReplicas is not equal to .status.updatedReplicas or if .status.replicas is not equal to .spec.replicas.
.status.observedGeneration is equal to metadata.generation.

.status.currentNumberScheduled is equal to .status.desiredNumberScheduled and .status.numberReady and all are greater than 0.
.status.currentNumberScheduled is equal to .status.desiredNumberScheduled and .status.numberAvailable and all are greater than 0.
.status.updatedNumberScheduled is greater than 0.
.status.observedGeneration is equal to metadata.generation.

Ready condition.
.status.loadBalancer.ingress list is not empty.
.status.loadBalancer.ingress is not empty.
Ready when its .status.phase is Bound. This may seem strange as PVC implements a Ready phase, but a PVC is not useful to the application until it is bound, and most errors occur post creation and during binding.
API
This section contains the proposed modifications to the API. Here, we modify the ApplicationStatus type to report the observed status of its components. Each ComponentStatus contains a link, resource identifying information, and the ComponentConditions of the components indicated by the Application's .Spec.ComponentKinds and .Spec.Selector. The status of the application's components is used to compute ApplicationConditions that apply to the application as a whole.
Application Status Computation
The Application controller will periodically list the applications residing on the API Server. For each Application resource the controller will do the following.
If the spec.assemblyPhase of the Application is pending the controller will not update the .status of the Application. This allows application installers time to apply all necessary components prior to application status computation.
The controller lists the components of the kinds in .spec.componentKinds.
The observed status of each component is recorded in the .status.components of the Application.
If any component is Progressing the status.updating field is set to true.
If all components are ready the status.ready field is set to true.
Otherwise the status.ready field is set to false.
Example
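As a rough illustration of the per-Application steps above, the following Go sketch skips Applications whose assembly is still pending, records the observed component statuses, and derives the ready and updating flags. The `Application` and `ComponentStatus` types here are hypothetical, flattened stand-ins for the real API types, and the phase value "Pending" is an assumption; this is not the controller's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// ComponentStatus is a hypothetical, simplified status of one component.
type ComponentStatus struct {
	Name        string
	Ready       bool
	Progressing bool
}

// Application is a hypothetical, flattened stand-in for the Application
// resource: one spec field and the status fields discussed in this proposal.
type Application struct {
	AssemblyPhase string            // mirrors spec.assemblyPhase
	Components    []ComponentStatus // mirrors status.components
	Ready         bool              // mirrors status.ready
	Updating      bool              // mirrors status.updating
}

// ComputeStatus updates app from the observed component statuses.
// It returns false and leaves app untouched while assembly is pending.
func ComputeStatus(app *Application, observed []ComponentStatus) bool {
	if strings.EqualFold(app.AssemblyPhase, "Pending") {
		return false
	}
	ready, updating := true, false
	for _, c := range observed {
		if !c.Ready {
			ready = false
		}
		if c.Progressing {
			updating = true
		}
	}
	app.Components = observed
	app.Ready = ready
	app.Updating = updating
	return true
}

func main() {
	app := &Application{AssemblyPhase: "Succeeded"}
	observed := []ComponentStatus{
		{Name: "web", Ready: true},
		{Name: "db", Ready: false, Progressing: true},
	}
	updated := ComputeStatus(app, observed)
	fmt.Println(updated, app.Ready, app.Updating)
}
```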