Open jdharmon opened 3 years ago
I was wondering if/when this would ever come back to bite me... https://github.com/target/pod-reaper/issues/37 Long story short: the "status" listed by kubectl is neither the pod phase nor the reason. It's actually something different, something not addressed directly in the API from what I saw.
I suspect that Running isn't actually a "pod status" but actually a "container status". I 100% agree, it's confusing as hell!
// ContainerState holds a possible state of container.
// Only one of its members may be specified.
// If none of them is specified, the default one is ContainerStateWaiting.
type ContainerState struct {
	// Details about a waiting container
	// +optional
	Waiting *ContainerStateWaiting `json:"waiting,omitempty" protobuf:"bytes,1,opt,name=waiting"`
	// Details about a running container
	// +optional
	Running *ContainerStateRunning `json:"running,omitempty" protobuf:"bytes,2,opt,name=running"`
	// Details about a terminated container
	// +optional
	Terminated *ContainerStateTerminated `json:"terminated,omitempty" protobuf:"bytes,3,opt,name=terminated"`
}
There isn't anything that detects the "Running" container state right now, and I think changing the "container status" rule would probably result in behavior that isn't intended: the pod would be flagged for reaping if ANY of the containers in it were "Running", just like the rule currently flags a pod if ANY container in it is "terminated" or "waiting"... ugh...
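To make the ANY-container concern concrete, here is a minimal Go sketch. The types are local stand-ins for the struct quoted above, reduced to what the example needs, and `stateName` and `anyContainerIn` are hypothetical helpers. This is not pod-reaper's actual rule code.

```go
package main

import "fmt"

// Local stand-ins for the Kubernetes types quoted above, reduced to the
// fields this sketch needs. They are NOT the real k8s.io/api/core/v1 types.
type ContainerStateWaiting struct{ Reason string }
type ContainerStateRunning struct{}
type ContainerStateTerminated struct{ Reason string }

type ContainerState struct {
	Waiting    *ContainerStateWaiting
	Running    *ContainerStateRunning
	Terminated *ContainerStateTerminated
}

// stateName reports which member is set. Per the struct's doc comment,
// a ContainerState with no member set defaults to waiting.
func stateName(s ContainerState) string {
	switch {
	case s.Running != nil:
		return "running"
	case s.Terminated != nil:
		return "terminated"
	default:
		return "waiting"
	}
}

// anyContainerIn (hypothetical helper) is true if ANY container is in the
// wanted state, which is the matching behavior described in the comment above.
func anyContainerIn(states []ContainerState, want string) bool {
	for _, s := range states {
		if stateName(s) == want {
			return true
		}
	}
	return false
}

func main() {
	// A pod whose main container is running while a sidecar has already exited.
	pod := []ContainerState{
		{Running: &ContainerStateRunning{}},
		{Terminated: &ContainerStateTerminated{Reason: "Completed"}},
	}
	fmt.Println(anyContainerIn(pod, "running"))    // true
	fmt.Println(anyContainerIn(pod, "terminated")) // true
}
```

With ANY semantics, the same pod matches both "running" and "terminated", so a "Running" container rule would flag pods that most people would not describe as simply running.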
Perhaps the way forward is two separate settings: POD_STATUS_REASON and POD_STATUS_PHASE.
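A rough Go sketch of how two separate rules might look, assuming each variable holds a comma-separated list and that a pod is only reaped when every configured rule matches. `PodStatus`, `parseList`, and `shouldReap` are hypothetical names for illustration, and the strings passed to `parseList` stand in for what a real implementation would presumably read with `os.Getenv`.

```go
package main

import (
	"fmt"
	"strings"
)

// PodStatus is a local stand-in holding only the two fields under discussion.
type PodStatus struct {
	Phase  string // e.g. Pending, Running, Succeeded, Failed, Unknown
	Reason string // optional, e.g. Evicted
}

// parseList splits a comma-separated value into a set, ignoring blank entries.
func parseList(v string) map[string]bool {
	set := map[string]bool{}
	for _, item := range strings.Split(v, ",") {
		if item = strings.TrimSpace(item); item != "" {
			set[item] = true
		}
	}
	return set
}

// shouldReap (hypothetical) requires every configured rule to match.
// An empty set means that rule is not configured and is skipped.
func shouldReap(s PodStatus, phases, reasons map[string]bool) bool {
	if len(phases) == 0 && len(reasons) == 0 {
		return false // no rule configured, so nothing is reaped
	}
	if len(phases) > 0 && !phases[s.Phase] {
		return false
	}
	if len(reasons) > 0 && !reasons[s.Reason] {
		return false
	}
	return true
}

func main() {
	phases := parseList("Succeeded,Failed") // stand-in for POD_STATUS_PHASE
	reasons := parseList("")                // POD_STATUS_REASON left unset

	fmt.Println(shouldReap(PodStatus{Phase: "Succeeded"}, phases, reasons)) // true
	fmt.Println(shouldReap(PodStatus{Phase: "Running"}, phases, reasons))   // false
}
```

Keeping the two lists separate means a value like Running can only ever be compared against the phase, never against the reason.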
+1 to the suggestion from @jdharmon above.
We spent quite some time understanding why pod reaper was not working when using POD_STATUSES=Succeeded and POD_STATUSES=Completed, only to realize it was using pod.reason and not pod.phase.
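For anyone else hitting this, here is a simplified Go sketch of why neither value matches. It assumes the STATUS column in kubectl starts from the phase and is then overridden by the pod reason and by container waiting/terminated reasons. That is my understanding of the behavior, not a copy of kubectl's code, and it leaves out init containers, deletion handling, and container ordering. `Pod`, `Container`, and `displayStatus` are hypothetical stand-ins.

```go
package main

import "fmt"

// Container keeps only the reason strings relevant to the displayed status.
type Container struct {
	WaitingReason    string // e.g. CrashLoopBackOff
	TerminatedReason string // e.g. Completed
}

// Pod is a local stand-in, not the real Kubernetes type.
type Pod struct {
	Phase      string
	Reason     string
	Containers []Container
}

// displayStatus approximates the displayed status: the phase, overridden by
// the pod reason, overridden by any container waiting/terminated reason.
func displayStatus(p Pod) string {
	status := p.Phase
	if p.Reason != "" {
		status = p.Reason
	}
	for _, c := range p.Containers {
		switch {
		case c.WaitingReason != "":
			status = c.WaitingReason
		case c.TerminatedReason != "":
			status = c.TerminatedReason
		}
	}
	return status
}

func main() {
	finished := Pod{Phase: "Succeeded", Containers: []Container{{TerminatedReason: "Completed"}}}
	evicted := Pod{Phase: "Failed", Reason: "Evicted"}

	// Displayed status, phase, and whether the pod-level reason is empty.
	fmt.Println(displayStatus(finished), finished.Phase, finished.Reason == "") // Completed Succeeded true
	fmt.Println(displayStatus(evicted), evicted.Phase, evicted.Reason == "")    // Evicted Failed false
}
```

Under these assumptions a finished pod displays Completed and has phase Succeeded, but its pod-level reason is empty, so a rule that compares against the reason matches neither value.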
Kubectl shows the pod phase as the status, e.g. Running, Succeeded, Failed. The pod status rule checks the optional reason, e.g. Evicted. This is confusing, and also means you cannot create a POD_STATUS=Running rule. Can we change the pod status rule to use phase? Should the current reason rule be renamed/use a different env variable? POD_STATUS_REASON?
Relevant documentation: