kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Refactor prober_manager to update the runtime state vs. API state #115937

Open SergeyKanzhelev opened 1 year ago

SergeyKanzhelev commented 1 year ago

In the sidecar containers KEP we plan to use the startup probe state to decide whether the next init container needs to start being initialized. Before this KEP, probe results were not used in any Pod lifecycle decisions. In order to implement this, we need to have the probe status available at the time we call findNextInitContainerToRun. This may require some refactoring of how prober_manager is initialized and used.

Today we calculate the changes that need to be performed on a Pod, including the next init container to run. Then we apply these actions and update the Status of the Pod on the API server using convertStatusToAPIStatus from generateAPIPodStatus. Once the runtime status has been converted to the API status, we also call kl.probeManager.UpdatePodStatus(pod.UID, s), passing the API status object to be updated with the probe results. Thus the probe results are not visible to the findNextInitContainerToRun function.
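A highly simplified sketch of that flow, using stand-in types rather than the real kubelet signatures (the function names mirror the ones above; everything else is illustrative): probe results are merged only into the API-facing status, so the runtime status that drives the init-container decision never carries them.

```go
package main

import "fmt"

// Stand-in for kubecontainer.PodStatus: pure runtime state, no probe results.
type runtimePodStatus struct {
	containerState map[string]string
}

// Stand-in for v1.PodStatus: the API-facing status.
type apiPodStatus struct {
	started map[string]bool
}

// findNextInitContainerToRun only looks at the runtime status, so it cannot
// take startup probe results into account.
func findNextInitContainerToRun(rs runtimePodStatus) string {
	for name, state := range rs.containerState {
		if state != "running" {
			return name
		}
	}
	return ""
}

// generateAPIPodStatus stands in for convertStatusToAPIStatus +
// probeManager.UpdatePodStatus: probe results land here, and only here.
func generateAPIPodStatus(rs runtimePodStatus, startupResults map[string]bool) apiPodStatus {
	s := apiPodStatus{started: map[string]bool{}}
	for name, ok := range startupResults {
		s.started[name] = ok
	}
	return s
}

func main() {
	rs := runtimePodStatus{containerState: map[string]string{"sidecar": "running", "init-1": "waiting"}}
	startupResults := map[string]bool{"sidecar": true}

	next := findNextInitContainerToRun(rs)          // decided without probe data
	api := generateAPIPodStatus(rs, startupResults) // probe data only reaches the API status
	fmt.Println("next init container:", next, "started:", api.started)
}
```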

The proposal is to refactor prober_manager.go to operate on kubecontainer.PodStatus instead of v1.PodStatus. This will require adding the probe status to kubecontainer.PodStatus. We will also move the call into prober_manager.go earlier in the syncPod execution, so that we know the startup probe results before we call into findNextInitContainerToRun.
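A rough sketch of what the proposed shape might look like, again with stand-in types; the startupProbeResults field and the exact call sites below are assumptions, not the final design.

```go
package main

import "fmt"

// Stand-in for kubecontainer.PodStatus with the proposed probe results added.
// The startupProbeResults field is hypothetical.
type runtimePodStatus struct {
	containerState      map[string]string
	startupProbeResults map[string]bool
}

// Stand-in prober manager that updates the runtime status instead of v1.PodStatus.
type proberManager struct {
	startupResults map[string]bool
}

func (m *proberManager) UpdatePodStatus(s *runtimePodStatus) {
	for name, ok := range m.startupResults {
		s.startupProbeResults[name] = ok
	}
}

// With probe results on the runtime status, the init-container decision can
// wait for a sidecar's startup probe before letting the next init container run.
func findNextInitContainerToRun(s *runtimePodStatus) string {
	if !s.startupProbeResults["sidecar"] {
		return "" // sidecar has not passed its startup probe yet
	}
	for name, state := range s.containerState {
		if state != "running" {
			return name
		}
	}
	return ""
}

func main() {
	s := &runtimePodStatus{
		containerState:      map[string]string{"sidecar": "running", "init-1": "waiting"},
		startupProbeResults: map[string]bool{},
	}
	pm := &proberManager{startupResults: map[string]bool{"sidecar": true}}

	pm.UpdatePodStatus(s) // moved early in syncPod, before computing pod actions
	fmt.Println("next init container:", findNextInitContainerToRun(s))
}
```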

cc @smarterclayton

/sig node /kind feature /priority important-soon

SergeyKanzhelev commented 1 year ago

/triage accepted /assign @matthyx

smarterclayton commented 1 year ago

On the surface this seems reasonable:

  1. PodStatus is purely synthetic - its source of truth is the runtime, it is an aggregate, and it is not durable
  2. v1.Pod has a mix of truth from different sources - it's easy for callers to get confused about which field in v1.Pod is authoritative at any time (i.e. the phase of the pod from the API can differ from the phase of the pod the kubelet is aware of, since both are sources of truth but the kubelet is the normal owner), so it's better not to mix more things in

A couple of points:

gjkim42 commented 1 year ago

Today we calculate the changes that need to be performed on a Pod, including the next init container to run. Then we apply these actions and update the Status of the Pod on the API server using convertStatusToAPIStatus from generateAPIPodStatus. Once the runtime status has been converted to the API status, we also call kl.probeManager.UpdatePodStatus(pod.UID, s), passing the API status object to be updated with the probe results. Thus the probe results are not visible to the findNextInitContainerToRun function.

After digging into this code area, I found that findNextInitContainersToRun is called by (*kubeGenericRuntimeManager).computePodActions, and *kubeGenericRuntimeManager has its own probe managers (it has already been using those probe managers in computePodActions).

https://github.com/kubernetes/kubernetes/blob/d48b8167f7094bfae89ab601d2356868ba107627/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L115-L118
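For reference, the linked lines declare the runtime manager's own probe result caches; paraphrased below (not a verbatim excerpt, and most of the struct's fields are omitted):

```go
package kuberuntime

import (
	proberesults "k8s.io/kubernetes/pkg/kubelet/prober/results"
)

// kubeGenericRuntimeManager already carries its own probe result managers,
// which computePodActions consults (paraphrased subset of the real struct).
type kubeGenericRuntimeManager struct {
	// Health check results.
	livenessManager  proberesults.Manager
	readinessManager proberesults.Manager
	startupManager   proberesults.Manager
	// ... other fields omitted
}
```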


The proposal is to refactor prober_manager.go to operate on kubecontainer.PodStatus instead of v1.PodStatus. This will require adding the probe status to kubecontainer.PodStatus. We will also move the call into prober_manager.go earlier in the syncPod execution, so that we know the startup probe results before we call into findNextInitContainerToRun.

IMHO, one alternative could be to leave kubecontainer.PodStatus as it is and use those probe managers in findNextInitContainersToRun, as sketched below.
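A minimal sketch of that alternative, assuming findNextInitContainersToRun can reach the runtime manager's startupManager; the helper name and exactly where it would be called from are hypothetical:

```go
package kuberuntime

import (
	kubecontainer "k8s.io/kubernetes/pkg/kubelet/container"
	proberesults "k8s.io/kubernetes/pkg/kubelet/prober/results"
)

// startupProbeSucceeded (hypothetical helper) checks the cached startup probe
// result for a container; no cached result is treated as "not started yet".
func startupProbeSucceeded(startupManager proberesults.Manager, id kubecontainer.ContainerID) bool {
	result, found := startupManager.Get(id)
	return found && result == proberesults.Success
}

// findNextInitContainersToRun could then gate the next init container on this
// check for the preceding sidecar, instead of reading probe results from any
// PodStatus object at all.
```

Either way the decision would be made from the prober results cache directly, rather than round-tripping through v1.PodStatus.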

@matthyx @SergeyKanzhelev What do you think?

matthyx commented 1 year ago

What do you think?

It might work, but here I have implemented your other idea: https://github.com/SergeyKanzhelev/kubernetes/pull/3

k8s-triage-robot commented 1 year ago

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged. Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

matthyx commented 1 year ago

This one will be worked on after Sidecars are in GA.

k8s-triage-robot commented 10 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

matthyx commented 10 months ago

/remove-lifecycle stale

k8s-triage-robot commented 7 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

vaibhav2107 commented 6 months ago

/remove-lifecycle stale

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

matthyx commented 3 months ago

Still waiting for sidecars GA (1.32). /remove-lifecycle stale

k8s-triage-robot commented 3 weeks ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale