rundeck-plugins / kubernetes

51 stars 59 forks source link

Add pod counter to provide index attribute for pods in a ReplicaSet #131

Open kdebisschop opened 2 years ago

kdebisschop commented 2 years ago

Often I want to run a command only once in a Deployment/ReplicaSet. In effect, I would often like the nodes to be the ReplicaSet rather than the pod. However, not always. And it seems wasteful to duplicate a lot of code to offer the choice of defining nodes as the ReplicaSet or the Pod, as needed.

This patch provides an effective workaround that may be not as formally correct but turns out to be quite practical...it determines the parent of a pod (usually a ReplicaSet, but it does not need to be) and creates an attribute that is the "index" of the pod within its parent. Then, if you want to run a command only once in a replica set, simply add the attribute "index: 1" to the node selection filter.

JordyBottelier commented 2 years ago

Cool idea! I am no rundeck expert so I'm having a hard time seeing if this works, but I assume you tested it?

The comments that you added are a bit hard to understand, perhaps it's a good idea to add an example of how a replicaset-pod would look like? for example:

parent: pod-xxxxxx- replicaset: xxxxxxx

kdebisschop commented 2 years ago

@JordyBottelier You ask if I have tested -- yes, I am using it now.

Thanks for the feedback on commenting -- I will see if I can clarify.

JordyBottelier commented 2 years ago

Thanks for the additional explanation in the comments, it's all clear to me now :). I'd merge it if I had the proper access rights.

ltamaster commented 2 years ago

hi @kdebisschop

I think you can reach the same using an Orchestrator and using a filter by "label" in the job. For example:

Screen Shot 2022-08-25 at 11 12 31

Screen Shot 2022-08-25 at 11 12 47

kdebisschop commented 2 years ago

@ltamaster i don't see how the random orchestrator solves my concern. It allows me to apply to the same number of pods as I have deployments. But, as far as I can tell, it does not give me a way to ensure that it selects one pod from each deployment.

Let's say I have 5 deployments, each with one service and each with anywhere from 1 to 10 pods running in the service (based on load and the deployment of additional pods in the ReplicaSet by the Horizontal Autoscaler). I want to run a script on exactly one pod in each deployment. I can select 5 random pods with the orchestrator, but that won't assure me that exactly one comes from each deployment. By adding index (within the ReplicaSet) as a label, I can filter by label to select only one of the pods within the Deployment.

I have this:

Deployments:

I want to select one pod from each deployment and [ Pod-1.1, Pod-2.1, Pod-3.1, Pod 4.1, Pod-5.1 ] meets that criterion. Selecting 5 random pods could give me [ Pod-3.1, Pod-3.2, Pod-3.3, Pod 5.1, Pod-5.2 ], missing the majority of the deployments.

If the node/pod labels included the parent deployment ID, it might be possible to write a new more generic orchestrator that would select one pod from each parent group. Including the parent Deployment ID would still be a code change in the k8s plugin. But to my knowledge, there is no orchestrator that will do that selection even if parent is available as a pod label. (Actually, in my case there is only one deployment of a given service in a namespace. So, if there was an orchestrator that could select one random node per namespace, I could use it. But that is a special case. In general, a namespace certainly could have more than one deployment of a given service in it)

ltamaster commented 2 years ago

HI, @kdebisschop I will double-check it. I was thinking of a single deployment filter. Thanks for the explanation

kdebisschop commented 1 year ago

I would love to bubble this back up. It is critical functionality for us and I have confirmed that the random selector in general runs the same command in some deployments multiple times and entirely skips other deployments. Further, as the number of nodes changes when new deployments are added, you need to manually change the number of items selected. The random selector is in no way an answer.

If nodes were based on deployments rather than pods, we would not need something like this...but basing on deployments limits our ability to target specific pods, which is not ideal. And probably breaks compatibility. So adding an index parameter seems like a pretty viable option. It has been working flawlessly for us for the last year.

I have rebased my work against master...the build fails because the github build scripts are out of date in that Python 2.x is no longer supported. I have submitted a separate PR to address that issue. If that PR is accepted, I can rebase again and I expect the tests will pass.