jenkinsci / nomad-plugin

Nomad cloud plugin for Jenkins
https://plugins.jenkins.io/nomad/
MIT License

Limit number of instances per slave #25

Open zstyblik opened 7 years ago

zstyblik commented 7 years ago

Hello,

I'm not sure whether it's possible (ideally in Nomad itself), but it would be great if one could limit the maximum number of instances of one particular slave node. Maybe this request simply doesn't make sense in the Nomad world. I know there are resource constraints, but:

To solve this in the code, one would, I guess, have to query Nomad/Jenkins to find out how many slaves with label X are already up, and then ... I don't know. I'll take a look at how the Docker plugin does it.

Thank you.

EDIT: I presume resource allocations/constraints are per slave instance, not for all, or a group of, instances of a particular slave.
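For what it's worth, the check described above could be sketched like this (an illustrative sketch only, not the plugin's actual code; the function name and data shapes are made up):

```python
def can_provision(agent_labels, label, instance_cap):
    """Decide whether another agent carrying `label` may be started.

    agent_labels: one set of labels per currently running agent,
    e.g. as collected from the Jenkins API.
    Returns True while fewer than `instance_cap` agents carry `label`.
    """
    running = sum(1 for labels in agent_labels if label in labels)
    return running < instance_cap
```

This mirrors the kind of test the Docker plugin's "Instance Cap" performs before starting another container.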

multani commented 7 years ago

That's already handled by Nomad and the resource requirements you specify in the job, isn't it?

The only thing that isn't handled so nicely right now is jobs with different resource requirements: it requires creating an additional "template" in the master's configuration, which is a bit of a PITA. That's something I already raised in #20, though.
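For context, the per-job resource requirements mentioned here are expressed in the Nomad job specification's `resources` block; a minimal sketch (the image and the numbers are placeholders, not values from this thread):

```hcl
job "jenkins-slave" {
  group "slave" {
    task "agent" {
      driver = "docker"

      config {
        image = "jenkins/inbound-agent"
      }

      # Nomad only places the allocation on a client that still has
      # this much CPU and memory free.
      resources {
        cpu    = 500   # MHz
        memory = 2048  # MB
      }
    }
  }
}
```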

zstyblik commented 7 years ago

I'm sorry, but I'm not sure I follow (also, what master's template?).

Say I spawn 20 CI instances with label ci, where each requires 20 GB of RAM, and these instances eat up all the resources, or say only 5 GB of RAM is left in the cluster. Now a batch job comes along, with label batch and a requirement of 10 GB of RAM. Where is it going to be executed? I presume it won't be, and it will have to wait. However, if it were possible to limit the number of ci instances to 10, then:

  1. the batch slave could be spawned and wouldn't have to wait
  2. 10 ci instances would have to wait in the queue, which is too bad, but that's life. :)

I hope this clarifies my point. I'm referring to the Instance Cap from the Docker plugin.

Just to reiterate: right now, you can spawn as many instances of a particular slave as you like, and I'm sorry, but that doesn't make sense to me. Yes, resource planning and capping are really great for production, but not for Jenkins. A job doesn't know how many slaves it can or cannot spin up; it will just try. As a result, resource exhaustion may happen and your jobs will keep on waiting. Yes, you can assign priorities, which is great, I'm sure, but I don't see how that solves the problem. I want to cap the number of instances.

EDIT: this is in addition to resources, which, as you've pointed out, are managed by Nomad. Just not at the level that makes sense to me with respect to Jenkins, Jenkins slave nodes, and jobs.

jovandeginste commented 6 years ago

@zstyblik I'm not sure it is possible to limit the number of instances per slave (host). I expect a global maximum number of instances would be the easiest to implement. If you limit the number of jobs per host in Nomad in general, aren't you just moving the goalposts?

zstyblik commented 6 years ago

I expect a global maximum number of instances would be the easiest to implement.

This is pretty much implemented by the (limit of) resources in the Nomad cluster. I hope it's not possible to over-provision.

If you limit the number of jobs per host on Nomad in general

Although the Docker plugin does allow you to do that at two levels (the number of running containers, and then the number of instances of a given slave), this is not what I would like to have (sorry, I cannot think of better words).

Imagine the following:

Now, when CI is triggered by a commit, CI runs in parallel on each and every branch there is, because a) CI should be fast and we want results back quickly, and b) we want to ensure that all branches can be merged into master without conflicts or whatever. The Python project has 8 branches, the Ruby project has 6. A Python dev commits and pushes into their feature branch, which triggers 8 CI jobs; 8 GB of RAM from the cluster is allocated. A couple of seconds later, a Ruby dev does the same. The following should happen/be observable:

In other words, the Python dev team has taken most of the resources. Were it possible to limit the number of slave instances, we should end up with (desired state):

I hope the example isn't too stupid to show my point.
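To make the arithmetic concrete (under assumptions not stated in the thread: 1 GB per CI agent and 10 GB of free cluster memory), a first-come-first-served allocation behaves like this sketch:

```python
def first_fit(free_gb, requests):
    """Admit requests in arrival order while memory remains.

    requests: list of (team, gb) tuples; returns the admitted teams
    and the memory left over. Everything else waits in the queue.
    """
    admitted = []
    for team, gb in requests:
        if gb <= free_gb:
            admitted.append(team)
            free_gb -= gb
    return admitted, free_gb

# 8 Python CI jobs arrive just before 6 Ruby CI jobs, 1 GB each:
requests = [("python", 1)] * 8 + [("ruby", 1)] * 6
admitted, left = first_fit(10, requests)
# The Python team gets all 8 agents; only 2 of the 6 Ruby agents start.
```

With a per-label cap (say 5 each), both teams would get some agents immediately and the rest would queue, which is the desired state described above.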

Thanks.

jovandeginste commented 6 years ago

That is what I mean: you could implement a sort of "global" maximum of running slaves (per template). It should not be too hard to implement, I think.
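A per-template cap as suggested here could be modeled along these lines (a toy sketch; `TemplateCapProvisioner` and its methods are invented names, not the plugin's API):

```python
class TemplateCapProvisioner:
    """Track running agents per template and enforce a maximum."""

    def __init__(self, caps):
        self.caps = caps      # template name -> max concurrent agents
        self.running = {}     # template name -> current count

    def try_provision(self, template):
        """Return True and count the agent, or False if the cap is
        reached (the Jenkins job then simply stays in the build queue)."""
        count = self.running.get(template, 0)
        if count >= self.caps.get(template, float("inf")):
            return False
        self.running[template] = count + 1
        return True

    def release(self, template):
        """Call when an agent terminates, to free a slot."""
        self.running[template] -= 1
```

Templates without a configured cap stay unlimited, which preserves today's behavior.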

zstyblik commented 6 years ago

I see that my example was overly simplified (and stupid) in the end. The problem I have is that there are Jenkins jobs which are either resource-hungry, or resource-hungry and long-running (pretty much true batch jobs). It's OK for them to be queued: it's not financially viable to keep resources for these jobs at the ready all the time, so it's fine for them to wait a bit.

Kamilcuk commented 1 year ago

That's already handled by Nomad

While this is handled by Nomad, it results in a looooooong list of offline clients that do nothing, are offline, and then time out. This list is confusing. It would be great to be able to set a limit.

(screenshot: long list of offline Jenkins clients)

multani commented 1 year ago

This should be fixed by #196

Kamilcuk commented 1 year ago

Och, that is amazing. I did not notice it, thank you!

multani commented 1 year ago

Feel free to test the branch and report any issues or suggestions there, that would be really appreciated! :)