cloudbase / garm

GitHub Actions Runner Manager
Apache License 2.0

Take a maximum number of runners per provider into account #204

Open SystemKeeper opened 9 months ago

SystemKeeper commented 9 months ago

Just an idea that came to mind and that I wanted to share. Consider the following:

The problem is that there's currently no way to specify that we can run at most 20 runners in total. When 10 runners are assigned to each org, all is fine. But when org A is rarely used, org B is still limited to 10 runners, even though more runners could be used most of the time because the resources are available. If I use autoscaling and set org A to a maximum of 10 runners and org B to a maximum of 20 runners, this works fine as long as org A has no active runners. But when org A ramps up, I end up with 30 runners, which is a problem for the system.
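To make the capacity mismatch concrete, here is a minimal Go sketch. The `Pool` type, field names, and the provider capacity value are hypothetical illustrations, not GARM's actual types or configuration; the point is only that the sum of per-org maximums can exceed what the provider can actually sustain.

```go
package main

import "fmt"

// Pool is a simplified stand-in for a per-org runner pool.
type Pool struct {
	Org        string
	MaxRunners int
}

func main() {
	providerCapacity := 20 // what the underlying system can actually sustain

	pools := []Pool{
		{Org: "org-a", MaxRunners: 10},
		{Org: "org-b", MaxRunners: 20},
	}

	// Worst case: every pool scales to its own maximum at the same time.
	worstCase := 0
	for _, p := range pools {
		worstCase += p.MaxRunners
	}

	fmt.Printf("worst-case runners: %d, provider capacity: %d\n", worstCase, providerCapacity)
	if worstCase > providerCapacity {
		fmt.Println("pools can collectively exceed what the provider can handle")
	}
}
```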

What I was thinking as a rough idea:

So given my example above, this would:

gabriel-samfira commented 8 months ago

Hi @SystemKeeper !

Sorry for the late reply.

For the most part, this sounds like a good idea, especially for providers that deal with limited resources, like LXD, Incus, K8s, and potentially future providers for various other systems.

The current architecture of GARM is really simple: it's a single-process, single-server app. It doesn't currently scale horizontally at all. It could with some refactoring, but up to this point that hasn't really been needed, at least not for performance reasons. However, there are plans to split it up into multiple components in the future:

At that point, I think we could start thinking about something along the lines of what you described. It wouldn't be impossible to add something like this to the current code, but it would be difficult to do so without making the code harder to decouple later. Once we have a proper scheduler component, we can develop this further and potentially implement "filters", similar in concept to what OpenStack has. A request would be passed through the scheduler, which would weigh it using whichever filters are enabled and decide whether a worker is returned to take care of the task, or the request is throttled and re-queued later.
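As a rough illustration of how such a filter chain might look, here is a minimal Go sketch. The `Request`, `Filter`, and `providerCapacityFilter` names, and the cap value, are all hypothetical and not part of GARM's codebase; this is only a sketch of the "weigh the request, then dispatch or re-queue" idea.

```go
package main

import (
	"errors"
	"fmt"
)

// Request is a simplified stand-in for a "create runner" request that a
// future scheduler component might weigh before dispatching to a provider.
type Request struct {
	Provider    string
	ActiveTotal int // runners currently active on the target provider
}

// Filter decides whether a request may proceed. Filters are meant to be
// chained, similar in spirit to OpenStack's scheduler filters.
type Filter func(Request) error

// providerCapacityFilter rejects requests once a provider-wide cap is reached.
func providerCapacityFilter(limit int) Filter {
	return func(r Request) error {
		if r.ActiveTotal >= limit {
			return errors.New("provider at capacity, re-queue for later")
		}
		return nil
	}
}

// schedule runs the request through every filter; the first rejection wins.
func schedule(r Request, filters []Filter) error {
	for _, f := range filters {
		if err := f(r); err != nil {
			return err
		}
	}
	return nil // a worker would pick up the task here
}

func main() {
	filters := []Filter{providerCapacityFilter(20)}

	if err := schedule(Request{Provider: "lxd", ActiveTotal: 20}, filters); err != nil {
		fmt.Println("throttled:", err)
	} else {
		fmt.Println("dispatched to a worker")
	}
}
```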

All of that, however, is dramatically more complex than what currently exists in GARM, and it is a large effort that will (probably) see us moving away from SQLite (even though there are some interesting projects out there that could help us stay with SQLite, that would probably be like jamming a triangle into a square).

I will keep this open (potentially for a long time), but this is something that I acknowledge is useful in some cases.