concourse / concourse

Concourse is a container-based continuous thing-doer written in Go.
https://concourse-ci.org
Apache License 2.0

feature request: set task-cpu/memory-limit per worker #4669

Closed evanchaoli closed 4 years ago

evanchaoli commented 4 years ago

What challenge are you facing?

Currently, task-cpu/memory-limit can only be set on concourse web, which applies the same limits to all workers.

In our cluster, we have multiple worker pools (identified by team and tag), and we want each pool to be configured with a different task-cpu/memory-limit.

What would make this better?

Add a task-cpu/memory-limit option to concourse worker.
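Roughly, instead of the single cluster-wide default on web, each worker pool would carry its own default. A rough sketch in docker-compose style; the worker-side variables are hypothetical (that is the request), and the web-side variable names and value formats are written from memory, so they may be slightly off:

```yaml
# Today: one default set on web, applied to tasks on every worker.
web:
  environment:
    CONCOURSE_DEFAULT_TASK_CPU_LIMIT: "256"      # CPU shares
    CONCOURSE_DEFAULT_TASK_MEMORY_LIMIT: "1GB"

# Requested: each worker (pool) carries its own default instead.
worker-regular:
  environment:
    CONCOURSE_TAG: regular
    CONCOURSE_DEFAULT_TASK_MEMORY_LIMIT: "1GB"   # hypothetical flag, does not exist today

worker-large:
  environment:
    CONCOURSE_TAG: large
    CONCOURSE_DEFAULT_TASK_MEMORY_LIMIT: "8GB"   # hypothetical flag, does not exist today
```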

Are you interested in implementing this yourself?

Not sure, as I haven't touched worker-related code yet.

jamieklassen commented 4 years ago

Can you talk a little bit more about your use-case? You have workers of different sizes designated for different team/tag combinations - why should the worker my task gets assigned to affect the available cpu/memory? Is that a feature you actually want to provide to your users, like you have some kind of small, medium, large tags on your workers to represent their vertical scale? This is technically already available to them, as they can override the default container limits by specifying them in their task config.
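For reference, the per-task override I'm referring to looks roughly like this; a minimal task config sketch (values are made up, and memory is in bytes as far as I recall):

```yaml
# task.yml -- overrides the cluster-wide default limits for this task only
platform: linux
image_resource:
  type: registry-image
  source: {repository: busybox}
container_limits:
  cpu: 512             # CPU shares
  memory: 4294967296   # bytes (4 GiB)
run:
  path: echo
  args: ["hello"]
```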

I'm guessing you're approaching this more as an operator looking for a way to strike a balance between keeping your workers from falling over and being able to provide more cpu/memory to tasks that need it. In that case I definitely think that's an important goal (we all know that worker stability is a recurring pain point), but I have some slight concerns about this particular solution - it isn't exactly worker-state, but it's close.

All that said, I don't mean to be draconian and shut down your idea, because it's very intuitive and it's quite possible (in my opinion, especially given the prominence of k8s) that choices like these will become a bigger part of managing workers in the future. Mostly I'm very curious about how you'd intend to make use of this feature?

evanchaoli commented 4 years ago

@pivotal-jamie-klassen Let me simplify our use case:

Basically, we don't want users to run arbitrary workloads on our Concourse clusters; if you just run fly execute, Concourse becomes a general-purpose container runner. So we want to set relatively small task cpu/memory limits, because we don't want a few monster tasks eating up worker resources and impacting other normal tasks. However, some CI pipelines run Java Maven builds, which consume more memory than the task memory limit allows, so we created a worker pool tagged with something like "large". In this pool, the worker VMs are configured with more vCPUs and memory. The Maven build tasks are tagged with "large" so that they run in this pool.
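For context, the routing looks roughly like this in the pipeline; a trimmed sketch with illustrative names:

```yaml
jobs:
- name: maven-build
  plan:
  - get: source
  - task: build
    tags: [large]    # only workers registered with --tag large pick this step up
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: maven}
      inputs:
      - name: source
      run:
        path: mvn
        args: [-f, source/pom.xml, package]
```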

Now we have two worker pools: a regular one and a "large" one. In the regular pool we want to set small task cpu/memory limits; in the "large" pool we want to set bigger ones.

But since Concourse currently doesn't support setting task limits per worker, we cannot do this today. That's the reason we want this feature.

Back to our real use case, we have more considerations. We host worker VMs on an internal IaaS that has a charge-back mechanism. Our long-term goal is to provide a worker pool per team on request and charge the team for it, like "you get what you pay for". Since the users pay, we may allow them to decide their own task limits.

jamieklassen commented 4 years ago

I'll just take a stab at restating your goals; from what I can tell you have two desires:

  1. Limit the impact of "noisy neighbours" - so clearly we're in the realm of https://github.com/concourse/concourse/issues/3695#issuecomment-540910150 (which is exactly what you're requesting here), and this is similar to #4390 but we're talking actual cpu/memory requirements rather than raw container count. I can definitely see the appeal of adding more state to workers to quickly meet this desire, but the underlying problem is more about understanding which tasks are cpu/memory-intensive and knowing which workers can afford to run them - @concourse/runtime feel free to weigh in, but I don't think we really have a better idea at the moment of how to make scheduling smart enough to handle this use-case.
  2. Audit cpu/memory consumption per-pipeline - this is also a pretty natural thing to want from multi-tenant environments, and I've heard people talk about it informally, but I don't know if there's much existing discussion on GitHub @scottietremendous @vito

deniseyu commented 4 years ago

@pivotal-jamie-klassen your explanation of point 1 lines up with my understanding of the world currently!

It's true that the number of containers is not a very scientific way to meter usage, nor the most efficient bin-packing method we could use, but we're missing some visibility (and other unknown unknowns, I'm sure) needed to jump straight to scheduling that accounts for memory and CPU consumption. I wonder if building resource metering capabilities into Concourse is the right answer, though. I admit I don't know enough about vSphere -- does the IaaS have any tools to help with this, like how GCP has quotas?

I kind of worry that introducing finer-grained customizability of workers will start making them more like pets than cattle. I think that one of the design priorities of Concourse has always been that workers should do undifferentiated heavy lifting, and it shouldn't matter if some of them have to be recreated. If using some combination of team workers, tagged workers, and external workers, plus the new container placement strategies is still insufficient, then I think we should do some brainstorming to see if there are good patterns out there to follow, or if there are out-of-band tools that already exist that can help us.
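To make that combination concrete, here is a sketch of knobs that already exist today; the variable names and strategy name are written from memory, so treat them as an assumption rather than gospel:

```yaml
# Dedicate a beefier worker to one team and to "large"-tagged steps only.
worker-large:
  environment:
    CONCOURSE_TEAM: some-team
    CONCOURSE_TAG: large

# Spread builds out instead of packing them onto the same worker.
web:
  environment:
    CONCOURSE_CONTAINER_PLACEMENT_STRATEGY: fewest-build-containers
```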

evanchaoli commented 4 years ago

@pivotal-jamie-klassen @deniseyu Thank you both for the comments. Since Concourse allows attaching the attributes "team" and "tags" to workers, workers with the same "team" and "tag" are logically a pool. I think it makes sense to allow each pool to have different container cpu/mem limits. Within a pool, whichever container placement strategy is in use should just work as normal.

Today I checked the related code. I assumed the container creation workflow to be atc -> worker -> garden, but after reading some code, I realized the workflow is actually: atc -> garden server -> garden backend (github.com/vito/houdini). So to implement this feature, changes would be needed along that path.

After this change, clusters that have set container cpu/memory limits on the ATC will keep the current behavior. For our clusters, we'll remove the limits from the ATC and configure limits per worker pool.

kcmannem commented 4 years ago

This is a highly controversial topic. I wouldn't want more customized workers; as @deniseyu mentioned, the more uniqueness we allow on workers, the more operators customize a cluster to their current workloads. Once a new kind of workload shows up, operators have to intervene and adapt the pools to better suit everyone.

I recommend you comment on https://github.com/concourse/concourse/issues/3695; this is an issue that affects the whole community, and we'd like the solution to be one that everyone commits to, not a decision that's made in isolation.

Thanks for the suggestion!

stale[bot] commented 4 years ago

Beep boop! This issue has been idle for long enough that it's time to check in and see if it's still important.

If it is, what is blocking it? Would anyone be interested in submitting a PR or continuing the discussion to help move things forward?

If no activity is observed within the next week, this issue will be ~~exterminated~~ closed, in accordance with our stale issue process.