concourse / concourse

Concourse is a container-based continuous thing-doer written in Go.
https://concourse-ci.org
Apache License 2.0

Limit resource usage and runtime of tasks for a specific team #1284

Open aahlenst opened 7 years ago

aahlenst commented 7 years ago

Feature Request

What challenge are you facing?

I'm considering moving a number of teams from dedicated Jenkins instances (each team has its own VM running Jenkins) onto Concourse. From experience I know that tasks sometimes misbehave (build steps get stuck in an infinite loop, …), and that some teams' jobs are so complex that they consume too many, or even all, of the available resources. Fixing this requires manual intervention: an admin either addresses the cause or asks the responsible team to fix its build (with no guarantee that they'll actually do it). This is not really a problem with dedicated Jenkins instances, because people only cause trouble for themselves and the admins. But on a shared Concourse instance, every user would be affected by such an incident.

A Modest Proposal

As an admin, I would like to be able to set per-team limits on resource usage (CPU, memory, and/or number of running agents) and on the runtime of tasks (or "plans" or pipelines, whatever fits best with Concourse's architecture), because this would prevent pipelines from inadvertently consuming all the available resources.

I imagine it could look something like this:

$ fly set-team -n <team_name> --max-pipeline-runtime=3600s --max-containers=4 --max-cores=2 --max-memory=4G

This command would limit the runtime of all the team's pipelines to 1 hour. The team could use at most 4 containers simultaneously, and each container could use at most 2 CPU cores and 4 GB of RAM. Some of these limits already exist at the pipeline level (e.g. http://concourse.ci/timeout-step.html), but I cannot force all the teams on our Concourse instance to actually apply them.

andrewedstrom commented 7 years ago

needs #787