Closed by mnitchev 3 years ago
Hi @taylorsilva! We see this change is not part of v7.3.2, as you released out of `release/7.3.x` instead of `master`: does this mean we should backport our PR to that branch in order to get it released soon-ish? Thanks!
We're probably gonna release 7.4 soon based on the milestone progress: https://github.com/concourse/concourse/milestone/75
@taylorsilva I don't see this PR in that milestone; do you mean you're going to release out of `master`?
Sorry, our release process is not clear for this repo. Basically, whenever we're about to make a new release of the Concourse binary, we create a new `release/x.x.x` branch on the main repo (concourse/concourse) and on all packaging repos as well. We branch off the latest commit on `master` for all those repos, creating the latest release from that.
Therefore, new releases of this repo are only made when new Concourse releases are made. This does suck: whenever a packaging-level fix is made, we can't cut a new packaging-only release.
This is not the case for the helm chart. It is on its own release schedule.
The bionic stemcell contains systemd v237, which by default gives a limit of 4915 processes to system cgroups. This restriction can cause jobs to fail when under load, so we are reverting to the xenial default of `pids.max = max`.
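A minimal sketch of what the revert amounts to on a worker, assuming a cgroup-v1 pids hierarchy; the cgroup path and the helper name are illustrative, not part of the actual BOSH release:

```shell
#!/bin/sh
# set_pids_max: write a new limit into a cgroup's pids.max file,
# restoring the xenial behaviour of an unlimited pids cgroup.
# Skips gracefully when the file does not exist or is read-only.
set_pids_max() {
  # $1 = path to a cgroup's pids.max file, $2 = new value ("max" = unlimited)
  if [ -w "$1" ]; then
    echo "$2" > "$1"
    echo "set: $1 -> $2"
  else
    echo "skip: $1 not writable"
  fi
}

# Hypothetical path; the actual layout depends on the stemcell.
set_pids_max /sys/fs/cgroup/pids/concourse.system/pids.max max
```

On a systemd-managed host the equivalent unit-level knob is `TasksMax=infinity` (or `DefaultTasksMax=` in system.conf), which systemd translates into `pids.max = max` for the unit's cgroup.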
We are seeing this in a pipeline where we have many periodic jobs running at the same time. The limit gets exhausted fairly quickly, causing random jobs to fail. We hit this same problem in Garden a few months ago; you can find more info on that in this pivotal tracker story and this commit in our bosh release. SSH-ing onto the Concourse worker, we can see that the `garden.system` cgroup has `pids.max` set to `max`, but since it is a child cgroup of `concourse.system`, which has its `pids.max` set to the default of 4915, it gets limited too.
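The pids controller enforces limits hierarchically: a fork only succeeds if every ancestor cgroup stays under its own `pids.max`, so a child's `max` cannot lift a parent's cap. A small sketch of that semantics, using the values from this report (parent `concourse.system` at 4915, child `garden.system` at `max`); the helper is hypothetical, for illustration only:

```shell
#!/bin/sh
# effective_limit: given pids.max values from root to leaf, print the
# tightest (smallest) numeric limit, which is what the kernel actually
# enforces on the leaf. "max" means unlimited and is skipped.
effective_limit() {
  min=""
  for v in "$@"; do
    [ "$v" = "max" ] && continue
    if [ -z "$min" ] || [ "$v" -lt "$min" ]; then
      min=$v
    fi
  done
  echo "${min:-max}"
}

# concourse.system=4915, garden.system=max: the parent still wins.
effective_limit 4915 max   # prints: 4915
```

This is why setting `garden.system` to `max` alone isn't enough; the default of 4915 on the parent `concourse.system` cgroup has to be lifted as well.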