This PR is auto-generated from #24304 to be assessed for backporting due to the inclusion of the label backport/1.9.x.
The below text is copied from the body of the original PR.
In our production environment, where we run Nomad v1.8.2, we noticed overlapping cpusets and the Nomad reserve/share slices being out of sync. Specifically, this occurs in the setup below, where we have various tasks in prestart and poststart that are part of the main lifecycle.
I managed to reproduce it with the job spec below on the latest main (v1.9.1) in my sandbox environment:
Fixes a bug in the BinPackIterator.Next method, where the scheduler would only
take into account the cpusets of the tasks in the largest lifecycle. This could
result in overlapping cgroup cpusets. By using Allocation.ReservedCores, the
scheduler now uses the same cpuset view as Partition.Reserve. Also adds logging
to catch future regressions without requiring manual inspection of cgroup files.
Overview of commits
- 997da25cdb49c634749be97874955024492b9d43
Spinning up two jobs with this spec resulted in the following overlap: