ftigeot closed this issue 11 years ago
Confirmed. "make -V_MAKE_JOBS" produces:
-j/sbin/sysctl -n hw.ncpu
I'll look for another variable. This one may be a legacy variable.
This is starting to look like user error. Here is the logic: if ALLOW_MAKE_JOBS is defined in poudriere.conf, _MAKE_JOBS will default to hw.ncpu unless JOBS_LIMIT is also defined in poudriere.conf. If ALLOW_MAKE_JOBS is not defined, then _MAKE_JOBS will be empty.
My guess is that ALLOW_MAKE_JOBS was defined in poudriere.conf and JOBS_LIMIT wasn't.
"MAKE_JOBS" doesn't come into the equation at all. Nothing looks at it, not even bsd.port.mk. And don't forget, the host make.conf is not used, ever.
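The logic described above can be sketched as a small function. This is only a model of the description in this comment, not the actual bsd.port.mk code; the function name make_jobs_flag and its parameters are hypothetical, and ncpu stands in for the value of "sysctl -n hw.ncpu".

```python
from typing import Optional


def make_jobs_flag(allow_make_jobs: bool, jobs_limit: Optional[int], ncpu: int) -> str:
    """Model of how _MAKE_JOBS is said to be chosen (hypothetical helper)."""
    if not allow_make_jobs:
        # ALLOW_MAKE_JOBS not defined: _MAKE_JOBS is empty, no -j flag at all.
        return ""
    if jobs_limit is not None:
        # JOBS_LIMIT defined: it overrides the hw.ncpu default.
        return f"-j{jobs_limit}"
    # ALLOW_MAKE_JOBS defined, JOBS_LIMIT not defined: default to hw.ncpu.
    return f"-j{ncpu}"


if __name__ == "__main__":
    print(repr(make_jobs_flag(False, 2, 32)))    # ALLOW_MAKE_JOBS commented out
    print(repr(make_jobs_flag(True, 2, 32)))     # both defined
    print(repr(make_jobs_flag(True, None, 32)))  # only ALLOW_MAKE_JOBS defined
```

Under this model, a flag derived from hw.ncpu can only appear when ALLOW_MAKE_JOBS is defined and JOBS_LIMIT is not.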
On Sun, Feb 10, 2013 at 01:49:22AM -0800, jrmarino wrote:
> This is starting to look like user error. Here is the logic: if ALLOW_MAKE_JOBS is defined in poudriere.conf, _MAKE_JOBS will default to hw.ncpu unless JOBS_LIMIT is also defined in poudriere.conf. If ALLOW_MAKE_JOBS is not defined, then _MAKE_JOBS will be empty.
> My guess is that ALLOW_MAKE_JOBS was defined in poudriere.conf and JOBS_LIMIT wasn't.
> "MAKE_JOBS" doesn't come into the equation at all. Nothing looks at it, not even bsd.port.mk. And don't forget, the host make.conf is not used, ever.
poudriere.conf on this machine contains:
#ALLOW_MAKE_JOBS=yes
JOBS_LIMIT=2
The system has already died twice under loads caused by an excessive number of compilation processes.
Individual packages should never be allowed to choose the number of parallel jobs to run by themselves; this is effectively a form of denial-of-service.
FYI, this machine has 32 hardware threads and was running 32 separate poudriere jobs. If every poudriere job is allowed to use hw.ncpu to choose its parallelism level, there could be up to 32*32 = 1024 processes running at once...
Francois Tigeot
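The worst-case estimate in the comment above is just the number of poudriere jobs multiplied by the -j value each one passes to make. A minimal sketch of that arithmetic follows; the function name worst_case_processes is hypothetical, and the figure is an upper bound on make jobs only, not a measurement.

```python
def worst_case_processes(builders: int, jobs_per_builder: int) -> int:
    """Upper bound on concurrent compile jobs: each builder runs make -jN."""
    return builders * jobs_per_builder


if __name__ == "__main__":
    # 32 poudriere jobs, each defaulting to -j32 (hw.ncpu on a 32-thread machine).
    print(worst_case_processes(32, 32))  # 1024
    # The same 32 poudriere jobs with each one capped at -j2 (JOBS_LIMIT=2).
    print(worst_case_processes(32, 2))   # 64
```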
As stated on IRC, gcc-aux is not defining the "-j" parameter, that's coming from bsd.port.mk as a function of several configuration variables.
The evidence suggests that ALLOW_MAKE_JOBS was defined when gcc-aux was running, because if it wasn't, _MAKE_JOBS would have been set to a null value rather than the default "-j/sbin/sysctl -n hw.ncpu" value (which is overridden by JOBS_LIMIT anyway, so at worst it would have been "-j2").
Regardless of the original issue, I now see there's a fundamental problem with the way bsd.port.mk handles job parallelism. I will file a separate issue.
I found a poudriere build machine completely unresponsive with a load average of 150+. There were more than 100 gnat1 or ada processes running at the same time even though MAKE_JOBS was set to 2 and there were only 16 poudriere jobs.