DragonFlyBSD / DPorts

The dedicated application build system for DragonFly BSD

lang/gcc-aux doesn't respect MAKE_JOBS #8

Closed ftigeot closed 11 years ago

ftigeot commented 11 years ago

I found a poudriere build machine completely unresponsive with a load average of 150+. There were more than 100 gnat1 or ada processes running at the same time even though MAKE_JOBS was set to 2 and there were only 16 poudriere jobs.

jrmarino commented 11 years ago

Confirmed. "make -V_MAKE_JOBS" produces: -j/sbin/sysctl -n hw.ncpu

I'll look for another variable. This one may be a legacy leftover.

jrmarino commented 11 years ago

This is starting to look like user error. Here is the logic: if ALLOW_MAKE_JOBS is defined in poudriere.conf, _MAKE_JOBS will default to hw.ncpu unless JOBS_LIMIT is also defined in poudriere.conf. If ALLOW_MAKE_JOBS is not defined, then _MAKE_JOBS will be empty.

My guess is that ALLOW_MAKE_JOBS was defined in poudriere.conf and JOBS_LIMIT wasn't.
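The logic described above can be modelled in a few lines. This is only a sketch of the behaviour as stated in this comment, not the actual text of bsd.port.mk or poudriere; the function name `make_jobs_flag` and the `ncpu` parameter (a stand-in for the value of `sysctl -n hw.ncpu`) are hypothetical.

```python
def make_jobs_flag(allow_make_jobs, jobs_limit=None, ncpu=32):
    """Model of how _MAKE_JOBS is derived, per the description above.

    allow_make_jobs: True if ALLOW_MAKE_JOBS is defined in poudriere.conf.
    jobs_limit:      value of JOBS_LIMIT, or None if it is not defined.
    ncpu:            hypothetical stand-in for `sysctl -n hw.ncpu`.
    """
    if not allow_make_jobs:
        # ALLOW_MAKE_JOBS not defined: _MAKE_JOBS is empty, no -j is passed.
        return ""
    # ALLOW_MAKE_JOBS defined: JOBS_LIMIT wins if set, otherwise use hw.ncpu.
    jobs = jobs_limit if jobs_limit is not None else ncpu
    return f"-j{jobs}"


print(repr(make_jobs_flag(False, jobs_limit=2)))  # ''
print(make_jobs_flag(True))                       # -j32
print(make_jobs_flag(True, jobs_limit=2))         # -j2
```

Under this model, a "-j" value based on hw.ncpu can only appear when ALLOW_MAKE_JOBS is defined and JOBS_LIMIT is not.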

"MAKE_JOBS" doesn't come into the equation at all. Nothing looks at it, not even bsd.port.mk. And don't forget, the host make.conf is not used, ever.

ftigeot commented 11 years ago

On Sun, Feb 10, 2013 at 01:49:22AM -0800, jrmarino wrote:

> This is starting to look like user error. Here is the logic: if ALLOW_MAKE_JOBS is defined in poudriere.conf, _MAKE_JOBS will default to hw.ncpu unless JOBS_LIMIT is also defined in poudriere. If ALLOW_MAKE_JOBS is not defined, then _MAKE_JOBS will equal nothing.
>
> My guess is that ALLOW_MAKE_JOBS was defined in poudriere.conf and JOBS_LIMIT wasn't.
>
> "MAKE_JOBS" doesn't come into the equation at all. Nothing looks at it, not even bsd.port.mk. And don't forget, the host make.conf is not used, ever.

poudriere.conf on this machine contains:

#ALLOW_MAKE_JOBS=yes
JOBS_LIMIT=2

The system has already died twice under loads caused by an excessive number of compilation processes.

Individual packages should never be allowed to choose the number of parallel jobs to run by themselves; this is effectively a form of denial-of-service.

FYI, this machine has 32 hardware threads and was running 32 separate poudriere jobs. If every poudriere job is allowed to use hw.ncpu to choose its parallelism level, there could be up to 32*32 = 1024 processes running at once...
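The arithmetic above can be checked with a short sketch. It assumes, as the comment does, that each poudriere job runs one make with a fixed "-j" value, and it counts only the make jobs themselves; the function name `worst_case_processes` is hypothetical.

```python
def worst_case_processes(poudriere_jobs, make_jobs_per_build):
    """Upper bound on simultaneous compile jobs across all builders."""
    return poudriere_jobs * make_jobs_per_build


# Every builder picks its own parallelism from the CPU count (32 threads).
print(worst_case_processes(32, 32))  # 1024

# Every builder is capped at two make jobs instead.
print(worst_case_processes(32, 2))   # 64
```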

Francois Tigeot

jrmarino commented 11 years ago

As stated on IRC, gcc-aux is not defining the "-j" parameter; that comes from bsd.port.mk as a function of several configuration variables.

The evidence suggests that ALLOW_MAKE_JOBS was defined when gcc-aux was running, because if it wasn't, _MAKE_JOBS would have been set to a null value rather than the default "-j/sbin/sysctl -n hw.ncpu" value (which is overridden by JOBS_LIMIT anyway, so at worst it would have been "-j2").

ftigeot commented 11 years ago

Regardless of the original issue, I now see there's a fundamental problem with the way bsd.port.mk handles job parallelism. I will file a separate issue.