sagemath / sage

Main repository of SageMath
https://www.sagemath.org

parallelism in Sage: just use value of 'MAKE' #12016

Closed jhpalmieri closed 12 years ago

jhpalmieri commented 12 years ago

The various parallel aspects of Sage should be controlled by setting the -j (possibly also -l) flags in MAKE or MAKEFLAGS. That is, if MAKE='make -j16', then

Testing this ticket: you can set the environment variable SAGE_NUM_CORES to the number of cores you want to pretend to have. For example, running

SAGE_NUM_CORES=24 make ptestlong

should run 8 threads (see sage-num-threads.py; this is undocumented because the only purpose I see is for testing this ticket).

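Notes: With the patches applied, building spkgs in parallel works well, except for race conditions in:

  * python (#12096)
  * singular (#12137)
  * zlib (#12138)
  * mpir (#12139)
  * atlas on Solaris (#12312)

and a "jobserver unavailable" warning in:

  * ntl
  * singular (#12137)
  * rubiks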

Apply:

  1. attachment: 12016-root.patch to the SAGE_ROOT repository.
  2. attachment: 12016-base.patch to spkg/base.
  3. attachment: 12016-scripts.patch and attachment: trac_12016-scripts-ref.patch to the SCRIPTS repository.
  4. attachment: 12016-sage.patch to the Sage library.

See also: #6495 to implement the same behavior for doc building.

Dependencies: sage-4.8.alpha4

CC: @jdemeyer @nexttime

Component: build

Author: John Palmieri, Jeroen Demeyer

Reviewer: John Palmieri, Jeroen Demeyer

Merged: sage-4.8.alpha5

Issue created by migration from https://trac.sagemath.org/ticket/12016

jdemeyer commented 12 years ago
comment:41

When testing with sage -f, the proper way to test is using

MAKEFLAGS="j50" ./sage -f ...

jdemeyer commented 12 years ago

Changed dependencies from sage-4.8.alpha3 + #12096 to sage-4.8.alpha3 + #12096, #12137, #12138

jhpalmieri commented 12 years ago
comment:43

In the following lines from sage-spkg

# Handle -n, -t, -q options for recursive make 
# See Trac #12016. 
if echo "$MAKE $MAKEFLAGS -$MAKEFLAGS" |grep -e ' -[A-Za-z]*[qnt]' >/dev/null; then 
    if echo "$MAKE $MAKEFLAGS -$MAKEFLAGS" |grep -e ' -[A-Za-z]*q' >/dev/null; then 
        exit 1 
    else 
        exit 0 
    fi 
fi 

do we also need to handle the long versions? (I don't think so, but I thought I would ask.)

More importantly, on OpenSolaris, or at least on David Kirkby's machine hawk, the default 'grep' command doesn't take a -e option. Can we just omit it? The command still seems to function on sage.math, on OS X, and on OpenSolaris.

jhpalmieri commented 12 years ago
comment:44

I cannot explain why "make -j -lN" would fail but "make -jN" would work.

One possible explanation is that make -j -lN limits when new processes may be started, and that limit might be what's causing the problems. I could force the old zlib spkg to fail on sage.math by running MAKEFLAGS='j -l2' ./sage -f ..., but not with MAKEFLAGS='j -l30' .... I don't know if setting MAKE="$MAKE -j1 -l" in spkg-install is the right way to fix this for problematic spkgs (like singular?), but it might be worth trying.
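To make the flag combinations above concrete, here is a minimal Python sketch of how a job count and a load limit could be read out of MAKE and MAKEFLAGS. It is an illustration only, not the code from the patches: the function name `parse_parallel_flags` and its return convention are assumptions, and it handles only the attached spellings (`-j16`, `-l2`, `--load-average=2`, `--max-load=2`), not the space-separated forms.

```python
import re

def parse_parallel_flags(make="", makeflags=""):
    """Return (jobs, max_load) found in MAKE/MAKEFLAGS (hypothetical helper).

    jobs is None when no -j flag is present, 0 when -j is given without a
    number (unlimited), and the integer otherwise.  max_load is None or a float.
    """
    # Same trick as sage-spkg: a bare MAKEFLAGS such as "j50" becomes "-j50"
    # once a dash is prefixed, so a single pattern covers both spellings.
    text = " %s %s -%s " % (make, makeflags, makeflags)

    jobs = None
    m = re.search(r" -[A-Za-z]*j([0-9]*)", text)
    if m:
        jobs = int(m.group(1)) if m.group(1) else 0

    max_load = None
    number = r"([0-9]+(?:\.[0-9]+)?)"
    m = (re.search(r" -[A-Za-z]*l" + number, text)
         or re.search(r" --(?:load-average|max-load)=" + number, text))
    if m:
        max_load = float(m.group(1))
    return jobs, max_load
```

With this convention, `MAKEFLAGS='j -l2'` reads as "unlimited jobs, load limit 2", which is the case that triggered the zlib failure described above, while `MAKE='make -j16'` reads as "16 jobs, no load limit".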

jdemeyer commented 12 years ago
comment:45

Replying to @jhpalmieri:

In the following lines from sage-spkg

# Handle -n, -t, -q options for recursive make 
# See Trac #12016. 
if echo "$MAKE $MAKEFLAGS -$MAKEFLAGS" |grep -e ' -[A-Za-z]*[qnt]' >/dev/null; then 
    if echo "$MAKE $MAKEFLAGS -$MAKEFLAGS" |grep -e ' -[A-Za-z]*q' >/dev/null; then 
        exit 1 
    else 
        exit 0 
    fi 
fi 

do we also need to handle the long versions? (I don't think so, but I thought I would ask.)

Well, this would only be needed if the user does something very silly like

MAKE="make --dry-run" ./sage -f ...

More importantly, on OpenSolaris, or at least on David Kirkby's machine hawk, the default 'grep' command doesn't take a -e option. Can we just omit it?

Probably yes, but it might be safer to replace the leading space by a [ ].
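As a way to sanity-check the pattern independently of any particular grep, the same test can be sketched in Python. This is only an illustration under stated assumptions: the function name `recursive_make_action` is made up, and the patterns are the ones from the sage-spkg snippet with the leading space written as `[ ]`.

```python
import re

# Patterns from the sage-spkg snippet, with the leading space spelled "[ ]".
ANY_QNT = re.compile(r"[ ]-[A-Za-z]*[qnt]")
ONLY_Q = re.compile(r"[ ]-[A-Za-z]*q")

def recursive_make_action(make, makeflags):
    """Return None to carry on, else the exit status the snippet would use."""
    # Mirrors: echo "$MAKE $MAKEFLAGS -$MAKEFLAGS"
    text = "%s %s -%s" % (make, makeflags, makeflags)
    if ANY_QNT.search(text):
        # -q (question mode) exits 1; -n and -t exit 0 without building.
        return 1 if ONLY_Q.search(text) else 0
    return None
```

For example, `recursive_make_action("make", "j50")` returns `None`, `recursive_make_action("make", "n")` returns `0`, and `recursive_make_action("make --dry-run", "")` returns `None`, which matches the observation that the long options are not handled.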

jdemeyer commented 12 years ago

Attachment: 12016-scripts.patch.gz

jdemeyer commented 12 years ago

Description changed:

--- 
+++ 
@@ -1,4 +1,4 @@
-With the attached patches, along with the changes from #11959, the various parallel aspects of Sage should be controlled by setting the `-j` flag in `MAKE`.  That is, if `MAKE='make -j16'`, then
+The various parallel aspects of Sage should be controlled by setting the `-j` (possible also `-l`) flags in `MAKE` or `MAKEFLAGS`.  That is, if `MAKE='make -j16'`, then

 - running `make` will build spkg's in parallel, using 16 processes (this was done in #11959).  This is standard `make` behaviour, but we need to patch `spkg/standard/deps` to ensure that `make` recognizes that we are doing a recursive make.

@@ -6,15 +6,23 @@

 - running `./sage -b` will build the Sage library using 16 threads. If the `-j` flag in `MAKE` is not set, then use only 1 thread.

-In #6495, we should implement the same behavior for doc building.
-
-Concerning testing this ticket: you can set the environment variable `SAGE_NUM_CORES` to the number of cores you want to pretend to have.  For example, running
+**Testing this ticket**: you can set the environment variable `SAGE_NUM_CORES` to the number of cores you want to pretend to have.  For example, running

SAGE_NUM_CORES=24 make ptestlong

 should run 8 threads (see `sage-num-threads.py`; this is undocumented because the only purpose I see is for testing this ticket).

+**Notes**:
+With the patches applied, building spkgs in parallel works well, except for race conditions in:
+* python (#12096)
+* singular (#12137)
+* zlib (#12138)
+* mpir (#12139)
+and a "jobserver unavailable" warning in:
+* ntl
+* singular
+* rubiks

 **Apply**:
 1. [attachment: 12016-root.patch](https://github.com/sagemath/sage-prod/files/10654001/12016-root.patch.gz) to the `SAGE_ROOT` repository.
@@ -22,8 +30,4 @@
 3. [attachment: 12016-scripts.patch](https://github.com/sagemath/sage-prod/files/10654003/12016-scripts.patch.gz) to the `SCRIPTS` repository.
 4. [attachment: 12016-sage.patch](https://github.com/sagemath/sage-prod/files/10654002/12016-sage.patch.gz) to the Sage library.

-**Notes**:
-With the patches applied, building spkgs in parallel works well, except for a "jobserver unavailable" warning in:
-* ntl
-* singular
-* rubiks
+See also: #6495 to implement the same behavior for doc building.

jdemeyer commented 12 years ago
comment:46

Attachment: 12016-base.patch.gz

jdemeyer commented 12 years ago

Description changed:

--- 
+++ 
@@ -21,7 +21,7 @@
 * mpir (#12139)
 and a "jobserver unavailable" warning in:
 * ntl
-* singular
+* singular (#12137)
 * rubiks

 **Apply**:

jdemeyer commented 12 years ago

Changed dependencies from sage-4.8.alpha3 + #12096, #12137, #12138 to sage-4.8.alpha3 + #12096, #12137, #12138, #12139

jhpalmieri commented 12 years ago

Description changed:

--- 
+++ 
@@ -27,7 +27,7 @@
 **Apply**:
 1. [attachment: 12016-root.patch](https://github.com/sagemath/sage-prod/files/10654001/12016-root.patch.gz) to the `SAGE_ROOT` repository.
 2. [attachment: 12016-base.patch](https://github.com/sagemath/sage-prod/files/10654004/12016-base.patch.gz) to `spkg/base`.
-3. [attachment: 12016-scripts.patch](https://github.com/sagemath/sage-prod/files/10654003/12016-scripts.patch.gz) to the `SCRIPTS` repository.
+3. [attachment: 12016-scripts.patch](https://github.com/sagemath/sage-prod/files/10654003/12016-scripts.patch.gz) and [attachment: trac_12016-scripts-ref.patch](https://github.com/sagemath/sage-prod/files/10654005/trac_12016-scripts-ref.patch.gz) to the `SCRIPTS` repository.
 4. [attachment: 12016-sage.patch](https://github.com/sagemath/sage-prod/files/10654002/12016-sage.patch.gz) to the Sage library.

 See also: #6495 to implement the same behavior for doc building.

jhpalmieri commented 12 years ago
comment:49

I'm happy with this except for a few small changes I want to make in sage-num-threads.py: we should use subprocess instead of popen, since popen has been deprecated. Also, we should catch errors if sysctl fails to run -- it's not present on all platforms. Finally, we might as well search for max-load in addition to load-average. See the referee patch. If you're happy with that, the whole thing can get a positive review.
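The subprocess and error-handling points can be sketched roughly as follows. This is not the referee patch itself: the function name `ncpus_from_sysctl` and the `hw.ncpu` key are assumptions made for illustration, and the sketch is written for current Python 3 rather than the Python shipped with Sage at the time.

```python
import subprocess

def ncpus_from_sysctl(key="hw.ncpu"):
    """Return the CPU count reported by sysctl, or None if it is unavailable.

    The key "hw.ncpu" is an assumption; it exists on OS X and the BSDs.
    """
    try:
        out = subprocess.check_output(["sysctl", "-n", key],
                                      stderr=subprocess.DEVNULL)
        return int(out.decode().strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # OSError: sysctl is not installed on this platform.
        # CalledProcessError: sysctl ran but rejected the key.
        # ValueError: the output was not a plain integer.
        return None
```

A caller would then fall back to another detection method, or to a single thread, whenever the function returns `None`.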

jhpalmieri commented 12 years ago

Attachment: trac_12016-scripts-ref.patch.gz

scripts repo

jdemeyer commented 12 years ago
comment:50

Looks good to me.

I am still slightly worried about the intermittent sage0.py doctest failures though...

jdemeyer commented 12 years ago

Changed dependencies from sage-4.8.alpha3 + #12096, #12137, #12138, #12139 to sage-4.8.alpha4

jdemeyer commented 12 years ago

Merged: sage-4.8.alpha5

jdemeyer commented 12 years ago

Description changed:

--- 
+++ 
@@ -19,6 +19,7 @@
 * singular (#12137)
 * zlib (#12138)
 * mpir (#12139)
+* atlas on Solaris (#12312)
 and a "jobserver unavailable" warning in:
 * ntl
 * singular (#12137)