sagemath / sage

Main repository of SageMath
https://www.sagemath.org

Implement parallel f-vector for polytopes #31245

Closed kliem closed 3 years ago

kliem commented 3 years ago

This ticket parallelizes the f-vector for polytopes.

Each thread has its own private structure with which it processes partial jobs. Depending on the parallelization depth, there is one job per face of a fixed codimension (usually 1, 2, or 3). After all jobs have finished, the partial f-vectors are summed.
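The job structure can be illustrated with a toy example in plain Python. This is only a sketch of the idea and not the Cython/OpenMP code of this ticket: it counts the faces of the d-cube, encoded as words over `0`, `1`, `*` (a face's dimension is its number of `*`), and the names `partial_f_vector` and `cube_f_vector` are made up for illustration. Each job fixes a prefix of length `depth`, fills a private counter, and the partial results are summed at the end. Python threads will not give a real speedup here because of the GIL; the point is only the shape of the computation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partial_f_vector(d, prefix):
    """Count the non-empty faces of the d-cube whose word starts with `prefix`."""
    counts = [0] * (d + 1)  # private to this job, no shared state
    for rest in product("01*", repeat=d - len(prefix)):
        face = prefix + rest
        counts[face.count("*")] += 1  # dimension = number of free coordinates
    return counts


def cube_f_vector(d, num_threads=2, depth=1):
    """Sum the partial f-vectors of all jobs; requires depth <= d."""
    jobs = list(product("01*", repeat=depth))  # one job per prefix
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(lambda p: partial_f_vector(d, p), jobs))
    total = [sum(column) for column in zip(*partials)]
    return (1, *total)  # prepend the empty face


print(cube_f_vector(3))
```

Increasing `depth` creates more, smaller jobs, which mirrors the trade-off behind the parallelization depth described above.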

In fact, every face is visited, so the code could be modified in the future to explore properties of faces other than just the dimension. The parallelization seems to work well with at least 40 threads (for computations that take long enough for this to pay off, see https://arxiv.org/pdf/1905.01945.pdf).

The algorithm also works in other settings (simplicial complexes, the lattice of flats of a matroid), and this parallel structure could be used there as well.

On the downside, sig_on()/sig_off() doesn't work with multiple threads and has to be replaced by a simple sig_check(). Also, raising errors in parallel code results in a terrible slowdown. Hence the errors are replaced by returning the exception value. In case of a bug there won't be a traceback anymore, but at least there won't be a segmentation fault either.
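The "return a value instead of raising" pattern can be sketched in plain Python. This is an analogy under stated assumptions, not the Cython code of this ticket: the status codes `OK` and `FAILED` and the functions `job` and `run_all` are hypothetical. Workers never raise; they hand back a status, and the calling thread reports a failure once after all workers have finished.

```python
from concurrent.futures import ThreadPoolExecutor

OK, FAILED = 0, -1  # hypothetical status codes


def job(n):
    """Return (status, value) instead of raising inside the worker."""
    if n < 0:
        return FAILED, 0
    return OK, n * n


def run_all(inputs, num_threads=2):
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(job, inputs))
    # The error is reported once, on the calling thread, after all jobs are done.
    if any(status != OK for status, _ in results):
        raise ValueError("a worker reported failure")
    return sum(value for _, value in results)


print(run_all([1, 2, 3]))
```

The cost of this design is the one named above: the caller learns that something failed, but not where, so no traceback into the worker is available.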

Before:

sage: P = polytopes.permutahedron(5)
sage: P = P.base_extend(QQ)                                                                                                                                   
sage: Q = P.join(P.polar(in_affine_span=True))                                                                                                                
sage: C = CombinatorialPolyhedron(Q)                                                                                                                          
sage: %time C.f_vector()                                                                                                                                      
CPU times: user 111 ms, sys: 186 µs, total: 111 ms
Wall time: 111 ms
(1, 150, 3990, 25590, 69450, 95402, 69450, 25590, 3990, 150, 1)

sage: P = polytopes.Birkhoff_polytope(5)                                                                                                                      
sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector()                                                                                                                                      
CPU times: user 584 ms, sys: 25 µs, total: 584 ms
Wall time: 583 ms
(1, 120, 5040, 50250, 233400, 631700, 1113700, 1367040, 1220550, 817150, 419225, 167200, 52120, 12600, 2300, 300, 25, 1)

# Using the <simple> version of the algorithm.
sage: P = polytopes.associahedron(['A', 11], backend='normaliz')                                                                                              
sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector()                                                                                                                                      
CPU times: user 37.9 s, sys: 17.2 ms, total: 37.9 s
Wall time: 37.9 s
(1, 208012, 1144066, 2735810, 3730650, 3197700, 1790712, 659736, 157080, 23100, 1925, 77, 1)

After (machine has 4 cores):

sage: P = polytopes.permutahedron(5)                                                                                                                          
sage: P = P.base_extend(QQ)                                                                                                                                   
sage: Q = P.join(P.polar(in_affine_span=True))                                                                                                                
sage: C = CombinatorialPolyhedron(Q)                                                                                                                          
sage: %time C.f_vector(num_threads=1)                                                                                                                         
CPU times: user 107 ms, sys: 0 ns, total: 107 ms
Wall time: 107 ms
(1, 150, 3990, 25590, 69450, 95402, 69450, 25590, 3990, 150, 1)

sage: C = CombinatorialPolyhedron(Q)                                                                                                                          
sage: %time C.f_vector(num_threads=2)                                                                                                                         
CPU times: user 108 ms, sys: 0 ns, total: 108 ms
Wall time: 55.6 ms
(1, 150, 3990, 25590, 69450, 95402, 69450, 25590, 3990, 150, 1)

sage: C = CombinatorialPolyhedron(Q)                                                                                                                          
sage: %time C.f_vector(num_threads=4)                                                                                                                         
CPU times: user 147 ms, sys: 52 µs, total: 147 ms
Wall time: 38.6 ms
(1, 150, 3990, 25590, 69450, 95402, 69450, 25590, 3990, 150, 1)

sage: C = CombinatorialPolyhedron(Q)                                                                                                                          
sage: %time C.f_vector(num_threads=8)                                                                                                                         
CPU times: user 236 ms, sys: 0 ns, total: 236 ms
Wall time: 31.2 ms
(1, 150, 3990, 25590, 69450, 95402, 69450, 25590, 3990, 150, 1)

sage: P = polytopes.Birkhoff_polytope(5)                                                                                                                      
sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=1)                                                                                                                         
CPU times: user 354 ms, sys: 0 ns, total: 354 ms
Wall time: 354 ms
(1, 120, 5040, 50250, 233400, 631700, 1113700, 1367040, 1220550, 817150, 419225, 167200, 52120, 12600, 2300, 300, 25, 1)

sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=2)                                                                                                                         
CPU times: user 363 ms, sys: 0 ns, total: 363 ms
Wall time: 181 ms
(1, 120, 5040, 50250, 233400, 631700, 1113700, 1367040, 1220550, 817150, 419225, 167200, 52120, 12600, 2300, 300, 25, 1)

sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=4)                                                                                                                         
CPU times: user 459 ms, sys: 0 ns, total: 459 ms
Wall time: 117 ms
(1, 120, 5040, 50250, 233400, 631700, 1113700, 1367040, 1220550, 817150, 419225, 167200, 52120, 12600, 2300, 300, 25, 1)

sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=8)                                                                                                                         
CPU times: user 776 ms, sys: 154 µs, total: 776 ms
Wall time: 103 ms
(1, 120, 5040, 50250, 233400, 631700, 1113700, 1367040, 1220550, 817150, 419225, 167200, 52120, 12600, 2300, 300, 25, 1)

# Using the <simple> version of the algorithm.
sage: P = polytopes.associahedron(['A', 11], backend='normaliz')                                                                                              
sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=1)                                                                                                                         
CPU times: user 33.5 s, sys: 0 ns, total: 33.5 s
Wall time: 33.5 s
(1, 208012, 1144066, 2735810, 3730650, 3197700, 1790712, 659736, 157080, 23100, 1925, 77, 1)

sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=2)                                                                                                                         
CPU times: user 34.4 s, sys: 3.49 ms, total: 34.4 s
Wall time: 17.2 s
(1, 208012, 1144066, 2735810, 3730650, 3197700, 1790712, 659736, 157080, 23100, 1925, 77, 1)

sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=4)                                                                                                                         
CPU times: user 35.9 s, sys: 15.5 ms, total: 35.9 s
Wall time: 9 s
(1, 208012, 1144066, 2735810, 3730650, 3197700, 1790712, 659736, 157080, 23100, 1925, 77, 1)

sage: C = CombinatorialPolyhedron(P)                                                                                                                          
sage: %time C.f_vector(num_threads=8)                                                                                                                         
CPU times: user 1min 6s, sys: 31.3 ms, total: 1min 6s
Wall time: 8.44 s
(1, 208012, 1144066, 2735810, 3730650, 3197700, 1790712, 659736, 157080, 23100, 1925, 77, 1)
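
The wall times reported above can be turned into speedup and parallel efficiency figures. A small sketch using only the associahedron timings from this ticket; the helper names `speedup` and `efficiency` are made up for illustration:

```python
# Wall times in seconds for the associahedron example, keyed by num_threads.
wall_times = {1: 33.5, 2: 17.2, 4: 9.0, 8: 8.44}


def speedup(t_serial, t_parallel):
    """Ratio of single-thread wall time to multi-thread wall time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_threads):
    """Speedup divided by the number of threads (1.0 would be ideal scaling)."""
    return speedup(t_serial, t_parallel) / num_threads


for n, t in sorted(wall_times.items()):
    s = speedup(wall_times[1], t)
    e = efficiency(wall_times[1], t, n)
    print(f"{n} threads: speedup {s:.2f}, efficiency {e:.2f}")
```

On the 4-core machine, going from 4 to 8 threads barely changes the wall time (9 s versus 8.44 s), while the user CPU time nearly doubles (35.9 s versus 1min 6s).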

CC: @jplab @LaisRast @stumpc5 @tscrim

Component: geometry

Keywords: parallel f-vector

Author: Jonathan Kliem

Branch/Commit: 4c0a4ae

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/31245

mkoeppe commented 3 years ago

New commits:

4ae6966 Merge tag '9.4.beta0' into t/31245/first_parallel_version_of_face_iterator_reb2

vbraun commented 3 years ago
comment:42

Segfaults reliably on OSX

sage -t --long --random-seed=0 src/sage/geometry/polyhedron/combinatorial_polyhedron/base.pyx  # Killed due to segmentation fault
sage -t --long --random-seed=0 src/sage/geometry/polyhedron/base.py  # Killed due to segmentation fault

Debugger:

$ sage -sh
(sage-sh) sudo lldb ./local/bin/python3
(lldb) target create "./local/bin/python3"
Current executable set to '/Users/buildbot-sage/slave/sage_git/build/local/bin/python3' (x86_64).
(lldb) rfrom
error: 'rfrom' is not a valid command.
(lldb) r
Process 81901 launched: '/Users/buildbot-sage/slave/sage_git/build/local/bin/python3' (x86_64)
Python 3.9.5 (default, Jun  8 2021, 19:13:24) 
[Clang 12.0.5 (clang-1205.0.22.9)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sage.all import *
>>> line = Polyhedron(lines=[[0,1]])
>>> line.vertex_graph() ## line 7185 ##
base.cpython-39-darwin.so was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 81901 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001142ae934 base.cpython-39-darwin.so`__pyx_gb_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_24_compute_edges_or_ridges_5generator29(__pyx_generator=0x0000000114562160, __pyx_tstate=0x0000000100508140, __pyx_sent_value=0x00000001003a8580) at base.c:28817:74 [opt]
   28814      __pyx_t_2 = __pyx_t_1;
   28815      for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=1) {
   28816        __pyx_cur_scope->__pyx_v_i = __pyx_t_3;
-> 28817        __pyx_t_4 = ((PyObject *)__pyx_f_4sage_5rings_7integer_smallInteger((__pyx_cur_scope->__pyx_outer_scope->__pyx_v_f_vector[__pyx_cur_scope->__pyx_v_i]))); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 3071, __pyx_L1_error)
   28818        __Pyx_GOTREF(__pyx_t_4);
   28819        __pyx_r = __pyx_t_4;
   28820        __pyx_t_4 = 0;
Target 0: (python3) stopped.
(lldb) 
error: No auto repeat.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001142ae934 base.cpython-39-darwin.so`__pyx_gb_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_24_compute_edges_or_ridges_5generator29(__pyx_generator=0x0000000114562160, __pyx_tstate=0x0000000100508140, __pyx_sent_value=0x00000001003a8580) at base.c:28817:74 [opt]
    frame #1: 0x00000001033bc4fa cachefunc.cpython-39-darwin.so`__Pyx_Coroutine_SendEx(self=0x0000000114562160, value=0x00000001003a8580, closing=<unavailable>) at cachefunc.c:28545:14 [opt]
    frame #2: 0x00000001001573c0 libpython3.9.dylib`PySequence_Tuple + 144
    frame #3: 0x00000001142abde3 base.cpython-39-darwin.so`__pyx_f_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron__compute_edges_or_ridges(__pyx_v_self=<unavailable>, __pyx_v_dual=<unavailable>, __pyx_v_do_edges=<unavailable>) at base.c:29447:19 [opt]
    frame #4: 0x00000001142b7aa7 base.cpython-39-darwin.so`__pyx_pw_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_23edges [inlined] __pyx_f_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron__compute_edges(__pyx_v_self=<unavailable>, __pyx_v_dual=<unavailable>) at base.c:43309:15 [opt]
    frame #5: 0x00000001142b7a9b base.cpython-39-darwin.so`__pyx_pw_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_23edges [inlined] __pyx_pf_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_22edges(__pyx_v_self=0x0000000114235ca0, __pyx_v_names=0x0000000000000000) at base.c:14187 [opt]
    frame #6: 0x00000001142b742a base.cpython-39-darwin.so`__pyx_pw_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_23edges(__pyx_v_self=0x0000000114235ca0, __pyx_args=<unavailable>, __pyx_kwds=<unavailable>) at base.c:13668 [opt]
    frame #7: 0x00000001001ab7c5 libpython3.9.dylib`cfunction_call + 69
    frame #8: 0x00000001142ca4d7 base.cpython-39-darwin.so`__pyx_gb_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_12vertex_graph_2generator13 [inlined] __Pyx_PyObject_Call(func=0x0000000114569860, arg=<unavailable>, kw=0x0000000100722e80) at base.c:52465:14 [opt]
    frame #9: 0x00000001142ca49a base.cpython-39-darwin.so`__pyx_gb_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_12vertex_graph_2generator13(__pyx_generator=0x0000000112ae5b80, __pyx_tstate=<unavailable>, __pyx_sent_value=<unavailable>) at base.c:14600 [opt]
    frame #10: 0x00000001033bc4fa cachefunc.cpython-39-darwin.so`__Pyx_Coroutine_SendEx(self=0x0000000112ae5b80, value=0x00000001003a8580, closing=<unavailable>) at cachefunc.c:28545:14 [opt]
    frame #11: 0x00000001001573c0 libpython3.9.dylib`PySequence_Tuple + 144
    frame #12: 0x00000001142b86c8 base.cpython-39-darwin.so`__pyx_pw_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_25vertex_graph [inlined] __pyx_pf_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_24vertex_graph(__pyx_v_self=<unavailable>, __pyx_v_names=0x0000000000000000) at base.c:14808:15 [opt]
    frame #13: 0x00000001142b85bb base.cpython-39-darwin.so`__pyx_pw_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_4base_23CombinatorialPolyhedron_25vertex_graph(__pyx_v_self=<unavailable>, __pyx_args=<unavailable>, __pyx_kwds=<unavailable>) at base.c:14515 [opt]
    frame #14: 0x00000001001759c3 libpython3.9.dylib`method_vectorcall_VARARGS_KEYWORDS + 275
    frame #15: 0x000000010024bc1b libpython3.9.dylib`call_function + 411
    frame #16: 0x0000000100248c0b libpython3.9.dylib`_PyEval_EvalFrameDefault + 27147
    frame #17: 0x000000010016d5b5 libpython3.9.dylib`function_code_fastcall + 229
    frame #18: 0x000000010016f89c libpython3.9.dylib`method_vectorcall + 204
    frame #19: 0x000000010024bc1b libpython3.9.dylib`call_function + 411
    frame #20: 0x0000000100248c2e libpython3.9.dylib`_PyEval_EvalFrameDefault + 27182
    frame #21: 0x000000010024c9d4 libpython3.9.dylib`_PyEval_EvalCode + 2580
    frame #22: 0x0000000100242107 libpython3.9.dylib`PyEval_EvalCode + 87
    frame #23: 0x000000010028c45f libpython3.9.dylib`PyRun_InteractiveOneObjectEx + 847
    frame #24: 0x000000010028ba59 libpython3.9.dylib`PyRun_InteractiveLoopFlags + 169
    frame #25: 0x000000010028b97c libpython3.9.dylib`PyRun_AnyFileExFlags + 60
    frame #26: 0x00000001002a84ea libpython3.9.dylib`Py_RunMain + 2362
    frame #27: 0x00000001002a87ec libpython3.9.dylib`pymain_main + 348
    frame #28: 0x00000001002a883b libpython3.9.dylib`Py_BytesMain + 43
    frame #29: 0x00007fff20394621 libdyld.dylib`start + 1
    frame #30: 0x00007fff20394621 libdyld.dylib`start + 1

kliem commented 3 years ago
comment:43

Thanks Volker for the precise log.

Unfortunately, I can neither understand the problem nor reproduce it. I could just disable OpenMP with clang by default and hope this solves the problem.

kliem commented 3 years ago

Changed branch from u/mkoeppe/first_parallel_version_of_face_iterator_reb2 to u/gh-kliem/first_parallel_version_of_face_iterator_reb3

kliem commented 3 years ago

Changed commit from 4ae6966 to 4c0a4ae

kliem commented 3 years ago
comment:44

I think I tracked it down. I assumed that

cdef bint do_f_vector

is initialized to zero, which apparently isn't always the case.

Then Volker's report also makes sense.


New commits:

bfb4efb Merge branch 'u/mkoeppe/first_parallel_version_of_face_iterator_reb2' of git://trac.sagemath.org/sage into u/mkoeppe/first_parallel_version_of_face_iterator_reb3
4c0a4ae initialize do_f_vector

kliem commented 3 years ago

Changed dependencies from #31499 to none

kliem commented 3 years ago
comment:45

I really don't understand why this pops out just now and not earlier.

vbraun commented 3 years ago
comment:46

Uninitialized variables are whatever the RAM content is when they enter scope. However, any RAM region that you get from the OS is zeroed out (so you can't read content from older processes). So uninitialized variables tend to be zero initially, but the longer the program runs the more likely it becomes that the variable occupies a previously-used memory location.

More precisely, only global and static C variables are guaranteed to be initialized to zero, local variables are not.

Valgrind can detect these things for you (i.e. when you read uninitialized memory)

kliem commented 3 years ago
comment:47

Replying to @vbraun:

> Uninitialized variables are whatever the RAM content is when they enter scope. However, any RAM region that you get from the OS is zeroed out (so you can't read content from older processes). So uninitialized variables tend to be zero initially, but the longer the program runs the more likely it becomes that the variable occupies a previously-used memory location.
>
> More precisely, only global and static C variables are guaranteed to be initialized to zero, local variables are not.
>
> Valgrind can detect these things for you (i.e. when you read uninitialized memory)

Thank you for the explanation.

Once I found out where the problem lies, I wasn't that confused anymore. I originally thought it had something to do with the current ticket, but it doesn't; it just appears now, probably because the code is now stably compiled in a slightly different way.

In this case it is actually a mistake I made (one I knew about in theory when I made it). I never meant to assume anything about an uninitialized bint: there were two cases, and depending on the case I would set it to True or False. Then I added special handling for corner cases and forgot to initialize the bint.

I didn't know that RAM is zeroed out before the process, but once you mention it, of course this is how it ought to be.

tscrim commented 3 years ago
comment:48

Hopefully this will be the last iteration of this. Sorry for having to redo this a bunch of times, Volker; I just haven't been able to recreate any of the issues locally.

vbraun commented 3 years ago

Changed branch from u/gh-kliem/first_parallel_version_of_face_iterator_reb3 to 4c0a4ae