Closed kliem closed 3 years ago
Description changed:
---
+++
@@ -2,7 +2,7 @@
Each thread has its private structure with which it does partial jobs. Depending on the parallelization depth, there is one job per face of fixed codimension (usually 1,2 or 3). After everything has finished, the partial f-vectors will be added.
-Actually, every face is visited and thus the code could be modified in the future, to explore other properties of faces then just the dimension. The parallelization seems to work well with at least 40 threads.
+Actually, every face is visited and thus the code could be modified in the future, to explore other properties of faces then just the dimension. The parallelization seems to work well with at least 40 threads (for computations taking long enough such that this pays off, see https://arxiv.org/pdf/1905.01945.pdf).
Also the algorithm does work in other situations (simplicial complex, lattice of flats of a matroid) and this parallel structure could be used for this as well.
The sig_check()
outside the loop is unnecessary:
cdef size_t j
+ sig_check()
for j in range(n_faces):
+ sig_check()
if (is_not_maximal_fused(new_faces, j, algorithm) or # Step 2
is_contained_in_one_fused(new_faces.faces[j], visited_all, algorithm)): # Step 3
is_not_new_face[j] = True
Likewise here:
face_list_t visited_all) nogil except -1:
cdef size_t output
- sig_on()
+ sig_check()
if faces.polyhedron_is_simple:
output = get_next_level_fused(faces, new_faces, visited_all, <simple> 0)
else:
output = get_next_level_fused(faces, new_faces, visited_all, <standard> 0)
- sig_off()
return output
cdef inline size_t bit_rep_to_coatom_rep(face_t face, face_list_t coatoms, size_t *output):
Here you either need the sig_on/sig_off
or some more targeted checking. I don't see the point in performing a single check here.
Otherwise LGTM.
Reviewer: Travis Scrimshaw
Branch pushed to git repo; I updated commit sha1. New commits:
0dae7a0 | remove redundant sig_checks |
Dependencies: #31226
Replying to @tscrim:
The
sig_check()
outside the loop is unnecessary:cdef size_t j + sig_check() for j in range(n_faces): + sig_check() if (is_not_maximal_fused(new_faces, j, algorithm) or # Step 2 is_contained_in_one_fused(new_faces.faces[j], visited_all, algorithm)): # Step 3 is_not_new_face[j] = True
Likewise here:
face_list_t visited_all) nogil except -1: cdef size_t output - sig_on() + sig_check() if faces.polyhedron_is_simple: output = get_next_level_fused(faces, new_faces, visited_all, <simple> 0) else: output = get_next_level_fused(faces, new_faces, visited_all, <standard> 0) - sig_off() return output cdef inline size_t bit_rep_to_coatom_rep(face_t face, face_list_t coatoms, size_t *output):
Here you either need the
sig_on/sig_off
or some more targeted checking. I don't see the point in performing a single check here.Otherwise LGTM.
I removed the redundant sig_checks.
As mentioned, sig_on, sig_off doesn't work well with parallelization. It crashes hard because sig_on, sig_off doesn't appear in correct order anymore.
Wrapping prange
in sig_on/sig_off has a large performance penalty (8.45 s -> 9 s).
Currently there is one sig_check
before checking that a face is inclusion maximal. One can add more sig checks, which wouldn't affect performance much for large examples, but for the small examples there is a penalty of about 50 percent.
I think the current solution should be fine: The largest thing I ever computed with this algorithm, which took about a month on 40 cores, had 11,665,781 vertices and 162 facets. In this case between the sig_checks A &~ B == 0
is performed for about 2 billion bits. Currently, without intrinsics, I get:
sage: B = Bitset((randint(0,2**20) for _ in range(2**19)))
sage: %timeit B.issubset(B)
6.18 µs ± 9.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
sage: %timeit B.intersection_update(B)
6.02 µs ± 2.72 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
On this rate performing this for 2 billion bits should take about 12 ms. And even if the example has many more coatoms, in the worst case (completely unrealistic, because the dimension is not 1 and more memory is needed for other dimensions and probably for other threads), there are bitwise operations performed twice for the entire RAM. Say, we have 1 TB RAM, this would then take 96 seconds. Okay, that is long, but in any realistic sceanrio it would only take about a second, even with 1 TB RAM.
To sum up, I think the current amount of sig_checks
suffices for the next years until more than 1 TB RAM is common (and even then, it doubt that there is an application for such a large example, as it would take forever to compute).
Rebased.
I didn't have this ticket here ready when writing #30445.
Eventually, one could also store edges/ridges or any sort of face in parallel. This would also help with applications for simplicial complexes and lattice of flats of manifolds.
But that should be left for another time.
New commits:
6a2a9b4 | fix merge conflict with develop in particual #30445 |
928ea60 | remove redundant allocation |
Thank you. This is quite a nice improvement.
Thank you.
On the kucalc buildbot:
[sagelib-9.3.beta6] [1/1] gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wno-unused -O2 -g -march=native -fPIC -I/var/lib/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/cysignals -I./sage/data_structures -I./sage/cpython -Isage/cpython -Isage/data_structures -I/var/lib/buildbot/slave/sage_git/build/build/pkgs/sagelib/src -I/var/lib/buildbot/slave/sage_git/build/local/include/python3.9 -I/var/lib/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/numpy/core/include -Ibuild/cythonized -I/var/lib/buildbot/slave/sage_git/build/local/include/python3.9 -c build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c -o build/temp.linux-x86_64-3.9/build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.o -fno-strict-aliasing -DCYTHON_CLINE_IN_TRACEBACK=1 -fopenmp -std=c99
[sagelib-9.3.beta6] In file included from build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:660:0:
[sagelib-9.3.beta6] /var/lib/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/cysignals/struct_signals.h:45:1: sorry, unimplemented: ‘_Atomic’ with OpenMP
[sagelib-9.3.beta6] typedef volatile _Atomic int cy_atomic_int;
[sagelib-9.3.beta6] ^
[sagelib-9.3.beta6] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c: In function ‘__pyx_f_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_13face_iterator_prepare_face_iterator_for_partial_job’:
[sagelib-9.3.beta6] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9035:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[sagelib-9.3.beta6] __pyx_t_2 = ((__pyx_v_structure->current_dimension == (__pyx_v_structure->dimension - __pyx_v_parallelization_depth)) != 0);
[sagelib-9.3.beta6] ^
[sagelib-9.3.beta6] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9322:84: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[sagelib-9.3.beta6] __pyx_t_1 = (((__pyx_v_parallel_struct->current_job_id[__pyx_v_current_depth]) == -1L) != 0);
[sagelib-9.3.beta6] ^
[sagelib-9.3.beta6] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9673:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[sagelib-9.3.beta6] __pyx_t_1 = ((__pyx_v_structure->current_dimension != ((__pyx_v_structure->dimension - __pyx_v_parallelization_depth) - 1)) != 0);
[sagelib-9.3.beta6] ^
[sagelib-9.3.beta6] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c: In function ‘__Pyx_InitGlobals’:
[sagelib-9.3.beta6] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:23283:1: warning: ‘PyEval_InitThreads’ is deprecated [-Wdeprecated-declarations]
[sagelib-9.3.beta6] PyEval_InitThreads();
[sagelib-9.3.beta6] ^
[sagelib-9.3.beta6] In file included from /var/lib/buildbot/slave/sage_git/build/local/include/python3.9/Python.h:145:0,
[sagelib-9.3.beta6] from build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:52:
[sagelib-9.3.beta6] /var/lib/buildbot/slave/sage_git/build/local/include/python3.9/ceval.h:130:37: note: declared here
[sagelib-9.3.beta6] Py_DEPRECATED(3.9) PyAPI_FUNC(void) PyEval_InitThreads(void);
[sagelib-9.3.beta6] ^
[sagelib-9.3.beta6] error: command '/usr/bin/gcc' failed with exit code 1
This appears to be an issue with python 3.5 to 3.7 according to https://bugs.python.org/issue25150
From this I get that this can be "fixed" by doing this specific module as a C++ extension (or by requiring system python >= 3.8, but that isn't an option for the moment).
I'll check if I can reproduce this with system python 3.7 (need to rebuild first).
Actually, that is a problem with cysignals that should be fixed there.
Branch pushed to git repo; I updated commit sha1. New commits:
baf80de | try disabling CYSIGNALS_C_ATOMIC |
According to
https://github.com/sagemath/cysignals/blob/master/src/cysignals/struct_signals.h
and
https://github.com/sagemath/cysignals/blob/master/src/cysignals/cysignals_config.h.in
just undefininig CYSIGNALS_C_ATOMIC
should fix this.
Back off to the builbots.
Build fails on various buildbots:
[sagelib-9.3.beta7] [1/1] gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wno-unused -O2 -g -march=native -O2 -g -march=native -fPIC -UCYSIGNALS_C_ATOMIC -I/home/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/cysignals -I./sage/data_structures -I./sage/cpython -Isage/cpython -I/home/buildbot/slave/sage_git/build/build/pkgs/sagelib/src -I/home/buildbot/slave/sage_git/build/local/include/python3.9 -I/home/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/numpy/core/include -Ibuild/cythonized -I/home/buildbot/slave/sage_git/build/local/include/python3.9 -c build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c -o build/temp.linux-x86_64-3.9/build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.o -fno-strict-aliasing -DCYTHON_CLINE_IN_TRACEBACK=1 -fopenmp -std=c99
[sagelib-9.3.beta7] In file included from build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:662:0:
[sagelib-9.3.beta7] /home/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/cysignals/struct_signals.h:45:1: sorry, unimplemented: ‘_Atomic’ with OpenMP
[sagelib-9.3.beta7] typedef volatile _Atomic int cy_atomic_int;
[sagelib-9.3.beta7] ^~~~~~~
[sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c: In function ‘__pyx_f_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_13face_iterator_prepare_face_iterator_for_partial_job’:
[sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9037:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[sagelib-9.3.beta7] __pyx_t_2 = ((__pyx_v_structure->current_dimension == (__pyx_v_structure->dimension - __pyx_v_parallelization_depth)) != 0);
[sagelib-9.3.beta7] ^~
[sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9324:84: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[sagelib-9.3.beta7] __pyx_t_1 = (((__pyx_v_parallel_struct->current_job_id[__pyx_v_current_depth]) == -1L) != 0);
[sagelib-9.3.beta7] ^~
[sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9675:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[sagelib-9.3.beta7] __pyx_t_1 = ((__pyx_v_structure->current_dimension != ((__pyx_v_structure->dimension - __pyx_v_parallelization_depth) - 1)) != 0);
[sagelib-9.3.beta7] ^~
[sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c: In function ‘__Pyx_InitGlobals’:
[sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:23285:1: warning: ‘PyEval_InitThreads’ is deprecated [-Wdeprecated-declarations]
[sagelib-9.3.beta7] PyEval_InitThreads();
[sagelib-9.3.beta7] ^~~~~~~~~~~~~~~~~~
[sagelib-9.3.beta7] In file included from /home/buildbot/slave/sage_git/build/local/include/python3.9/Python.h:145:0,
[sagelib-9.3.beta7] from build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:54:
[sagelib-9.3.beta7] /home/buildbot/slave/sage_git/build/local/include/python3.9/ceval.h:130:37: note: declared here
[sagelib-9.3.beta7] Py_DEPRECATED(3.9) PyAPI_FUNC(void) PyEval_InitThreads(void);
[sagelib-9.3.beta7] ^~~~~~~~~~~~~~~~~~
[sagelib-9.3.beta7] error: command '/usr/bin/gcc' failed with exit code 1
[sagelib-9.3.beta7]
[sagelib-9.3.beta7] real 0m3.325s
[sagelib-9.3.beta7] user 0m2.798s
[sagelib-9.3.beta7] sys 0m0.563s
Makefile:2233: recipe for target 'sagelib-no-deps' failed
Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:
da65d28 | Revert "try disabling CYSIGNALS_C_ATOMIC" |
Changed dependencies from #31226 to #31226, #31455
I don't think this is going to happen with 9.3 anymore, if this is waiting on a cysignals patch/update.
Branch pushed to git repo; I updated commit sha1. New commits:
405ef3e | Merge branch 'u/gh-kliem/first_parallel_version_of_face_iterator_reb' of git://trac.sagemath.org/sage into u/gh-kliem/first_parallel_version_of_face_iterator_reb2 |
This bug isn't too hard to reproduce actually:
https://github.com/kliem/sage/runs/2064520070?check_suite_focus=true
Seems to be connected to older GCC.
Replying to @vbraun:
Build fails on various buildbots:
[sagelib-9.3.beta7] [1/1] gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wno-unused -O2 -g -march=native -O2 -g -march=native -fPIC -UCYSIGNALS_C_ATOMIC -I/home/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/cysignals -I./sage/data_structures -I./sage/cpython -Isage/cpython -I/home/buildbot/slave/sage_git/build/build/pkgs/sagelib/src -I/home/buildbot/slave/sage_git/build/local/include/python3.9 -I/home/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/numpy/core/include -Ibuild/cythonized -I/home/buildbot/slave/sage_git/build/local/include/python3.9 -c build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c -o build/temp.linux-x86_64-3.9/build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.o -fno-strict-aliasing -DCYTHON_CLINE_IN_TRACEBACK=1 -fopenmp -std=c99 [sagelib-9.3.beta7] In file included from build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:662:0: [sagelib-9.3.beta7] /home/buildbot/slave/sage_git/build/local/lib/python3.9/site-packages/cysignals/struct_signals.h:45:1: sorry, unimplemented: ‘_Atomic’ with OpenMP [sagelib-9.3.beta7] typedef volatile _Atomic int cy_atomic_int; [sagelib-9.3.beta7] ^~~~~~~ [sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c: In function ‘__pyx_f_4sage_8geometry_10polyhedron_24combinatorial_polyhedron_13face_iterator_prepare_face_iterator_for_partial_job’: [sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9037:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] [sagelib-9.3.beta7] __pyx_t_2 = ((__pyx_v_structure->current_dimension == (__pyx_v_structure->dimension - __pyx_v_parallelization_depth)) != 0); [sagelib-9.3.beta7] ^~ [sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9324:84: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] [sagelib-9.3.beta7] __pyx_t_1 = (((__pyx_v_parallel_struct->current_job_id[__pyx_v_current_depth]) == -1L) != 0); [sagelib-9.3.beta7] ^~ [sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:9675:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] [sagelib-9.3.beta7] __pyx_t_1 = ((__pyx_v_structure->current_dimension != ((__pyx_v_structure->dimension - __pyx_v_parallelization_depth) - 1)) != 0); [sagelib-9.3.beta7] ^~ [sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c: In function ‘__Pyx_InitGlobals’: [sagelib-9.3.beta7] build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:23285:1: warning: ‘PyEval_InitThreads’ is deprecated [-Wdeprecated-declarations] [sagelib-9.3.beta7] PyEval_InitThreads(); [sagelib-9.3.beta7] ^~~~~~~~~~~~~~~~~~ [sagelib-9.3.beta7] In file included from /home/buildbot/slave/sage_git/build/local/include/python3.9/Python.h:145:0, [sagelib-9.3.beta7] from build/cythonized/sage/geometry/polyhedron/combinatorial_polyhedron/face_iterator.c:54: [sagelib-9.3.beta7] /home/buildbot/slave/sage_git/build/local/include/python3.9/ceval.h:130:37: note: declared here [sagelib-9.3.beta7] Py_DEPRECATED(3.9) PyAPI_FUNC(void) PyEval_InitThreads(void); [sagelib-9.3.beta7] ^~~~~~~~~~~~~~~~~~ [sagelib-9.3.beta7] error: command '/usr/bin/gcc' failed with exit code 1 [sagelib-9.3.beta7] [sagelib-9.3.beta7] real 0m3.325s [sagelib-9.3.beta7] user 0m2.798s [sagelib-9.3.beta7] sys 0m0.563s Makefile:2233: recipe for target 'sagelib-no-deps' failed
Changed dependencies from #31226, #31455 to #31474
This might need more work, as the flag -fopenmp
isn't understood everywhere: https://github.com/kliem/sage/pull/40/checks?check_run_id=2076988665
So we should probably check out at configure time, whether our compiler supports this, similar to https://github.com/sagemath/sage-prod/issues/27122.
Changed dependencies from #31474 to #31474, #31499
Branch pushed to git repo; I updated commit sha1. New commits:
55a4634 | merge with current develop |
6edd48b | define cython aliases OPENMP_CFLAGS and OPENMP_CXXFLAGS |
71f8950 | fixes to get the aliases to work |
b76dc3c | add doctest that shows how use cython.parallel.prange |
07cb470 | Merge branch 'u/gh-kliem/check_openmp_at_configuration' of git://trac.sagemath.org/sage into u/gh-kliem/first_parallel_version_of_face_iterator_reb |
27baa07 | use OPENMP_CFLAGS |
Thanks again.
Merge conflict
New commits:
2474ccc | Merge branch 'u/gh-kliem/first_parallel_version_of_face_iterator_reb' of git://trac.sagemath.org/sage into u/gh-kliem/first_parallel_version_of_face_iterator_reb2 |
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
499af5d | remove SAGE_HAVE_OPENMP |
7b3da21 | move to sage_conf.py.in |
927291f | simplifications |
d71c54a | set default |
7d3f230 | more solid check |
4f42440 | sage.env.cython_aliases: Do not fail if one of the listed libraries is not known to pkgconfig |
3cb8f90 | sage.env.cython_aliases: Make module lapack optional; generalize the interface |
c3cc31e | merge in 31384 |
70a8791 | Merge branch 'u/gh-kliem/check_openmp_at_configuration' of git://trac.sagemath.org/sage into u/gh-kliem/first_parallel_version_of_face_iterator_reb2 |
fdbe95f | use cython.parallel.threadid instead of omp_get_thread_num |
I'm not sure this was actually a merge conflict.
But even on my laptop with OPENMP support I got missing symbols. Don't exactly understand this.
However, we can't use openmp.omp_get_num_threads
for obvious reasons, because it doesn't have to be defined. Instead we should use cython.parallel.threadid
, which takes care of this. It is supposed to work, just like prange.
Works for me. Let's try this again...
Merge conflict
Changed dependencies from #31474, #31499 to #31499
This ticket parallelizes the f-vector for polytopes.
Each thread has its private structure with which it does partial jobs. Depending on the parallelization depth, there is one job per face of fixed codimension (usually 1,2 or 3). After everything has finished, the partial f-vectors will be added.
Actually, every face is visited and thus the code could be modified in the future, to explore other properties of faces then just the dimension. The parallelization seems to work well with at least 40 threads (for computations taking long enough such that this pays off, see https://arxiv.org/pdf/1905.01945.pdf).
Also the algorithm does work in other situations (simplicial complex, lattice of flats of a matroid) and this parallel structure could be used for this as well.
On the downside,
sig_on()/sig_off()
doesn't work with with multiple threads and has to be replaced by a simplesig_check()
. Also raising errors in parallel code results in terrible slow down. Hence the errors are replaced by returning the exception value. In case of a bug there won't be any traceback anymore, but at least also no segmenation fault.Before:
After (machine has 4 cores):
CC: @jplab @LaisRast @stumpc5 @tscrim
Component: geometry
Keywords: parallel f-vector
Author: Jonathan Kliem
Branch/Commit:
4c0a4ae
Reviewer: Travis Scrimshaw
Issue created by migration from https://trac.sagemath.org/ticket/31245