sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.34k stars 452 forks source link

Remove all the bitset access from cython files in combinatorial polyhedron #30549

Closed kliem closed 3 years ago

kliem commented 4 years ago

This goal of this ticket is to use data_structures/biteset.pxi for the bitset stuff in combinatorial polyhedron and cleanly seperate algorithm and data structure.

This ticket is mostly resolved in smaller tickets:

Changes are:

  1. Move functions optimizable by intrinsics to a seperate file (src/sage/data_structures/bitset_intrinsics.h). (See #27103 and #27122.)
  2. Allow most functions in bitset.pxi for fuzed types fuzed types, such that an (yet to be enriched) bitset data structure specialized for combinatorial faces can use it.
  3. Move bitset.pxi to bitset_base.pxd.
  4. Define a data structure face_s that gathers properties of a face and corresponding functions.
  5. Define a data structure face_list_s that gathers properties of a list of faces and corresponding functions.
  6. The two flavors of the algorithm (simple and standard) are set up as templates by abusing fuzed types. This makes the code much easier to read. Only the changes will be seperated in the code while the compiler compiles two seperate branches for both flavors.

The features of those changes are mainly:

Depends on #30529

CC: @jplab @LaisRast @tscrim

Component: geometry

Keywords: combinatorial polyhedron, bitsets

Author: Jonathan Kliem

Branch/Commit: 82a747d

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/30549

kliem commented 4 years ago
comment:1

It's a patch bomb, I know. If anyone knows a reasonable way around it, I'll do it.

Currently, this is working again and I'm happy. And it's so much easier to do some something from then on.

Most of the changes in the cython files are trivial (took me a while to figure out everything though).

The current branch is a bit of regression with non-simple-non-simplicial polyhedra. It can take much longer. One thing is the unnecessary union of coatoms. Also the branch prediction seems to be bad, because one could know much earlier that the coatom check is useless. I'll think about a solution that works well.

We waste a bit of memory by also allocating enough space for the coatoms in each case, but I wouldn't worry about it for now. In a worst case scenario this needs twice as much memory and much less in many scenarios that have less equal number of vertices/facets. Memory was never an issue for me, it was always time.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b7fee53factor out the algorithm variant with templates
fd5edebone more inline
4c2cc9falways compile all dependend files
349e105documentation
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from 1e4596e to 349e105

kliem commented 4 years ago
comment:3

I'm happy with the ticket as it is.

Using templates for the two algorithmic variants, the branch prediction has improved significantly. For extreme cases (see below), things got a bit slower. I don't know why.

In a future ticket one can spare allocating the memory for the coatoms in case we are not using the simple/simplicial algorithm. It doesn't seem to make a difference in runtime.

This is an example of slight slowdown:

sage: P = polytopes.permutahedron(6)                                                                                                                                  
sage: Q = P.base_extend(QQ).polar(in_affine_span=True)                                                                                                                
sage: R = P.join(Q)                                                                                                                                                   
sage: C = CombinatorialPolyhedron(R)                                                                                                                                  
sage: %time _ = C.f_vector()                                                                                                                                              
CPU times: user 1min 12s, sys: 28.8 ms, total: 1min 12s
Wall time: 1min 12s

This is 10 seconds faster on #30040.

I could also play around with constants etc.

This could also all be because the compiler ignores my inline instructions in places.

Anyway, I can live with the slight slowdown for now, because the structure is so much better for further improvements.

Any suggestions on how to improve things are welcome.

kliem commented 4 years ago
comment:4

https://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html

If this is better, I could move almost everything back to cython again.

The only thing that really needs to be in C/C++ are those 4 functions that I want to optimize with intrinsics later on (there are going to be a few more, but they are all tiny). To my knowledge you cannot do something like

#if __AVX__
def foo(): return 1
#elif __SSE4_1__
def foo(): return 2
#else
def foo(): return 3
#endif

in cython.

However I don't know how the performance is, when you include critical C/C++ function via cdef extern from that will be called a huge number of times.

kliem commented 4 years ago
comment:5

Also I'm happy with it now, the current status would force to move quite a lot of bitsets to C++. If I want to do it (which I'm not sure of yet) I would need to ask other people about it.

So for now this is a design proposal.

kliem commented 4 years ago

Changed commit from 349e105 to 74230a1

kliem commented 4 years ago

Changed branch from u/gh-kliem/face_structure to u/gh-kliem/use_face_structure

kliem commented 4 years ago
comment:6

I like it much better now.

Timings with this ticket (branch u/gh-kliem/use_face_structure):

(Note that I didn't apply new tricks to the algorithm. Only the data structure has changed.)

sage: P = polytopes.permutahedron(8, backend='normaliz')                                                                                                                            
sage: C = CombinatorialPolyhedron(P)                                                                                                                                                
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 648 ms, sys: 15 µs, total: 648 ms
Wall time: 648 ms

sage: P = polytopes.associahedron(['A',10], backend='normaliz')                                                                                                                     
sage: C = CombinatorialPolyhedron(P)                                                                                                                                                
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 1.97 s, sys: 0 ns, total: 1.97 s
Wall time: 1.97 s

sage: P = polytopes.permutahedron(7)                                                                                                                                                
sage: P1 = P.stack(next(P.face_generator(1)))                                                                                                                                       
sage: C = CombinatorialPolyhedron(P1)                                                                                                                                               
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 242 ms, sys: 0 ns, total: 242 ms
Wall time: 241 ms

sage: P = polytopes.hypercube(12)                                                                                                                                                   
sage: P1 = P.stack(next(P.face_generator(1)))                                                                                                                                       
sage: C = CombinatorialPolyhedron(P1)                                                                                                                                               
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 430 ms, sys: 0 ns, total: 430 ms
Wall time: 429 ms

sage: P = polytopes.permutahedron(6)                                                                                                                                                
sage: Q = P.base_extend(QQ).polar(in_affine_span=True)                                                                                                                              
sage: R = P.join(Q)                                                                                                                                                                 
sage: C = CombinatorialPolyhedron(R)                                                                                                                                                
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 1min 11s, sys: 31.8 ms, total: 1min 11s
Wall time: 1min 11s

Timings on #30040:

sage: P = polytopes.permutahedron(8, backend='normaliz')                                                                                                                            
sage: C = CombinatorialPolyhedron(P)                                                                                                                                                
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 680 ms, sys: 0 ns, total: 680 ms
Wall time: 679 ms

sage: P = polytopes.associahedron(['A',10], backend='normaliz')                                                                                                                     
sage: C = CombinatorialPolyhedron(P)                                                                                                                                                
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 2.2 s, sys: 45 µs, total: 2.2 s
Wall time: 2.2 s

sage: P = polytopes.permutahedron(7)                                                                                                                                                
sage: P1 = P.stack(next(P.face_generator(1)))                                                                                                                                       
sage: C = CombinatorialPolyhedron(P1)                                                                                                                                               
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 240 ms, sys: 0 ns, total: 240 ms
Wall time: 239 ms

sage: P = polytopes.hypercube(12)                                                                                                                                                   
sage: P1 = P.stack(next(P.face_generator(1)))                                                                                                                                       
sage: C = CombinatorialPolyhedron(P1)                                                                                                                                               
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 414 ms, sys: 0 ns, total: 414 ms
Wall time: 413 ms

sage: P = polytopes.permutahedron(6)                                                                                                                                                
sage: Q = P.base_extend(QQ).polar(in_affine_span=True)                                                                                                                              
sage: R = P.join(Q)                                                                                                                                                                 
sage: C = CombinatorialPolyhedron(R)                                                                                                                                                
sage: %time _ = C.f_vector()                                                                                                                                                        
CPU times: user 1min 4s, sys: 51.6 ms, total: 1min 4s
Wall time: 1min 4s

So I would say that there is no performance-wise reason to reject the changes.

The branch indicates a bit, how this could be split into seperate tickets that are doable to review.


Last 10 new commits:

bb49546face_list_t
bc8f985add find and sort
43ee71dfixes to list_of_faces.pxi
31544ffmerge u/gh-kliem/no_more_basic_access
d42e5abremove new declarations for now
7a32286removed deprecated functions
af27e02remove doctests that rely on implementation details
fe30622Merge branch 'u/gh-kliem/prepare_conversions_for_face_structure' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure_2
df3a254delete old files
74230a1use face_list_t and face_t for combinatorial polyhedron
kliem commented 4 years ago

Description changed:

--- 
+++ 
@@ -1,13 +1,23 @@
-This ticket removes all the top level bitset access. The algorithm is cleanly separated from the data structure now.
+This goal of this ticket is to use `data_structures/biteset.pxi` for the bitset stuff in combinatorial polyhedron and cleanly seperate algorithm and data structure.

-There is a new file `bitsets.cpp` (already introduced in #30528). This file should eventually be merged with `data_structures/bitsets.pxd` (and now one can see that such a thing is possible).
+This ticket should eventually be resolved in many smaller tickets.

-There is a file `face.cpp` that grabs what seems to be useful to define a face.
+Changes are:

-There is `face_list.cpp`. Here a list of faces is defined as a structure and some common methods that make sense for lists of faces are defined (such as intersect a face with a list of other faces and `get_next_level`).
+1. Move functions optimizable by intrinsics to a seperate file (`  src/sage/data_structures/bitset_intrinsics.h`). (See #27103 and #27122.)

-There are two new structures: `face_list_struct` and `face_struct`. `face_list_struct` is more or less public. `face_struct` is pretty much private. Only the provided functions should be used with it.
+2. Allow most functions in `bitset.pxi` for fuzed types [fuzed types](https://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html,), such that an (yet to be enriched) bitset data structure specialized for combinatorial faces can use it.

-The signatures of functions/methods became a lot easier. Cython files do not have to worry about alignment or `face_length` etc. Memory allocation is still done by `MemoryAllocator`. The low level file iteratively requests all the memory to define a face.
+3. Define a data structure `face_s` that gathers properties of a face and corresponding functions.

-With this ticket, one can easily adapt the data structure for combinatorial faces to almost anything one likes. Hopefully without changing the cython files.
+4. Define a data structure `face_list_s` that gathers properties of a list of faces and corresponding functions.
+
+5. The two flavors of the algorithm (simple and standard) are set up as templates by abusing fuzed types. This makes the code much easier to read. Only the changes will be seperated in the code while the compiler compiles two seperate branches for both flavors.
+
+The features of those changes are mainly:
+
+- Less involved signatures of functions.
+- Removing duplications of code.
+- New flavors of the algorithm can be implemented without messing up all the files.
+- We may enrich `face_s` without changing all our code.
+- Intrinsics can easily applied to bitsets in general.
kliem commented 4 years ago

Changed dependencies from #30528, #30529 to #30528, #30529, #30572

kliem commented 4 years ago
comment:9

I posted on sage devel on whether the proposed design is acceptable: https://groups.google.com/g/sage-devel/c/jD9BY5Ozlko

tscrim commented 4 years ago
comment:10

I am essentially happy with this ticket, but I want to wait and see what happens with the design discussion before going forward.

kliem commented 4 years ago

Changed dependencies from #30528, #30529, #30572 to #30528, #30529, #30572, #30571

kliem commented 4 years ago

Changed dependencies from #30528, #30529, #30572, #30571 to #30528, #30529, #30571, #30599

kliem commented 4 years ago

Description changed:

--- 
+++ 
@@ -1,18 +1,20 @@
 This goal of this ticket is to use `data_structures/biteset.pxi` for the bitset stuff in combinatorial polyhedron and cleanly seperate algorithm and data structure.

-This ticket should eventually be resolved in many smaller tickets.
+This ticket is mostly resolved in smaller tickets:
+- #30596: Outsource functions in bitset.pxi that can be optimized by intrinsics 
+- #30597: Define a sparse bitset structure 
+- #30601: Move bitset.pxi to bitset_base.pxd
+- #30598: Define a new data structure for a combinatorial face 
+- #30599: Define a new data structure for a list of combinatorial faces   - 

 Changes are:

 1. Move functions optimizable by intrinsics to a seperate file (`  src/sage/data_structures/bitset_intrinsics.h`). (See #27103 and #27122.)
-
 2. Allow most functions in `bitset.pxi` for fuzed types [fuzed types](https://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html,), such that an (yet to be enriched) bitset data structure specialized for combinatorial faces can use it.
-
-3. Define a data structure `face_s` that gathers properties of a face and corresponding functions.
-
-4. Define a data structure `face_list_s` that gathers properties of a list of faces and corresponding functions.
-
-5. The two flavors of the algorithm (simple and standard) are set up as templates by abusing fuzed types. This makes the code much easier to read. Only the changes will be seperated in the code while the compiler compiles two seperate branches for both flavors.
+3. Move `bitset.pxi` to `bitset_base.pxd`.
+4. Define a data structure `face_s` that gathers properties of a face and corresponding functions.
+5. Define a data structure `face_list_s` that gathers properties of a list of faces and corresponding functions.
+6. The two flavors of the algorithm (simple and standard) are set up as templates by abusing fuzed types. This makes the code much easier to read. Only the changes will be seperated in the code while the compiler compiles two seperate branches for both flavors.

 The features of those changes are mainly:
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from 74230a1 to f3e851f

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

02eabffadd find and sort
50317b0fixes to list_of_faces.pxi
32cf17eremove pxi file
c81f438move not inlined functions to pyx file
0a3df96Merge branch 'u/gh-kliem/no_more_basic_access' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure_2
5b21614use reference so simplify code
85708b0Merge branch 'u/gh-kliem/simplify_face_struct_pointer' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure_2
2f75805Merge branch 'u/gh-kliem/prepare_conversions_for_face_structure' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure_2
605fc6bdelete old files
f3e851fuse face_list_t and face_t for combinatorial polyhedron
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from f3e851f to 298c2f7

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

7a40b9aredundant exit; removed semicolon
28bfa02Merge branch 'u/gh-kliem/data_structure_for_face_list' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure
298c2f7Merge branch 'develop' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure
kliem commented 3 years ago
comment:17

Red branch.

I'll fix it, once all the dependencies are in.

kliem commented 3 years ago

Changed dependencies from #30528, #30529, #30571, #30599 to #30529, #30599

kliem commented 3 years ago

Changed dependencies from #30529, #30599 to #30529

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 298c2f7 to 82a747d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

82a747dMerge branch 'u/gh-kliem/use_face_structure' of git://trac.sagemath.org/sage into u/gh-kliem/use_face_structure_new
kliem commented 3 years ago
comment:21

Ok, this is back on needs review. Almost all dependencies are gone now.

tscrim commented 3 years ago
comment:22

# The universe. <-- Great comment.

LGTM. Hopefully this won't have any issues with other systems that some of the other tickets had.

tscrim commented 3 years ago

Reviewer: Travis Scrimshaw

kliem commented 3 years ago
comment:23

Thank you.

Yes, I think it is way more stable, as it just exposes bitsets, which has been working for a while now.

I just didn't think, the intermediate solutions to the end.

vbraun commented 3 years ago

Changed branch from u/gh-kliem/use_face_structure to 82a747d