Implement the class CombinatorialPolyhedron

kliem commented 5 years ago

We construct a class CombinatorialPolyhedron, which collects several methods for polyhedra depending only on the combinatorial type. Actually, most of the methods will work for atomic, coatomic Eulerian lattices.

Computations can be much faster when avoiding creating explicitly the face lattice.

Example:

P = polytopes.permutahedron(6)
P.f_vector()

On my machine this takes over 8s, but it should be a matter of ms. For permutahedron(7) it takes forever and should also be a matter of ms.

Additionally, one can create the face_lattice much faster. The face_lattice of permutahedron(7) should only take few seconds. This is done by creating of lists of all faces and then quickly checking inclusions.

The crucial speed up is obtained by creating bit-vectors for the faces, each bit corresponding to a vertex. Intersection of faces is then only bitwise-and and subset check is (A & ~B == 0). Both steps only take one CPU step per 64/128/256 bits/vertices (depending on the processor). Otherwise the algorithm is pretty similar to current algorithm to create the hasse_diagram, except that it avoids double counting (and hence can calculate the f_vector) without an explicit list of all faces.

The ticket will prepare, but not use, SIMD-instructions. Enabling them will be a subject of #27103.

FOLLOW-UPS:

implement calculation of entries in the flag-vector, this might be better, than the calculation in the face_lattice
Improve the calculation of properties of polytopes that depend on the face-lattice (f-vector,edges, ridges, k-faces in general, edge-graph, etc.)
Enable SIMD-instructions.

CC: @stumpc5 @jplab @tscrim

Component: geometry

Keywords: polytopes, FindPoly, f_vector, face lattice, edge graph, ridge graph

Author: Jonathan Kliem

Branch: 39a7099

Reviewer: Jeroen Demeyer, Travis Scrimshaw, Vincent Delecroix

Issue created by migration from https://trac.sagemath.org/ticket/26887

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a89c37b More work with trivial Polyehedra, they should work now

fchapoton commented 5 years ago

comment:38

++ EXAMPLE::

should be

++ EXAMPLES::

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a89c37b to 0eedbb9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0eedbb9 fixed an issue with unbounded Polyhedra and CombinatorialPolyhedron.faces(0), also fixed Halfspaces

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 0eedbb9 to b37feb7

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b37feb7 all tests passed, however there are still examples missing starting at the function faces

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from b37feb7 to 829ec2a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`d8223e7`	`Added a roughly equivalent Algorithm to f_vector, more examples`
`829ec2a`	`Documentation in base.pyx done`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 829ec2a to b513eb2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`70b5a7a`	`removed all examples at beginning of file`
`c339851`	`added a face_iterator`
`b513eb2`	`CombinatorialPolyhedron inherits from SageObject now`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from b513eb2 to a176af1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`4b7948f`	added a `__reduce__` function to pickle/unpickle correctly,
`a176af1`	`added copyright license in hasse_diagram.cc`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a176af1 to a2f90ad

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a2f90ad corrected edges and ridges a polyhedron with two vertices

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from a2f90ad to 46f9555

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

46f9555 added a macro to ensure aligned_alloc exists,

jhpalmieri commented 5 years ago

comment:47

Still doesn't build on OS X:

gcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wno-unused -I./sage/geometry/polyhedron/combinatorial_polyhedron -I/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include -I/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7 -I/Users/palmieri/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/numpy/core/include -I/Users/palmieri/Desktop/Sage_stuff/git/sage/src -I/Users/palmieri/Desktop/Sage_stuff/git/sage/src/sage/ext -Ibuild/cythonized -I/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7 -c sage/geometry/polyhedron/combinatorial_polyhedron/hasse_diagram.cc -o build/temp.macosx-10.9-x86_64-2.7/sage/geometry/polyhedron/combinatorial_polyhedron/hasse_diagram.o -fno-strict-aliasing -DCYTHON_CLINE_IN_TRACEBACK=1 -O3 -march=native -std=c++11
In file included from sage/geometry/polyhedron/combinatorial_polyhedron/hasse_diagram.cc:13:
In file included from /Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/Python.h:88:
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/unicodeobject.h:534:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register PyObject *obj,     /* Object */
    ^~~~~~~~~
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/unicodeobject.h:553:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register PyObject *obj      /* Object */
    ^~~~~~~~~
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/unicodeobject.h:575:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register const wchar_t *w,  /* wchar_t buffer */
    ^~~~~~~~~
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/unicodeobject.h:593:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register wchar_t *w,        /* wchar_t buffer */
    ^~~~~~~~~
In file included from sage/geometry/polyhedron/combinatorial_polyhedron/hasse_diagram.cc:13:
In file included from /Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/Python.h:97:
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/stringobject.h:173:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register PyObject *obj,     /* string or Unicode object */
    ^~~~~~~~~
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/stringobject.h:174:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register char **s,          /* pointer to buffer variable */
    ^~~~~~~~~
/Users/palmieri/Desktop/Sage_stuff/git/sage/local/include/python2.7/stringobject.h:175:5: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
    register Py_ssize_t *len    /* pointer to length variable or NULL
    ^~~~~~~~~
sage/geometry/polyhedron/combinatorial_polyhedron/hasse_diagram.cc:717:22: error: variable-sized object may not be initialized
    int addfacearray[constlenfaces] = { };
                     ^~~~~~~~~~~~~
7 warnings and 1 error generated.

Cython would be so much easier.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`6ae3dd4`	`Fixed issue with PopCount, PopCount requires header immintin, which was only included when AVX2 was availabe`
`05635c9`	`replace iterator.next() by next(iterator) in test for Python 3`
`1971dd7`	`removed geometry/all.py from ticket, removed geometry/polyhedron/base.py from ticket`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 46f9555 to 1971dd7

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 1971dd7 to de57d7f

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`ae0d131`	`removed commend in module-list.py, march=native is needed`
`f27a2ef`	`removed Python.h`
`de57d7f`	`fixed headers, removed Python.h, moved declaration of constlenfaces to the beginning of function`

kliem commented 5 years ago

comment:50

Should be fixed.

I don't know why I didn't remove Python.h, it sure wasn't needed anymore. (Except that it included stdint). However, the pure inclusion of the Python module in C should not give an error.

Replying to @jhpalmieri:

Still doesn't build on OS X:

Cython would be so much easier.

For me, Cython was not easier. I couldn't get the things done the way I wanted. Once you know what should be done in C exactly, then it is probably easier. Now that I have written the code in C++, I could probably do it in Cython as well. When I searched for specific questions in Cython, I just couldn't get any result, but in C or C++ there are tons of helps on the web that helped me. I am a complete newbie in programming. It feels to me that I aquired at least half of my knowledge through this project. From Cython to C/C++ I had a speedup of about 20 on large Polytopes, probably just because translating it to pure C/C++ helped me better understand what the machine is doing and can be doing.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from de57d7f to 882ce71

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

882ce71 Interchanged long tests

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 882ce71 to f5c87d8

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f5c87d8 I missed a polytopes.Polyhedron(7), this was taken out now

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`ee86475`	`this should work on 32bit now, I replaced uint64_t by a macro, which first determines wether we are 64 or 32bi,`
`de03b8b`	added doctest to `_repr_` and _record_all_faces

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from f5c87d8 to de03b8b

kliem commented 5 years ago

Description changed:

--- 
+++ 
@@ -1,17 +1,23 @@
-Several improvements for calculating properties of polytopes that depend on the face-lattice.
-
-We will construct a class CombinatorialPolytope, which will collect several methods for polyhedra depending only on the combinatorial type. Actually, most of the methods will work for atomic, coatimic Eulerian lattices.
+We construct a class CombinatorialPolyhedron, which collects several methods for polyhedra depending only on the combinatorial type. Actually, most of the methods will work for atomic, coatomic Eulerian lattices.

 Computations can be much faster when avoiding creating explicitly the face lattice.

 Example:
+
+```
 P = polytopes.permutahedron(6)
 P.f_vector()
+```

-At my machine this takes over 8s, but it should be a matter of ms. For permutahedron(7) it takes forever and should also be a matter of ms.
+On my machine this takes over 8s, but it should be a matter of ms. For `permutahedron(7)` it takes forever and should also be a matter of ms.

-Additionally, one can create the face_lattice much faster. The face_lattice of permutahedron(7) should only take few seconds. This is done by creating of lists of all faces and then quickly checking inclusions.
+Additionally, one can create the face_lattice much faster. The face_lattice of `permutahedron(7)` should only take few seconds. This is done by creating of lists of all faces and then quickly checking inclusions.

-The crucial speed up is obtained by creating bit-vectors for the faces, each bit corresponding to a vertex. Intersection of faces is then only bitwise-and and subset check is (A & ~B == 0). Both steps only take one CPU step per 64/128/256 bits/vertices (depending on the processor). Otherwise the algorithm is pretty similar to current algorithm to create the hasse_diagram, except that it avoids double counting (and hence can calculate the f_vector) without an explicit list of all faces.
+The crucial speed up is obtained by creating bit-vectors for the faces, each bit corresponding to a vertex. Intersection of faces is then only bitwise-and and subset check is (A & ~B == 0). Both steps only take one CPU step per 64/128/256 bits/vertices (depending on the processor). Otherwise the algorithm is pretty similar to current algorithm to create the `hasse_diagram`, except that it avoids double counting (and hence can calculate the `f_vector`) without an explicit list of all faces.

-It might also make sense to improve calculation of entries in the flag-vector, while we are at it. This takes for every entry in the flag-vector less time then to construct the face-lattice.
+
+FOLLOW-UPS:
+
+- improve calculation of entries in the flag-vector, while we are at it.
+  This takes for every entry in the flag-vector less time then to construct the face-lattice.
+- Improve the calculation of properties of polytopes that depend on the face-lattice (f-vector,edges, ridges, k-faces in general, edge-graph, etc.)

jdemeyer commented 5 years ago

comment:55

Some purely technical points about the programming:

We generally discourage putting Extension parameters in module_list.py. Better use # distutils declarations in Cython files. See https://cython.readthedocs.io/en/latest/src/userguide/source_files_and_compilation.html#source-files-and-compilation for examples.
-march=native is not acceptable since we want to distribute binaries. Moreover, it makes no sense to compile just this specific file with -march=native. If you want to add -march=native, it should probably be added as compilation flag for all packages (or at least for Python, such that it applies to all Python packages). If you want to go through with this, you should open a new ticket for that. Also note that the option -march=native itself is not portable: there be a check that the compiler actually understands it.
The -O3 option is the default, so that's redundant.
You have very long lines in your sources. You should try to wrap them to less than 80 characters per line.
You're adding a lot of C/Cython code without a single sig_on/sig_check. That's suspicious as such code cannot be interrupted. See https://cysignals.readthedocs.io/en/latest/interrupt.html#interrupt-handling
While you're at it, you could also use the cysignals memory allocation functions instead of PyMem_....
hasse_diagram.cc contains various trivial functions which could easily be replaced by Cython. For example, the various accessor functions could be replaced by just accessing the attributes in Cython (this requires that the class CombinatorialPolyhedron is actually declared in Cython). Also delete_CombinatorialPolyhedron(C) could be replaced by just del C.
This is just advice: it's also possible to just include the .cc file in the .pyx file using cdef extern from "hasse_diagram.cc". That way, you can avoid the .h file and you give more optimization opportunities to the compiler, since everything is compiled in one C++ file.
I'm very worried about portability: you seem to be using a lot of compiler-specific and CPU-specific code. I don't know whether anything will actually break, but there is a substantial risk.
Typo: memomory -> memory

jdemeyer commented 5 years ago

comment:56

Replying to @kliem:

In Cython I was able to calculate this vector (or more important the edge graph) in something like 3 hours, in C it took me 20 minutes.

Just to add to what Vincent said: if your Cython code is substantially slower than your C/C++ code, you're doing it wrong. When using Cython properly, it should have the same speed as pure C/C++.

That being said, I'm not saying that you should use Cython instead of C++, just that you could.

jdemeyer commented 5 years ago

comment:57

Another minor comment: uint64_t_or_uint32_t is a very clunky type name. Couldn't you use a more descriptive name? Or use a standard type like unsigned long or size_t?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`7904eab`	`Lots of comments for hasse_diagram.pxd as on how to use the functions`
`2df5d2e`	`some remark for later, edges of an unbounded polyhedron needs to figured out yet`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from de03b8b to 2df5d2e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0a95629 corrected multiline comment

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 2df5d2e to 0a95629

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

96c0a35 Removed Compile arguments from module_list,

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 0a95629 to 96c0a35

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Changed commit from 96c0a35 to ae08e25

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 5 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ae08e25 corrected memory

kliem commented 5 years ago

comment:62

Replying to @jdemeyer:

Another minor comment: uint64_t_or_uint32_t is a very clunky type name. Couldn't you use a more descriptive name? Or use a standard type like unsigned long or size_t?

Any suggestions? Maybe 'uint64/32_t'? I don't like the name either...

Also, on 64bit machines, I really need to know that this is uint64_t. On 32-bit machines that it is uint32_t. As far as I understand size_t can be anything. Also unsigned long should usually be uint64_t, but there is no guarantee on it. Is there?

videlec commented 5 years ago

comment:63

Replying to @kliem:

Replying to @jdemeyer:

Another minor comment: uint64_t_or_uint32_t is a very clunky type name. Couldn't you use a more descriptive name? Or use a standard type like unsigned long or size_t?

Any suggestions? Maybe 'uint64/32_t'? I don't like the name either...

Also, on 64bit machines, I really need to know that this is uint64_t. On 32-bit machines that it is uint32_t. As far as I understand size_t can be anything. Also unsigned long should usually be uint64_t, but there is no guarantee on it. Is there?

There is no guaranty at all (e.g. look at the header of gmp where they define mp_limb_t). I don't understand what is the problem with uint64_t and uint32_t... These are supposed to be standard and portable. Contrarily to unsigned long or size_t. Though I wonder why gmp avoid using them...

kliem commented 5 years ago

comment:64

Thanks for the comments and help.

Replying to @jdemeyer:

Some purely technical points about the programming:

We generally discourage putting Extension parameters in module_list.py. Better use # distutils declarations in Cython files. See https://cython.readthedocs.io/en/latest/src/userguide/source_files_and_compilation.html#source-files-and-compilation for examples.

Done. Once I knew what to look for, I found a good example (combinat/partitions.pyx).

-march=native is not acceptable since we want to distribute binaries. Moreover, it makes no sense to compile just this specific file with -march=native. If you want to add -march=native, it should probably be added as compilation flag for all packages (or at least for Python, such that it applies to all Python packages). If you want to go through with this, you should open a new ticket for that. Also note that the option -march=native itself is not portable: there be a check that the compiler actually understands it.

It took be a bit to figure this one out. A solution would be to use function multiversioning as available in GCC >= 6 (https://hannes.hauswedell.net/post/2017/12/09/fmv/).

If compiled by GCC >= 6 there would be multiple versions of the code available. At runtime the decision is made according to availability of the intrinsics. This way a build be used by different machines (e.g. Network Versions of it or binaries).

At compile time we can then decide for 64 or 32 bits and check wether it is compiled with GCC >= 6.

The -O3 option is the default, so that's redundant.

Done.

You have very long lines in your sources. You should try to wrap them to less than 80 characters per line.

I will take care of that.

You're adding a lot of C/Cython code without a single sig_on/sig_check. That's suspicious as such code cannot be interrupted. See https://cysignals.readthedocs.io/en/latest/interrupt.html#interrupt-handling

combinat/partitions.pyx seems to be a good example. There this is done as well. I noticed this problem. When doing long calculations I had to kill sage the hard way, when I wanted to do so.

While you're at it, you could also use the cysignals memory allocation functions instead of PyMem_....

hasse_diagram.cc contains various trivial functions which could easily be replaced by Cython. For example, the various accessor functions could be replaced by just accessing the attributes in Cython (this requires that the class CombinatorialPolyhedron is actually declared in Cython). Also delete_CombinatorialPolyhedron(C) could be replaced by just del C.

I honestly don't know how to do this. At the moment I am thinking about actually importing the class CombinatorialPolyhedron in C++ a number of times with different macros. This way I avoid the multi-version function being decided on a low level, which would slow things done a lot and would break in-lines etc. I could define the functions in the class a number of times, but this would make it even less readable.

Anyway, I'm still thinking about it and trying to figure out what to do.

This is just advice: it's also possible to just include the .cc file in the .pyx file using cdef extern from "hasse_diagram.cc". That way, you can avoid the .h file and you give more optimization opportunities to the compiler, since everything is compiled in one C++ file.

Great, I like this much better. For the extension thing, one was supposed to use a header. But now I'm able to work without it. I always found it a pain to work with header and cc.

I'm very worried about portability: you seem to be using a lot of compiler-specific and CPU-specific code. I don't know whether anything will actually break, but there is a substantial risk.

See above. GCC sets up multiple functions and at runtime the correct function is being called. If one does not compile with GCC >= 6 then I suggest not using intrinsics at all.

There are not too many scenarios:

AVX2 (maybe delete this, if there are no substantial losses doing it with AVX-only, however the instructions are easier with AVX2.)
AVX, but not AVX2
popcnt, but not AVX
SSE4.1
default

This is 64-bit. I have to check out the scenarios for 32-bit yet.

Typo: memomory -> memory

Done.

kliem commented 5 years ago

comment:65

Replying to @videlec:

Replying to @kliem:

Replying to @jdemeyer:

Another minor comment: uint64_t_or_uint32_t is a very clunky type name. Couldn't you use a more descriptive name? Or use a standard type like unsigned long or size_t?

Any suggestions? Maybe 'uint64/32_t'? I don't like the name either...

Also, on 64bit machines, I really need to know that this is uint64_t. On 32-bit machines that it is uint32_t. As far as I understand size_t can be anything. Also unsigned long should usually be uint64_t, but there is no guarantee on it. Is there?

There is no guaranty at all (e.g. look at the header of gmp where they define mp_limb_t). I don't understand what is the problem with uint64_t and uint32_t... These are supposed to be standard and portable. Contrarily to unsigned long or size_t. Though I wonder why gmp avoid using them...

I wasn't sure whether uint64_t is always available on 32-bit machines. If this is the case, I can of course use it. In order to do popcount I need to do a pointer typecast, but that should be fine and the only problem then. Some patchbot gave me an error message about _mm_popcnt_u64 not being defined, which seems to be the case, if the machine is 32-bit.

jhpalmieri commented 5 years ago

comment:66

You probably know this, but just in case: on OS X, clang is used, not gcc.

jdemeyer commented 5 years ago

comment:67

Replying to @kliem:

Also, on 64bit machines, I really need to know that this is uint64_t.

What's your definition of a 64-bit machine (or 32-bit machine)?

jdemeyer commented 5 years ago

comment:68

A more or less equivalent question: why do you really need to know that the type is uint64_t on a 64-bit machine and uint32_t on a 32-bit machine?

jdemeyer commented 5 years ago

comment:69

Replying to @kliem:

A solution would be to use function multiversioning as available in GCC >= 6 (https://hannes.hauswedell.net/post/2017/12/09/fmv/).

That's overthinking it for a generic math package like Sage. Almost all casual Sage users are not going to care how fast this code really runs.

But you probably misunderstood me: I didn't mean that -march=native was a bad idea, just that it should be enabled more carefully. And also not for this one specific file only, that would be just strange.

kliem commented 5 years ago

comment:70

Replying to @jdemeyer:

A more or less equivalent question: why do you really to know that the type is uint64_t on a 64-bit machine and uint32_t on a 32-bit machine?

#if INTPTR_MAX == INT64_MAX

This is what I read somewhere. If this is for some reason is not set at all, it uses uint32_t and more important it will not call _mm_popcnt_u64, which does not seem to be defined for 32bit machines, even if the processor has POPCNT. I figured this is why the following patchbot run failed:

https://patchbot.sagemath.org/log/26887/Ubuntu/14.04/i686/3.13.0-164-generic/arando/2019-01-22%2009:12:47?short

kliem commented 5 years ago

comment:71

Replying to @jdemeyer:

Replying to @kliem:

A solution would be to use function multiversioning as available in GCC >= 6 (https://hannes.hauswedell.net/post/2017/12/09/fmv/).

That's overthinking it for a generic math package like Sage. Almost all casual Sage users are not going to care how fast this code really runs.

Probably. The more important question is whether something can be calculated or not. If one then wants to generate e.g. the face_lattice from the CombinatorialPolyhedron, initalizing this is much slower than anything in this module.

But you probably misunderstood me: I didn't mean that -march=native was a bad idea, just that it should be enabled more carefully. And also not for this one specific file only, that would be just strange.

So what you are suggesting is the following(?): Leave it as it is and delete -march=native. This should then somehow be added to Sage's build process (as an option?), so that one can build a more effective version, if one wants to. The pre-build binaries might then take 4-times as long, but are stable.

This would then also be a portable way of using intrinsics, such that when in any other module it seems helpful, one can just copy-paste from this code.

sagemath / sage

Implement the class CombinatorialPolyhedron #26887