sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.36k stars 462 forks source link

Groebner basis computations with openf4 package #18749

Open 7b29467c-ebd0-42a5-86cb-a291e9479b22 opened 9 years ago

7b29467c-ebd0-42a5-86cb-a291e9479b22 commented 9 years ago

We propose a new C++ library to compute Groebner basis over finite fields with the F4 algorithm. This library works on prime fields of characteristic < 232 and on binary field extensions of degree < 64.

The source code and a documentation are available on http://nauotit.github.io/openf4/

The tarball is available on http://nauotit.github.io/openf4/openf4-1.0.0.tar.gz

The branch attached to this ticket contains a wrapper (Cython) for this library. Place the openf4 archive in the upstream directory, then use sage -i openf4 before using this branch.

Informations on the new package:

-Speed (on my computer without vectorisation):

cyclic 8, GF(251): sage : 45.7 s f4 : 30.8 s

katsura 12, GF(251): sage : 12min 36s f4 : 3min 16s

cyclic 8, GF(65521): sage : 58.4 f4 : 25s

katsura 12, GF(65521): sage : 15min 47s f4 : 2min 46s

cyclic 8, GF(4294967291) : sage : does not work f4 : 55.5 s

katsura 12, GF(4294967291) : sage : does not work f4 : 11min 50s

Ideal in 8 variables over GF(215): sage : 60 s f4 : 10s

Ideal in 8 variables over GF(231): sage : 58.1 s f4 : 15s

Ideal in 8 variables over GF(263):
sage : 1min 18s f4 : 30s

Memory requirement:

The memory needed by openf4 depends on:

Even if our matrices are smaller than the ones of Magma, on cyclic 9 they exceed 70000 x 70000, which requires already around 20 GB on 4 Bytes integers.

Documentation:

Available with the package (doxygen). And in the updated file multi_polynomial_ideal.py, function groebner_basis.

Usability:

Can be used in the same way than other algorithms. However only the grevlex order is available. Detailed in the file multi_polynomial_ideal.py, function groebner_basis.

Absence of memory leaks:

Tested with valgrind.

Maintainable:

When the Givaro version of Sage will be updated, the package could be build with this dependency in order to handle the prime fields of big characteristics.

Portability:

In order to be efficient our package can use the SSE2 and SSE4 extensions (vectorisation). This extensions are automatically detected at build time (configure). However, even without these optimisations our package is more efficient than the current implemented algorithms in Sage. The package was tested on:

Reasonable build time, size, dependencies:

If the build time is a problem, examples can be removed from the makefile.am. Size: 12Mo Dependencies: Givaro is an optional dependency but a recent version is required.

Note:

The proposed wrapper in Cython is not very efficient and may be improved: for instance, the computation of katsura 12 over GF(4294967291) takes only 2min 11s, and more than 9 min are spent in the cython wrapper to convert the result into sage polynomials.

CC: @nathanncohen @simon-king-jena @jpflori @zimmermann6

Component: packages: optional

Keywords: F4, groebner basis, ideal, sd87

Work Issues: default algorithm choice

Author: Titouan Coladon

Branch/Commit: u/jdemeyer/f4 @ 0da1853

Reviewer: Martin Albrecht, Travis Scrimshaw, Jeroen Demeyer, Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/18749

malb commented 9 years ago
comment:1

This looks very interesting.

Can you clarify what you're trying to achieve with this ticket. Do you want your F4 code to be a standard or an optional package? Note that adding standard packages requires a vote on [sage-devel] and usually some time during which the package was an optional package.

Does your package have a name other than F4?

Also, do you have an idea why the performance of the wrapper is so poor?

tscrim commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,5 +1,5 @@
 We propose a new C++ library to compute Groebner basis over finite fields with the F4 algorithm.
-This library works on prime fields of characteristic < 2^32 
+This library works on prime fields of characteristic < 2<sup>32</sup>
 and on binary field extensions of degree < 64.

 The source code and a documentation are available on 
@@ -39,15 +39,15 @@
 sage : does not work
 f4 : 11min 50s

-Ideal in 8 variables over GF(2^15):
+Ideal in 8 variables over GF(2<sup>15</sup>):
 sage : 60 s
 f4 : 10s

-Ideal in 8 variables over GF(2^31): 
+Ideal in 8 variables over GF(2<sup>31</sup>): 
 sage : 58.1 s
 f4 : 15s

-Ideal in 8 variables over GF(2^63):  
+Ideal in 8 variables over GF(2<sup>63</sup>):  
 sage : 1min 18s
 f4 : 30s
tscrim commented 9 years ago
comment:2

With the latest beta versions of Sage, you'll need to add a file type which contains the word "optional" (as new Sage packages start as optional, but if you want to make this standard, you can make a request on Sage-devel). You also need to make in module_list.pyx an optional extension.

I'm guessing the conversion has to do with taking the basis as a vector of strings and you need to convert it to a better python type. You probably want to do this in a cdef function and have the python level function just do the final conversions. Conversions to strings is typically the slowest way to do things. It might be best to have a python data structure in the bindings file more closer to your internal data structures.

williamstein commented 9 years ago
comment:4

I think they've put adding it as a standard package to a vote here: https://groups.google.com/forum/#!topic/sage-devel/CfZ8kzT8B2s

malb commented 9 years ago

Changed commit from a2e2002 to 75f896a

malb commented 9 years ago

Changed branch from u/tcoladon/f4 to u/malb/t18749_f4

malb commented 9 years ago
comment:5
# before
sage: P = PolynomialRing(GF(next_prime(2^31)), 8, 'x')
sage: I = sage.rings.ideal.Cyclic(P)
sage: from sage.rings.polynomial.groebner_basis_f4 import groebner_basis_f4
sage: %time gb0 = groebner_basis_f4(I)
6.4465777874 # <- conversion time
CPU times: user 14.8 s, sys: 400 ms, total: 15.2 s
Wall time: 15.2 s

# after
sage: P = PolynomialRing(GF(next_prime(2^31)), 8, 'x')  
sage: I = sage.rings.ideal.Cyclic(P)                  
sage: %time gb0 = groebner_basis_f4(I)                
3.16141104698 # <- conversion time
CPU times: user 11.8 s, sys: 144 ms, total: 12 s
Wall time: 11.9 s

Last 10 new commits:

013879bfix doctests & documentation
bfbb119obey line width
676b798rename modulo → modulus & raise error instead of printing to stddout
b324306test exception
75f8302specialised conversion from F4 string representation
0c76dbbdelete trailing whitespaces
10d8af5return converted basis not strings
24e2191Sage-ify interface to F4 more
48cf3c4mark more doctests optional - f4
75f896afix prot interface for F4
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ec6625bdon't print SAGE_INC during compilation
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 75f896a to ec6625b

vbraun commented 9 years ago
comment:7

The C++ code looks decent. Are you one of the upstream authors?

I didn't see a public source repo or bug tracker. If you don't want to host it yourself there is always github or bitbucket. But just putting a tarball on your homepage hasn't been best practice since the 90s ;-)

Are the headers installed to $prefix/include/f4/include/f4.h? Why the extra /include/?

dimpase commented 9 years ago
comment:8

SPKG.txt says that openmp is a dependency. But this is not mentioned above...

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bfca7d5Merge branch 'develop' into f4
cf1952aadd required type field for f4
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from ec6625b to cf1952a

jdemeyer commented 9 years ago

Replying to @sagetrac-tcoladon:

Place the f4 archive in the upstream directory, then use sage -sh sage-fix-pkg-checksums

This shouldn't be needed. The branch should contain the correct checksum!

jdemeyer commented 9 years ago
comment:12

It seems that the branch makes src/sage/rings/polynomial/multi_polynomial_ideal.py executable??

jdemeyer commented 9 years ago
comment:13

At the same time, the files spkg-install and spkg-check should be executable.

jdemeyer commented 9 years ago
comment:14

sage.rings.polynomial.groebner_basis_f4 should be an OptionalExtension.

malb commented 9 years ago
comment:15

I played around with this ticket some more. On a Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz with 16 real cores:

sage: P = PolynomialRing(GF(previous_prime(2^31)), 9, 'x')
sage: I = sage.rings.ideal.Cyclic(P)                                                                                                                                                                                        
sage: %time gb = I.groebner_basis('magma')                                                                                                                                                                                  
CPU times: user 19.1 s, sys: 904 ms, total: 20 s
Wall time: 5h 22min 33s

sage: I = sage.rings.ideal.Cyclic(P) # avoid caching
sage: %time gb16 = I.groebner_basis('f4', threads=16)
CPU times: user 1h 23min 13s, sys: 33.6 s, total: 1h 23min 47s
Wall time: 9min 54s

sage: gb == gb16
True

sage: %time gb8 = I.groebner_basis('f4', threads=8)                                                                                                                                                                        
CPU times: user 1h 4min 55s, sys: 27.9 s, total: 1h 5min 23s
Wall time: 12min 23s

sage: gb == gb8
True

sage: %time gb1 = I.groebner_basis('f4', threads=1)                                                                                                                                                                         
CPU times: user 50min 26s, sys: 26.6 s, total: 50min 52s
Wall time: 50min 56s

sage: gb1 == gb
True
williamstein commented 9 years ago
comment:16

I've been waiting 10 years for this!

malb commented 9 years ago
comment:17

The results over GF(28) seem to be incorrect, though:

sage: sr = mq.SR(1,1,1,4,gf2=False) # smallest scale small scale AES
sage: F,s = sr.polynomial_system()

sage: %time gb = F.groebner_basis()
CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 6.76 ms
sage: gb[0]
k003^2 + (a^2 + a + 1)*k003 + (a^3 + a^2 + 1)

sage: %time gbf4 = F.groebner_basis('f4')

GROEBNER BASIS : (1) # this output should not be here
---> 0 ms 

CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 10.9 ms
sage: gbf4 == gb # the output disagrees
False

sage: %time gbm = F.groebner_basis('magma') # checking who Magma agrees with
CPU times: user 68 ms, sys: 20 ms, total: 88 ms
Wall time: 997 ms
sage: gbm == gb
True
2edf2d52-6c17-4f2a-88a6-e860db4dcddf commented 9 years ago
comment:18

I made a few benchmarks, compared to giac, natively (conversions to sage take a significative amount of time at least for f4), with 1 thread, using a prime of size 24 bits (giac can go up to 31 bits, add about 20% time). For cyclic8, giac is a little faster (11s. vs 12) For katsura12, f4 is faster (45s vs 69s) For cyclic9: giac does it in about 15 minutes with 700M of RAM, I stopped the computation of f4 after about the same time because the memory used was reaching 23G (too much for my laptop). This is not specific to cyclic9, it also happens for katsura12.

frederichan-IMJPRG commented 9 years ago
comment:19

I think that the time of conversion from F4 and giac to sage is quite similar. I also noticed that F4 is faster but the server I am working on has 16Go of ram and I had to interrupt the computation with F4 of cyclic9 mod prev_prime(2^31).

on sage with giac and giacpy the syntax is like this:

sage: P = PolynomialRing(GF(previous_prime(2^31)), 9, 'x')
sage: I = sage.rings.ideal.Cyclic(P)
sage: from giacpy import *

sage: p=previous_prime(2^31)
sage: time gb8B = (libgiac(I.gens()) % p).gbasis([P.gens()])
Time: CPU 1687.80 s, Wall: 1687.55 s
sage: # the saved file is 84M
sage: time gb8B.savegen('/home/han/cyclic9modp')
Time: CPU 8.90 s, Wall: 8.92 s
sage: #converting from giac to sage
sage: time BG=Sequence(gb8B,P)
Time: CPU 30.97 s, Wall: 30.97 s
sage: time gb8Bbis=loadgiacgen('/home/han/cyclic9modp')
Time: CPU 7.89 s, Wall: 7.89 s
frederichan-IMJPRG commented 9 years ago
comment:20

Replying to @malb:

The results over GF(28) seem to be incorrect, though:

sage: sr = mq.SR(1,1,1,4,gf2=False) # smallest scale small scale AES
sage: F,s = sr.polynomial_system()

sage: %time gb = F.groebner_basis()
CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 6.76 ms
sage: gb[0]
k003^2 + (a^2 + a + 1)*k003 + (a^3 + a^2 + 1)

sage: %time gbf4 = F.groebner_basis('f4')

GROEBNER BASIS : (1) # this output should not be here
---> 0 ms 

CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 10.9 ms
sage: gbf4 == gb # the output disagrees
False

It is not clear for me that why gbf4 == gb should be True. I have nevertheless:

sage: gbf4==gb,gbf4.is_groebner(), gb.is_groebner()
(False, True, True)
7b29467c-ebd0-42a5-86cb-a291e9479b22 commented 9 years ago
comment:21

Thank you for the bug report and the wrapper improvement.

I have found the bug (due to SSE).

Does the name -f4lib -openf4 may be a suitable ?

For the "extra /include": I place in $prefix/include only libf4.h and f4.h, the "real" sources are placed in the $prefix/include/f4 directory where the declarations are separed from the definitions.

I will set up a git soon with the fix.

Let me know if you have any other requirements.

malb commented 9 years ago
comment:22

Replying to @sagetrac-tcoladon:

Does the name -f4lib -openf4 may be a suitable ?

I'd say anything which is a proper name should work. 'f4lib' doesn't really qualify by that criterion I'd say, but it's better than 'f4'; 'openf4' could do. It's really up to you, though.

For the "extra /include": I place in $prefix/include only libf4.h and f4.h, the "real" sources are placed in the $prefix/include/f4 directory where the declarations are separed from the definitions.

Why do you install the sources at all? As far as I can see you instantiate all templates in libf4.cpp anyway, so is there a need to copy the whole source code in local/include?

dimpase commented 9 years ago
comment:23

Replying to @malb:

Replying to @sagetrac-tcoladon:

Does the name -f4lib -openf4 may be a suitable ?

I'd say anything which is a proper name should work. 'f4lib' doesn't really qualify by that criterion I'd say, but it's better than 'f4'; 'openf4' could do. It's really up to you, though.

For the "extra /include": I place in $prefix/include only libf4.h and f4.h, the "real" sources are placed in the $prefix/include/f4 directory where the declarations are separed from the definitions.

Why do you install the sources at all? As far as I can see you instantiate all templates in libf4.cpp anyway, so is there a need to copy the whole source code in local/include?

Well, would you prefer a broken installation, that cannot be used at sage -sh prompt?

malb commented 9 years ago
comment:24

Why would that be a broken installation? The library is - as far as I can see - perfectly usable without dropping all .inl files in local/include/f4/src.

jdemeyer commented 9 years ago
comment:25
  1. Is there an "upstream" page? It should be mentioned in SPKG.txt.

  2. Is this right?

== License ==

 * GNU GPLv3

It would be much better if it was under GPL version 3 or later. If this isn't the case, you should ask upstream if they are willing to release under GPL version 3 or later.

  1. Obviously, this shouldn't be there:
Put a bulleted list of dependencies here:
  1. In module_list.py, this shouldn't be needed
depends = [SAGE_INC + "/libf4.h"]

since you're explicitly refering to libf4.h in the .pyx file.

  1. For the copyright statement in groebner_basis_f4.pyx, please follow the format at http://doc.sagemath.org/html/en/developer/coding_basics.html#headings-of-sage-library-code-files

  2. When you're done, remove # commented out code, in particular this funny thing:

#include <Python.h>
  1. Format INPUT blocks like http://doc.sagemath.org/html/en/developer/coding_basics.html#the-docstring-of-a-function-content
7b29467c-ebd0-42a5-86cb-a291e9479b22 commented 9 years ago
comment:26

I update the code on https://github.com/nauotit/openf4 (the bug is fixed)

and the Sage files in the branch u/tcoladon/f4.

The new upstream name is openf4.

I don't understand your question about the licence?

If the sources installation is a problem, we can remove them (just change the Makefile.am in the src directory) in a special target dedicated to Sage?

I haven't deleted the # lines in file groebner_basis_f4.pyx in case Givaro version is updated.

For the comparison with Magma, I'm not sure that openf4 is better on smaller prime fields. Magma is very slow when the characteristic is higher than 2^24

jdemeyer commented 9 years ago
comment:27

Replying to @sagetrac-tcoladon:

I don't understand your question about the licence?

There is a difference between releasing between releasing a project under "GPL version 3" and "GPL version 3 or any later version". The latter is much preferred in case a GPL version 4 ever comes out and we want to be compatible with that.

7b29467c-ebd0-42a5-86cb-a291e9479b22 commented 9 years ago
comment:28

Thank you, the upstream is under GPLv3 or any later version, I have updated the SPKG.txt.

malb commented 9 years ago
comment:29

Which tarball do the hashes in checksums.ini refer to?

7b29467c-ebd0-42a5-86cb-a291e9479b22 commented 9 years ago
comment:30

I have updated http://nauotit.github.io/openf4/ with the checksums.ini refered tarball.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from cf1952a to 2d15c6a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

cfbb262Format Sage files and change upstream name: f4 -> openf4
57361bdChange licence from GNU GPLv3 to GPL version 3 or any later version
a15018aMerge branch 'u/tcoladon/f4' of trac.sagemath.org:sage into u/tcoladon/f4
dee79b4Merge branch 'u/tcoladon/f4' into f4
2d15c6aMerge branch 'u/tcoladon/f4' into f4
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 2d15c6a to cc7aa9f

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1f51227mark OpenMP as optional dependency in SPKG.txt
c8d4b94make executable files executable
5796942drop executable bit from multi_polynomial_ideal.py
e214336openf4 inteface is optional
6c82642move groebner_basis_f4 to openf4 and fix doctests
1240e77update copyright statement to Sage standard
cc7aa9fdocstring & indentation clean up
malb commented 9 years ago
comment:33

I should have addressed all issues raised on this ticket with respect to the Sage interface. As far as I can see, what remains to be done is to clean up Makefile.am to normalise what is installed where? Anything I overlooked?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from cc7aa9f to 33fbbfa

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

33fbbfaonly import openf4 if requested
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 33fbbfa to 252ddd1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

252ddd1mark more openf4 doctests optional
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 252ddd1 to b14d81e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b14d81eadd another optional flag for openf4
jdemeyer commented 9 years ago
comment:37

Replying to @malb:

Anything I overlooked?

It looks pretty good indeed.

I just noticed this Cython warning:

warning: sage/libs/openf4.pyx:135:29: Unreachable code

Don't forget to set the ticket to needs_review when you really want this to be reviewed.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e388b7ecomment out unreachable code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from b14d81e to e388b7e

malb commented 9 years ago
comment:39

Thanks, fixed the unreachable code thing. I'm in no position to set the ticket to "needs review". I think upstream should be cleaned up first (but not everyone agrees) and I'm also not the primary author anyway, just lending a hand on the Sage side.

7b29467c-ebd0-42a5-86cb-a291e9479b22 commented 9 years ago
comment:41

The sources installation is indeed required. libopenf4.cpp includes ideal.h which needs some *.inl files. If everything else is OK, I can change the ticket to "need review"?

malb commented 9 years ago
comment:42

Fine by me.