add a fast gcd algorithm for univariate polynomials over absolute number fields

lftabera commented 14 years ago

Question arised here,

http://groups.google.com/group/sage-devel/browse_thread/thread/0f5b029970e1a4e2/fcec7d0e35474fbd#fcec7d0e35474fbd

univariate gcd is performed using euclidean algorithm, which causes explosion of coefficients and is slow but for trivial examples.

For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.

Depends on #14186 Depends on #15803 Depends on #15804

CC: @slel

Component: algebra

Keywords: gcd, pari, ntl, days94

Author: Luis Felipe Tabera Alonso

Branch/Commit: u/lftabera/ticket/8558 @ f16a49b

Reviewer: Jeroen Demeyer

Issue created by migration from https://trac.sagemath.org/ticket/8558

lftabera commented 14 years ago

Attachment: 13911.patch.gz

lftabera commented 14 years ago

comment:2

The pari algorithm is not fast enough. I have implemented a modular algorithm using ntl.

The patch needs doctest and integration in sage (it is, right now, a separate function).

It adds a new function modular_gcd

To test it, apply the patch and

from sage.rings.polynomial.polynomial_absolute_number_field import modular_gcd

Some timmings: (800 Mhz i386 linux, 1Gb ram)

sage: N.<a>=NumberField(x^3 -123)         
sage: K.<t>=N[]                           
sage: f=K.random_element(degree=2)        
sage: g1=K.random_element(degree=15)      
sage: g2=K.random_element(degree=15)      
sage: h1=f*g1                             
sage: h2=f*g2                             
sage: %time modular_gcd(h1,h2,'pari')     
CPU times: user 0.30 s, sys: 0.00 s, total: 0.30 s
Wall time: 0.31 s                                 
t^2 + (-35396/663609*a^2 + 92498/663609*a + 1750733/663609)*t - 1312/663609*a^2 + 3026/221203*a + 66060/221203
sage: %time modular_gcd(h1,h2)                                                                                
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s                                                            
Wall time: 0.12 s                                                                                             
t^2 + (-35396/663609*a^2 + 92498/663609*a + 1750733/663609)*t - 1312/663609*a^2 + 3026/221203*a + 66060/221203
sage: %time modular_gcd(h1,h2,'euclidean')                                                                    
CPU times: user 11.28 s, sys: 0.05 s, total: 11.33 s                                                          
Wall time: 12.21 s                                                                                            
t^2 + (-35396/663609*a^2 + 92498/663609*a + 1750733/663609)*t - 1312/663609*a^2 + 3026/221203*a + 66060/221203

sage: N.<a>=NumberField(x^10 - 123)
sage: K.<t>=N[]
sage: f=K.random_element(degree=2)
sage: g1=K.random_element(degree=15)
sage: g2=K.random_element(degree=15)
sage: h1=f*g1
sage: h2=f*g2
sage: %time l=modular_gcd(h1,h2,'pari')
CPU times: user 30.06 s, sys: 0.02 s, total: 30.07 s
Wall time: 32.15 s
sage: %time l=modular_gcd(h1,h2,'modular')
CPU times: user 0.43 s, sys: 0.00 s, total: 0.43 s
Wall time: 0.47 s

lftabera commented 14 years ago

Changed keywords from gcd, pari to gcd, pari, ntl

lftabera commented 14 years ago

comment:3

Here there is a cleaner patch.

The patch adds a new class of univariate polynomials whose ground field is an absolute number field. There is a new _gcd method for this class. It actually implements several approaches to the modular gcd algorithm:

Langemyr-McCallum algorithm Encarnacion a mixture of the two previous (default)

moreover, a call to pari method and the old Euclidean method.

I think that there should be only one modular algorithm, but, as usual, there are cases in which any of them beats the others. So suggestions are welcome. It should probably be Encarnacion or the mixture of boths

some timmings for harder problems:

sage: N = NumberField(ZZ[x].random_element(15).monic(),'a')    
sage: K = N[x]
sage: f = K.random_element(100)
sage: g1 = K.random_element(100)
sage: g2 = K.random_element(100)
sage: g1 *= f
sage: g2 *= f
sage: %time _=gcd(g1,g2)
CPU times: user 7.32 s, sys: 0.02 s, total: 7.34 s
Wall time: 7.42 s

This example with the old implementation or even pari is unfeasible.

lftabera commented 14 years ago

comment:4

Improved documentation
Eliminated an improvement in a function that hos nothing to do with the ticket

lftabera commented 14 years ago

Author: Luis Felipe Tabera Alonso

lftabera commented 14 years ago

comment:5

Added the method also for number fields of degree 1 using Flint in this special case

lftabera commented 14 years ago

comment:6

Attachment: 14371.patch.gz

Now that #4000 has possitive review, I raise the priority of this one.

Updated patch to work in 4.5.3

Now, it works also for relative number fields, passing to an absolute representation and performing there the computations. This is not optimal but it is usable.

If the extension of the field is one, then the gcd of QQ[x] is used wich is faster.

lftabera commented 14 years ago

Attachment: trac-5885.patch.gz

JohnCremona commented 14 years ago

comment:7

I can confirm that the patch applies fine to 4.6.rc0 and all tests (including long) pass.

There are some minor problems with the documentation (just punctuation) -- now I am studying the code, since this looks like very useful functionality!

lftabera commented 13 years ago

comment:8

Some clean in the documentation, reorder of the methods (methods after the classes of the file) and a couple of small bugs (honor the method option for relative fields and duplicated code for degree 1 extensions). Rename correctly the patch.

Apply: trac-8558.patch

Try new buildbot :)

11d1fc49-71a1-44e1-869f-76be013245a0 commented 13 years ago

comment:9

Apply trac-8558.patch

(I think such messages have to come after the patch is uploaded)

lftabera commented 13 years ago

avoid next_prime, better exceptions

lftabera commented 13 years ago

Description changed:

--- 
+++ 
@@ -9,3 +9,5 @@
 2.- Add gcd using pari for absolute number fields

 3.- For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.
+
+Apply: trac-8558.patch

lftabera commented 13 years ago

comment:10

Attachment: trac-8558.patch.gz

malb commented 13 years ago

comment:11

Hi, I don't know much about fast computation in polynomial rings over number fields. However, I'm wondering why you have this bias towards functions as opposed to methods? For example, lift_pm looks like it should be a method on p instead of a function? Perhaps the same applies to the gcd functions? Or am I missing something?

lftabera commented 13 years ago

comment:12

No bias intended. lift_pm can be a method and probably in more classes than polynomials in NTL.

I am reading again the code and it looks awful now. I think there are too many gcd alternatives with little gainamong them.

I will update the code.

lftabera commented 13 years ago

comment:13

Now the functions are methods.

lift_to_poly_ZZ and lift_to_poly_QQ are methods of ntl_ZZ_pEX

lift_pm is a method of ntl_ZZ_p

Still, I have left a lift_pm function in sage.rings.arith I think it is good for educational purposes in modular algorithms.

I have eliminated pure Encarnacion and pure Langemyr_McCallum methods. There is no real benefit compared to the extra code they add.

lftabera commented 13 years ago

Description changed:

--- 
+++ 
@@ -10,4 +10,4 @@

 3.- For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.

-Apply: trac-8558.patch
+Apply: trac-8558.2.patch

lftabera commented 13 years ago

comment:14

Apply: trac-8558.2.patch

lftabera commented 13 years ago

comment:15

rebase against 4.7.1

lftabera commented 13 years ago

comment:16

Apply: trac-8558.2.patch

jdemeyer commented 13 years ago

Description changed:

--- 
+++ 
@@ -4,10 +4,8 @@

 univariate gcd is performed using euclidean algorithm, which causes explosion of coefficients and is slow but for trivial examples. Instead we should use pari that performs better.

-1.- Add a `_pari_` function for absolute number fields taht work
+1. Make `_pari_()` work for polynomials over absolute number fields, see #11904.
+2. Add gcd using pari for absolute number fields
+3. For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.

-2.- Add gcd using pari for absolute number fields
-
-3.- For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.
-
-Apply: trac-8558.2.patch
+Apply: [attachment: trac-8558.2.patch](https://github.com/sagemath/sage-prod/files/10648581/trac-8558.2.patch.gz)

jdemeyer commented 13 years ago

comment:18

Several comments:

Why the name lift_pm()? In PARI, this function is called centerlift() which is more understandable in my opinion. On the other hand, having the name begin with lift is more friendly for TAB-completion, so why not lift_centered or something?
A special class for polynomials over number fields looks like a good idea, but please put it in a separate file polynomial_number_field.pyx instead of adding to polynomial_element_generic.pyx.
I have a hard time understand the text below. What is c? What is n? Probably "reduction" is a better word than "project". And I also think the name K for a polynomial ring is badly chosen:

    def lift_to_poly_QQ(self,K):
        """
        Compute a lift of poly to K.

        INPUT:

        - K: an univariate polynomial ring over an absolute number field QQ[a].
        In order to make sense of this algorithm, the minimum polynomial of 'a'
        should project onto `c` modulo `m`.

        OUTPUT:

        -A polynomial in QQ[a] such that its projection coefficientwise is poly.
        This algorithm uses rational reconstruction, so it may fail.

Why _gcd() instead of gcd()?
Add some tests using the global gcd() function instead of always calling your _gcd() method.

Thanks to #11904, the line

h1 = pari([coeff._pari_('y') for coeff in self.list()]).Polrev()

can be changed to

h1 = pari(self)

This looks very fishy:
```
#Experimental bound IMPROVE
```

With

p = p.next_prime(False)

do you mean

p = p.next_prime(proof=False)

which is much clearer?

I believe the keyword argument algoritm= is usually used instead of method = (note also the spacing!).

jdemeyer commented 13 years ago

Reviewer: Jeroen Demeyer

lftabera commented 13 years ago

comment:20

rebase 4.7.2 still needs work

lftabera commented 11 years ago

comment:21

Back to business.

I have updated the patch with Jeroen's suggestions. Still I have to do something with

#Experimental bound IMPROVE
p = ZZ(3+min(2**255, (max(map(abs,Bound.list())).n()**(0.4)).floor())).next_prime(proof=False)

p is the first prime to perform a modular gcd. If p=2, we will have to use a lot of primes and will be inefficient. On the other hand, if p is too big, we lose the advantages of modular gcd. So we have to give a sane, intermediate default.

The prime above is based on some experiments I made two years ago, the idea is to use a prime such that we will have to use few modular gcd, but limiting our stating prime to the interval [3, 2^255].

Ideas welcomed.

lftabera commented 11 years ago

comment:22

Apply trac-8558.2.patch

lftabera commented 11 years ago

Dependencies: #14186

lftabera commented 11 years ago

comment:24

Still, we have to look for a sensible default starting prime p, but the rest of the ticket can be reviewed.

Further improvements for other tickets and for big degree polynomials (different tickets) could be

add a half-gcd algorithm for ntl_ZZ_p_EX class, ntl libraries do not provide one.
Improve a lot the test of having a gcd, remainder algorithm is slow.

Apply: trac-8558.2.patch

lftabera commented 11 years ago

comment:25

The patchbot wants to apply too many patches

Apply: trac-8558.2.patch

lftabera commented 11 years ago

Attachment: trac-8558.2.patch.gz

rebase to 5.8.beta2

lftabera commented 11 years ago

comment:26

Apply: trac-8558.2.patch

lftabera commented 11 years ago

Description changed:

--- 
+++ 
@@ -8,4 +8,4 @@
 2. Add gcd using pari for absolute number fields
 3. For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.

-Apply: [attachment: trac-8558.2.patch](https://github.com/sagemath/sage-prod/files/10648581/trac-8558.2.patch.gz)
+Apply: [attachment: trac-8558.2.2.patch](https://github.com/sagemath/sage-prod/files/10648582/trac-8558.2.2.patch.gz)

lftabera commented 11 years ago

comment:27

Apply: trac-8558.2.2.patch

jdemeyer commented 11 years ago

comment:28

Some quick comments:

_sig_on and _sig_off are very deprecated and should be changed to sig_on() and sig_off() (#10115).
The change "somewhat" -> "somewha" clearly is a mistake.
Documentation and library formatting must be fixed.
Could you replace centerlift by lift_centered in sage/rings/finite_rings/integer_mod.pyx for consistency?
As I already mentioned, replace method = by algorithm=
Since QQ is a unique parent, replace base_ring != QQ by base_ring is not QQ.
Remove # There has to be a better way to do this, self.change_ring() does not work. I don't see any problem with that code.
It is better to use a **kwds argument for sparse and implementation in

def __init__(self, base_ring, name="x", sparse=False, element_class=None, implementation=None)

lftabera commented 11 years ago

comment:30

I have taken into acoount the suggestions. I have deprecated integer_mod.centerlift and have eliminated the PolynomialRing new classes, since they only differed from PolynomialRing_field in the element_class and ignored the sparse option.

lftabera commented 10 years ago

rebase to 5.13

lftabera commented 10 years ago

comment:32

Attachment: trac-8558.2.2.patch.gz

Apply: trac-8558.2.2.patch

lftabera commented 10 years ago

Commit: 85ff4b9

lftabera commented 10 years ago

Branch: u/lftabera/ticket/8558

jdemeyer commented 10 years ago

comment:34

Certainly looks in much better shape than it used to be, but I'm afraid I don't know enough NTL to completely review this. Why not split up this ticket in two? The first which adds the new classes and implements gcd via PARI and the second which implements the modular algorithm.

jdemeyer commented 10 years ago

Description changed:

--- 
+++ 
@@ -2,10 +2,6 @@

 http://groups.google.com/group/sage-devel/browse_thread/thread/0f5b029970e1a4e2/fcec7d0e35474fbd#fcec7d0e35474fbd

-univariate gcd is performed using euclidean algorithm, which causes explosion of coefficients and is slow but for trivial examples. Instead we should use pari that performs better.
+univariate gcd is performed using euclidean algorithm, which causes explosion of coefficients and is slow but for trivial examples.

-1. Make `_pari_()` work for polynomials over absolute number fields, see #11904.
-2. Add gcd using pari for absolute number fields
-3. For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.
-
-Apply: [attachment: trac-8558.2.2.patch](https://github.com/sagemath/sage-prod/files/10648582/trac-8558.2.2.patch.gz)
+For relative number fields, pass to an absolute representation. This may be slow. But for the cases where this is slow the current implementation may be unfeasible.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 85ff4b9 to e29c687

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`ddb7291`	`Add lift_centered method to more classes, deprecate centerlift`
`1adefd5`	`Merge branch 'u/lftabera/lift_centered' into u/lftabera/ticket/8558`
`4cc4359`	`Create the classes of univariate polynomials over number fields`
`bb677fc`	`Add gcd method to polynomial_number_field using pari`
`e29c687`	`Merge branch 'u/lftabera/gcd_number_field_pari' into u/lftabera/ticket/8558`

lftabera commented 10 years ago

comment:38

I have split the ticket into three tickets for easier reviewer and to allow partial merging.

lftabera commented 10 years ago

Changed dependencies from #14186 to #14186, #15803, #15804

pjbruin commented 10 years ago

Work Issues: does not merge with 6.2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`e6bd63b`	`Deprecate centerlift method`
`169cb08`	`Make global lift_centered function fallback to lift if the object`
`868e571`	`Catch exceptions instead of using hassattr.`
`1266aaa`	`Merge branch 'u/lftabera/lift_centered' into u/lftabera/ticket/8558`
`341a8f7`	`Use proper constructions for the library. Minor fixes in other parts of the code.`
`3ff4ae8`	`Merge branch 'u/lftabera/gcd_number_field_pari' into u/lftabera/ticket/8558`
`f39d0ab`	`Merge branch 'master' into u/lftabera/ticket/8558`
`f57b651`	`fix import of lift_centered`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from e29c687 to f57b651

sagemath / sage

add a fast gcd algorithm for univariate polynomials over absolute number fields #8558