Implement QQ['x'] via Flint ZZ['x'] + denominator

malb commented 16 years ago

Bill Hart wrote on [sage-devel]:

""" Almost everything over Q should probably be converted to a problem over Z. I haven't seen any polynomial problems over Q which should not be dealt with this way so far, but I suppose they may exist. """

Further justification:

sage: f = R.random_element(2000)
sage: g = R.random_element(2000)
sage: fD = f.denominator()
sage: gD = g.denominator()
sage: fZ = (fD * f).change_ring(ZZ)
sage: gZ = (gD * g).change_ring(ZZ)
sage: %time _ = f*g
CPU times: user 0.63 s, sys: 0.02 s, total: 0.66 s
Wall time: 0.67 s

sage: %time _ = (fZ*gZ)
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01 s

sage: %time _ = (fZ*gZ)/(fD*gD) 
CPU times: user 0.06 s, sys: 0.00 s, total: 0.06 s
Wall time: 0.06 s

sage: fM = magma(f)
sage: gM = magma(g)
sage: t = magma.cputime()
sage: _ = fM*gM
sage: magma.cputime(t)
0.059999999999999998

sage: f = R.random_element(4000) 
sage: g = R.random_element(4000) 
sage: fD = f.denominator()
sage: gD = g.denominator()
sage: fZ = (fD * f).change_ring(ZZ)
sage: gZ = (gD * g).change_ring(ZZ)
sage: %time _ = f*g
CPU times: user 2.11 s, sys: 0.00 s, total: 2.12 s
Wall time: 2.14 s
sage: %time _ = (fZ*gZ)
CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s
Wall time: 0.02 s
sage: %time _ = (fZ*gZ)/(fD*gD)
CPU times: user 0.14 s, sys: 0.01 s, total: 0.15 s
Wall time: 0.15 s
sage: fM = magma(f)
sage: gM = magma(g)
sage: t = magma.cputime()
sage: _ = fM*gM
sage: magma.cputime(t)
0.10000000000000001

CC: @burcin @sagetrac-drkirkby @sagetrac-spancratz @mwhansen @malb @jdemeyer @peterjeremy

Component: basic arithmetic

Author: Sebastian Pancratz, Martin Albrecht, William Stein, Jeroen Demeyer, Rob Beezer

Reviewer: John Cremona, Martin Albrecht, Alex Ghitza, Harald Schilly, William Stein, Mitesh Patel

Merged: sage-4.6.alpha3

Issue created by migration from https://trac.sagemath.org/ticket/4000

malb commented 15 years ago

Author: Martin Albrecht

malb commented 15 years ago

comment:2

The attached patch provides the basic skeleton for the proposed new implementation. The following already works with the attached patch:

sage: from sage.rings.polynomial.polynomial_rational_flint import Polynomial_rational_dense_flint
sage: P.<t> = QQ[]
sage: a = Polynomial_rational_dense_flint(P,1/2)
sage: b = Polynomial_rational_dense_flint(P,2/1)
sage: t = Polynomial_rational_dense_flint(P,is_gen=True)
sage: a*t
1/2*t
sage: a*t*b
t
sage: a*t*b*b
2*t
sage: a*t*b*(b*t)
2*t^2
sage: a*t*b*(b*t)*a
t^2

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:3

I've now implemented most methods in the prototype from the previous patch uploaded, and sent a message to sage-devel under the thread "Improving QQ['x']" with some questions.

Sebastian

malb commented 15 years ago

comment:4

Some remarks

the patch uses the old style docstring format, cf. http://wiki.sagemath.org/combinat/HelpOnTheDoc
you should claim copyright
cdef inline int _celement_canonicalise why not void?
all celement_foo functions should have doctests, which call the QQ[x] methods which call the celement_ functions
have you tried switching the default to this implementation to see how many doctests fail?

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:5

Hi Martin,

I've now implemented all methods in the prototype. (I'll attach a new patch in a few minutes.) All of your above comments make sense and I can go through this tomorrow. One thing I could not get done is make SAGE use this implementation by default. In the file polynomial_ring.py, I tried replacing the line 1185 with

from sage.rings.polynomial.polynomial_rational_flint import Polynomial_rational_dense_flint
element_class = Polynomial_rational_dense_flint

and while building still works, executing ./sage results in a whole screen full output, ending with

b37bb0/home/suser/sage-4.1.2.alpha0/local/bin/sage-sage: line 199: 16297 Aborted                 sage-ipython "$@" -i

Am I making a mistake in the way I am trying to switch the default, or does this due to problems in the actual new implementation? I've got no idea how to fix this. Of course, I am then happy to do lots of testing.

Kind regards,

Sebastian

malb commented 15 years ago

comment:6

I can take a look tomorrow to debug this.

malb commented 15 years ago

comment:7

I fixed the startup crash. I suggest you take a look at fmpq.diff to see what I changed. If you want to debug these kind of issues start Sage using sage -gdb or sage -valgrind (you will need to install the optional Valgrind SPKG for this to work). Note that there is still some conversion code missing in polynomial_rational_flint.pyx.

sage: P.<x> = QQ[]
sage: f = P.random_element(2000)
sage: g = P.random_element(2000)
sage: %time _ = f*g
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.02 s

sage: P.<x> = PolynomialRing(QQ,'x',implementation='pari')
sage: f = P.random_element(2000)
sage: g = P.random_element(2000)
sage: %time _ = f*g
CPU times: user 0.59 s, sys: 0.00 s, total: 0.59 s
Wall time: 0.59 s

sage: P.<x> = QQ[]
sage: f = P.random_element(5000)
sage: g = P.random_element(5000)
sage: %time _ = f*g
CPU times: user 0.03 s, sys: 0.00 s, total: 0.03 s
Wall time: 0.04 s

sage: fM = magma(f)
sage: gM = magma(g)
sage: t = magma.cputime()
sage: _ = fM*gM
sage: magma.cputime(t)
0.12

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:8

Thanks for the quick debugging and making the code accessible in SAGE for testing! I'll upload a new version of the patch later. Here are a few (unsorted) remarks:

The printing isn't "nice" yet: Rationals are still printed in the form r/s even if s divides r. In fact, all coefficients are printed with the common denominator of the polynomial. I've got an idea of how to fix that, but I am not sure it'll work; I'll give it a try later.
Say we set up to polynomial rings R[x] and S[y], one using the generic implementation and one using FLINT. Then sometimes (usually?) coercion like "f_flint = S(f_generic)" or "f_generic = R(f_flint)" works, but sometimes it ends in a segfault. For two random polynomials f and d, of two successive calls "q, s, t = xgcd(f, d)" the first one succeeded and the second one ended in a segfault. This seems very strange to me!
We achieve a performance gain except in the cases of addition and subtraction. (See below.)
The method xgcd doesn't give the right result yet, I'll look into that later.
I have no idea what you mean by "Note that there is still some conversion code missing in polynomial_rational_flint.pyx." Are there any examples of this in other files?
I'll write up the doctests later. Regarding your comments on using the "old" documentation style, I don't quite understand this.

Here is a complete (except for XGCD) list of performance comparisons, using the local installation of SAGE 4.1.2.alpha0 plus this patch on my laptop (Ubuntu 8.10, Intel Core 2 Duo). The first few tests, from comparison through to power, involve random polynomials f and g of degrees 2000, the division tests use random polynomials f and d of degrees 800 and 560, and for the GCD test f and d have degree 60 and 42. In each case, the first output line is for the generic implementation, the second output line is for the new implementation using FLINT.

```
Comparison: f == g
1 loops, best of 3: 10 µs per loop
1 loops, best of 3: 954 ns per loop
Comparison: f < g
1 loops, best of 3: 11 µs per loop
1 loops, best of 3: 1.91 µs per loop
Addition: f + g
1 loops, best of 3: 373 µs per loop
1 loops, best of 3: 2.26 ms per loop
Subtraction: f - g
1 loops, best of 3: 474 µs per loop
1 loops, best of 3: 2.23 ms per loop
Negation: -f
1 loops, best of 3: 12.9 ms per loop
1 loops, best of 3: 39.8 µs per loop
Multiplication: f * g
1 loops, best of 3: 549 ms per loop
1 loops, best of 3: 15.9 ms per loop
Power: f ** 4
1 loops, best of 3: 15.1 s per loop
1 loops, best of 3: 63.7 ms per loop
Division: q, r = f.quo_rem(d)
1 loops, best of 3: 2.42 s per loop
1 loops, best of 3: 177 ms per loop
Division: q = f // d
1 loops, best of 3: 2.43 s per loop
1 loops, best of 3: 63.9 ms per loop
Division: r = f % d
1 loops, best of 3: 2.43 s per loop
1 loops, best of 3: 193 ms per loop
GCD
1 loops, best of 3: 1.81 s per loop
1 loops, best of 3: 88.9 µs per loop
```

Sebastian

malb commented 15 years ago

comment:9

Replying to @sagetrac-spancratz:

Say we set up to polynomial rings R[x] and S[y], one using the generic implementation and one using FLINT. Then sometimes (usually?) coercion like "f_flint = S(f_generic)" or "f_generic = R(f_flint)" works, but sometimes it ends in a segfault. For two random polynomials f and d, of two successive calls "q, s, t = xgcd(f, d)" the first one succeeded and the second one ended in a segfault. This seems very strange to me!

Try running Sage with sage -gdb and/or sage -valgrind. The later requires the optional Valgrind SPKG. The output of valgrind is incredibly useful and can be found in ~/.sage/valgrind. If you don't get anywhere, I can take a look. But learning Valgrind is well worth it :)

We achieve a performance gain except in the cases of addition and subtraction. (See below.)

We should think about how to make it more efficient, e.g. by only multiplying by the multiplier to get the LCM? Magma can do it faster than what we can do it seems.

The method xgcd doesn't give the right result yet, I'll look into that later.

I have no idea what you mean by "Note that there is still some conversion code missing in polynomial_rational_flint.pyx." Are there any examples of this in other files?

You are right, the overflow I was expecting doesn't happen (I think this is handled correctly in the base ring). We should consider making x + 1 (1 either int or Integer) fast though by writing special code similar to the Rational code in the __init__ function of polynomial_rational_flint.pyx. Also, construction from a list P([1,2,3,4]) should be made faster, cf. the zmod_poly implementation.

I'll write up the doctests later. Regarding your comments on using the "old" documentation style, I don't quite understand this.

You wrote e.g. \code{foo} which is the old LaTeX style. It should be using the Sphinx markup now.

Here is a complete (except for XGCD) list of performance comparisons, using the local installation of SAGE 4.1.2.alpha0 plus this patch on my laptop (Ubuntu 8.10, Intel Core 2 Duo). The first few tests, from comparison through to power, involve random polynomials f and g of degrees 2000, the division tests use random polynomials f and d of degrees 800 and 560, and for the GCD test f and d have degree 60 and 42. In each case, the first output line is for the generic implementation, the second output line is for the new implementation using FLINT.

This is encouraging!

malb commented 15 years ago

comment:10

Also, I think our design for .den is false. It shouldn't be preallocated because this makes you call realloc, i.e. we have two system calls instead of one. This is quite expensive.

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:11

Actually, I am not quite sure about this. When working with a random polynomial of degree 2000, which will have lots of non-zero entries all of type fmpz_t, it shouldn't really matter whether we manually initialise a few more for the denominators.

I've tried implementing the denominator with the convention that it is either NULL (which should be interpreted as one) or initialised to a positive integer. But this didn't really change the performance.

Another idea, which will sometimes help to keep numbers small, is to instead represented the polynomial over the rationals as (num / dem) prim where num / dem is a rational number in reduced form and prim is a primitive integer polynomial with positive leading coefficient. Obviously, this change vastly improved the performance of negation (which then only operates on the rational number and leaves the integer polynomial part alone). But it didn't change much apart from that. Anyway, given we need to compute the content of the numerator anyway to ensure that it is coprime to the denominator, we might as well store it separately. I'll implement this throughout the patch and upload a new version later today.

This still leaves the problem: How can we speed up addition?

At the moment, I don't have any further ideas. In fact, I think it might perhaps be the case that we simply can't, since in this kind of implementation we have to at least do a few polynomial scalar multiplications (and perhaps polynomial scalar divisions as well as integer gcd computations to maintain the form of the representation) plus all the coefficient additions. In contrast to this, implementing polynomials as an array of coefficients one only has to do the (rational!) coefficient additions.

So the next things I'll do are

Change the fmpq_poly_t data type
Change all the methods accordingly
Write docstrings

Does anyone have any ideas about the addition?

Sebastian

malb commented 15 years ago

comment:12

Replying to @sagetrac-spancratz:

Actually, I am not quite sure about this. When working with a random polynomial of degree 2000, which will have lots of non-zero entries all of type fmpz_t, it shouldn't really matter whether we manually initialise a few more for the denominators.

We shouldn't forget about small polynomials, they should be fast too. Two instead of one system call sounds rather expensive to me for basic arithmetic.

I've tried implementing the denominator with the convention that it is either NULL (which should be interpreted as one) or initialised to a positive integer. But this didn't really change the performance.

Did you try small examples? Also, how much does the realloc trick you implemented give you?

Another idea, which will sometimes help to keep numbers small, is to instead represented the polynomial over the rationals as (num / dem) prim where num / dem is a rational number in reduced form and prim is a primitive integer polynomial with positive leading coefficient. Obviously, this change vastly improved the performance of negation (which then only operates on the rational number and leaves the integer polynomial part alone). But it didn't change much apart from that. Anyway, given we need to compute the content of the numerator anyway to ensure that it is coprime to the denominator, we might as well store it separately. I'll implement this throughout the patch and upload a new version later today.

This still leaves the problem: How can we speed up addition?

Did you try the LCM idea? Rationale:

sage: P.<x> = QQ[]
sage: f = P.random_element(3000)
sage: g = P.random_element(3000)
sage: fD = f.denominator()
sage: gD = g.denominator()
sage: (fD*gD).nbits()
320
sage: (fD.lcm(gD)).nbits()
228

At the moment, I don't have any further ideas. In fact, I think it might perhaps be the case that we simply can't, since in this kind of implementation we have to at least do a few polynomial scalar multiplications (and perhaps polynomial scalar divisions as well as integer gcd computations to maintain the form of the representation) plus all the coefficient additions. In contrast to this, implementing polynomials as an array of coefficients one only has to do the (rational!) coefficient additions.

Well, this other implementation would have to do quite a few rational additions where it would have to deal with denominators quite a bit. I am not convinced yet it has to be this slow. You could also ask on [sage-devel] I am sure, e.g. Bill Hart (main author of FLINT) would have some cool ideas.

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:13

I am sorry for the delay in working on this. Rather than trying the approach of writing the polynomial as r A / s, I've tried again to write this as A / s only as you laid out initially, this time trying really hard to avoid allocating anything new. Not everything is working again yet, I still need to re-write the three division functions, exponentiation and the two gcd functions. The upside is that everything seems to be lots faster now :).

Hopefully I'll be able to upload something useful tomorrow.

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:14

The switch to include NULL denominators still isn't quite done. However, I think addition and multiplication are bug-free already and show another massive improvement in speed. There are definitely still bugs in the division method, and the modular exponentiation as well as the gcd methods aren't implemented yet. I should be able to look into this tomorrow.

Sebastian

malb commented 15 years ago

comment:15

I can't test the most current patch (on geom.math):

sage: P.<x> = PolynomialRing(QQ)
sage: f = P.random_element(2000)
...
__celement_den_fit_limbs
Error: division by zero!
/scratch/malb/sage-4.1.2.alpha1/local/bin/sage-sage: line 199: 12195 Aborted                 sage-ipython "$@" -i

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:16

Hi Martin,

I am sorry for that. Last night and this morning I fixed another couple of bugs. In a few minutes, I'll upload a new patch (or rather, again the difference from the 20090911 patch). By the way, for debugging purposes all methods in the linkage file now begin with printing the method's name (although in all but the floordiv method, that line begins with '#').

Random (performance!, not correctness...) tests ran fine last night for polynomials of degree 2000 for the methods ==, <, +, -, neg, *, ^. I thought the three division methods should work fine now, until I stumbled across the following segfault:

```
sage: S.<y> = QQ[]
sage: f = -3 * y^10 - 4 * y^9
sage: g = (1/2) * y^6 + 3 * y^5 + 160930 * y^4
sage: f // g
celement_floordiv
Perform pseudo division -3*x^10-4*x^9 x^6+6*x^5+321860*x^4

------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occured in SAGE.
This probably occured because a *compiled* component
of SAGE has a bug in it (typically accessing invalid memory)
or is not properly wrapped with _sig_on, _sig_off.
You might want to run SAGE under gdb with 'sage -gdb' to debug this.
SAGE will now terminate (sorry).
------------------------------------------------------------
```

This strikes me as very odd because the segfault seems to occur in the call fmpz_poly_pseudo_div(q.num, &m, a.num, b.num) with a.num the polynomial -3*x<sup>10-4*x</sup>9 and b.den the polynomial x<sup>6+6*x</sup>5+321860*x^4. Perhaps you could have a look at this one?

I haven't looked at the two gcd methods yet, but I'll do that later today or tomorrow.

As the last question about the implementation (for this method), I noticed that polynomials over QQ in SAGE have the method denominator, which clearly this implementation should overwrite. On which level/ in which file should this be done?

Finally, here are the performance timings, in each case for ten random polynomials of degree 2000, first the time for the generic implementation and then the time for this implementation with FLINT:

== - 20µs, 1µs
< - 20µs, 1µs
+ - 400µs, 100µs
- - 400µs, 100µs
neg - 20ms, 20µs
* - 500ms, 1ms
^ (to the 4th power) - 15s, 30µs

Kind regards,

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:17

Hi Martin,

I just started to look at the gcd methods again and I also looked at the logic in polynomial_template.pxi. Here's the question:

Since the gcd of two polynomials is only defined up to multiplication by rationals, what's the right way of dealing with this? I think one can make a good argument for always returning the same normalisation. This would also mean that we do not necessarily have gcd(a,0) == a. This is currently the way it's handled in the file polynomial_template.pxi. If we want to normalise the gcd, in which way should this be done? If it's non-zero..

Monic rational polynomial
Primitive integer polynomial with positive leading coefficient

Of course, there are lots more but I think these two might be the most sensible choices.

The first one has the advantage that it generalises to all polynomial rings (over commutative rings with 1, at least). Upon adding a method returning the monic scalar multiple of a polynomial to the template file, one can still handle the cases of gcd(a,0) and gcd(0,b) in the template file.

Personally though, I am more in favour of the second option, since this might lead to faster code when working with QQ[]. In this case, we should remove the handling of the above two cases from the template file and always pass the call on to celement_gcd. This would mean that we leave the normalisation up to the actual implementation of the polynomial ring, rather than enforcing it across all base rings using the template file. We would then also have to make sure that all celement_gcd methods are happy to deal with zero arguments.

What do you think?

Sebastian

malb commented 15 years ago

comment:18

Replying to @sagetrac-spancratz:

Perhaps you could have a look at this one?

I will (hopefully) take a look later this week.

As the last question about the implementation (for this method), I noticed that polynomials over QQ in SAGE have the method denominator, which clearly this implementation should overwrite. On which level/ in which file should this be done?

You would add a method denominator() to Polynomial_rational_dense_flint.

Finally, here are the performance timings, in each case for ten random polynomials of degree 2000, first the time for the generic implementation and then the time for this implementation with FLINT:

If I understand this correctly, then addition is 20x faster than the previous implementation just because you avoid a remalloc?

malb commented 15 years ago

comment:19

Replying to @sagetrac-spancratz:

Since the gcd of two polynomials is only defined up to multiplication by rationals, what's the right way of dealing with this? I think one can make a good argument for always returning the same normalisation. This would also mean that we do not necessarily have gcd(a,0) == a. This is currently the way it's handled in the file polynomial_template.pxi. If we want to normalise the gcd, in which way should this be done? If it's non-zero..

I think we should have gcd(a,0) = 1 because this is what gcd(1/2,0) returns. I would like to avoid to put this logic in the celement_gcd implementations but if we have to then ... well we have to :)

Personally though, I am more in favour of the second option, since this might lead to faster code when working with QQ[]. In this case, we should remove the handling of the above two cases from the template file and always pass the call on to celement_gcd. This would mean that we leave the normalisation up to the actual implementation of the polynomial ring, rather than enforcing it across all base rings using the template file. We would then also have to make sure that all celement_gcd methods are happy to deal with zero arguments.

This might be worth raising on [sage-devel] where people care much more about this than I do, i.e. I guess it is a relevant corner case for number theory and thus people might have strong feelings about it?

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:20

As the last question about the implementation (for this method), I noticed that polynomials over QQ in SAGE have the method denominator, which clearly this implementation should overwrite. On which level/ in which file should this be done?

You would add a method denominator() to Polynomial_rational_dense_flint.

OK, I'll do that.

Finally, here are the performance timings, in each case for ten random polynomials of degree 2000, first the time for the generic implementation and then the time for this implementation with FLINT:

If I understand this correctly, then addition is 20x faster than the previous implementation just because you avoid a remalloc?

Yes. Actually, throughout I am now trying very hard to re-use variables rather than allocating new variables all over the place. It makes the code quite ugly... but definitely faster, which this is about, right? :)

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:21

Replying to @malb:

Replying to @sagetrac-spancratz: I think we should have gcd(a,0) = 1 because this is what gcd(1/2,0) returns. I would like to avoid to put this logic in the celement_gcd implementations but if we have to then ... well we have to :)

I didn't mean the above for rational numbers a, but for rational polynomials a. Your integer example above highlights that gcd doesn't necessarily guarantee gcd(a, 0) == a. The behaviour of gcd for integers suggests the method should return the monic normalisation. However, the current logic in template_polynomial.pxi doesn't do this, for example:

```
sage: R.<t> = PolynomialRing(IntegerModRing(3), 't')
sage: f = 2*t + 1
sage: type(f)
<type 'sage.rings.polynomial.polynomial_zmod_flint.Polynomial_zmod_flint'>
sage: gcd(f, R(0))
2*t + 1
```

In the above case, the monic version would be t + 2.

This might be worth raising on [sage-devel] where people care much more about this than I do, i.e. I guess it is a relevant corner case for number theory and thus people might have strong feelings about it?

OK, I'll do this.

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:22

I have now opened another ticket, #6941, to change the template implementation, pushing all the logic into the celement_foo methods rather than taking away the cases gcd(a,0) and gcd(0,b) on a higher level. The patch is very short --- it only does the small expected changes to the template file and the GF2X and ZMOD linkage files, plus one other file in the hyperelliptic curve module where the current behaviour of xgcd is used.

Martin, would you be happy to review that patch?

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:23

I've now implemented almost all functionality from the former generic implementation, most of it massively improved through FLINT. There are also some new methods. All of this is in the patch I uploaded just now, still as a difference to the 20090911 patch. Running make test still results in many failures, although fewer than last week.

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:24

Apart from a bad indentation in polynomial_quotient_ring_element, for which I didn't want to re-upload a patch, I am now down to the following doctest failures:

    ```
    sage -t  "devel/sage/sage/schemes/elliptic_curves/padic_lseries.py"
    sage -t  "devel/sage/sage/schemes/elliptic_curves/ell_generic.py"
    sage -t  "devel/sage/sage/schemes/elliptic_curves/padics.py"
    sage -t  "devel/sage/sage/schemes/elliptic_curves/ell_curve_isogeny.py"
    sage -t  "devel/sage/sage/tests/book_stein_modform.py"
    sage -t  "devel/sage/sage/rings/qqbar.py"
    sage -t  "devel/sage/sage/rings/number_field/number_field_element.pyx"
    sage -t  "devel/sage/sage/modular/modform/element.py"
    sage -t  "devel/sage/sage/modular/overconvergent/genus0.py"
    sage -t  "devel/sage/sage/modular/hecke/submodule.py"
    sage -t  "devel/sage/sage/structure/sage_object.pyx"
    ```

All but one of them are memory problems, either in mpq_canonicalize (called in the __getitem__ method) or in fmpz_poly_mul called in celement_mul. At the moment, I do not know how to resolve these.

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:25

Almost all of the memory problems are now resolved. They were arising because I wrongly assumed fmpz_ methods (not fmpz_poly_; they work just fine) supported aliasing of the inputs and outputs. Apart from the method fmpz_neg, I think I have fixed this in all places where I am using it, apart from the square-free decomposition in polynomial_rational_flint.pyx. I'll rewrite that, too, but I've already checked that this method does not get called in the following last two remaining doctest failures:

```
The following tests failed:

    sage -t  "devel/sage/sage/rings/qqbar.py"
    sage -t  "devel/sage/sage/structure/sage_object.pyx"
```

The test in qqbar.py that seems to break down is the following piece of code:

```
sage: x = polygen(AA)
sage: r = QQbar.polynomial_root(x^5 - x - 1, CIF(RIF(0.1, 0.2), RIF(1.0, 1.1))); r
sage: r.real().minpoly()
```

The test in sage_object.pyx that breaks has to do with unpickling, and it's triggered by the following two lines:

```
sage: std = os.environ['SAGE_DATA'] + '/extcode/pickle_jar/pickle_jar.tar.bz2'
sage: sage.structure.sage_object.unpickle_all(std)
```

I will try to chase down the first problem a little further than the minpoly() function, perhaps I can resolve it myself. But any help with the second problem in particular would be very much appreciated.

Sebastian

mwhansen commented 15 years ago

comment:26

I've added a patch which fixes the unpickling problem making sure that the old polynomials unpickle as the new polynomials.

mwhansen commented 15 years ago

comment:27

This gives the same answers for me before and after the patch.

    sage: x = polygen(AA)
    sage: r = QQbar.polynomial_root(x^5 - x - 1, CIF(RIF(0.1, 0.2), RIF(1.0, 1.1))); r
    sage: r.real().minpoly()

The only difference is the test on line 2262. It expected

    cp = AA.common_polynomial(1/2*x^4 - 1/95*x^3 - 1/2*x^2 - 4)

but got

    cp = AA.common_polynomial(x^4 - 2/95*x^3 - x^2 - 8)

I think that this is okay since they only differ by a multiple of 2 and thus have the exact same roots.

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:28

Hello Mike,

Thank you for fixing the unpickling problem!

About the second problem, the one in qqbar.py, are you saying that the following code

```
sage: x = polygen(AA)
sage: r = QQbar.polynomial_root(x^5 - x - 1, CIF(RIF(0.1, 0.2), RIF(1.0, 1.1))); r
sage: r.real().minpoly()
```

now executes without problems on your setup? On mine, it still crashes with a segmentation fault. I've uploaded the complete traceback to http://sage.pastebin.com/m5249e09. I still seems strange to me that the traceback doesn't seem to contain methods that this patch modifies directly.

The other difference you mention is no problem, of course. That's because I have taken care to ensure that methods returning results that are only defined up to units return monic normalisations.

Many thanks again for taking a look at this patch,

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:29

As said, the above three lines of code extracted from the qqbar.py doctests still cause a problem for me. I've chased it down for the last three hours now, and the following code breaks on my setup:

```
sage: R.<x> = QQ[]
sage: f = 422826864750/4773824138704099*x^18 - 8134231405059/9547648277408198*x^16 + 11311262264874/4773824138704099*x^14 - 12814039341867/4773824138704099*x^12 - 8509019074752/4773824138704099*x^10 + 707815020483605/9547648277408198*x^8 - 1781974116019893/4773824138704099*x^6+ 1316925435907659/4773824138704099*x^4 - 1088322011947813/9547648277408198*x^2 - 1/2*x + 1289415905296105/4773824138704099
sage: g = -76937/62774*x^19 - 30011/62774*x^18 + 144945/31387*x^17 + 174999/62774*x^16 - 377075/31387*x^15 - 354028/31387*x^14 + 929437/62774*x^13 + 983229/62774*x^12 - 725164/31387*x^11 - 984029/31387*x^10 + 945031/62774*x^9 + 1132829/31387*x^8 + 277343/31387*x^7 - 1107925/62774*x^6 - 432756/31387*x^5 - 23909/62774*x^4 + 202423/31387*x^3 + 167709/31387*x^2 - 10729/31387*x - 47216/31387
sage: f(g)
```

I've upload a complete log of the session to [url]http://sage.pastebin.com/m7757deba[/url]. I am happy also re-implement polynomial composition using FLINT, it should be a lot faster than the generic code for this anyway. (Idea: To compose F = f/d with G = g/e, where f, g are in ZZ[] and d, e are integers, first "rescale" F by 1/e --- this method is implemented already --- and then compose the new polynomial with g. There is a FLINT function for the last part.) However, I don't know how or where the generic code is implemented in SAGE.

Sebastian

mwhansen commented 15 years ago

comment:30

Replying to @sagetrac-spancratz:

As said, the above three lines of code extracted from the qqbar.py doctests still cause a problem for me. I've chased it down for the last three hours now, and the following code breaks on my setup:
```
sage: R.<x> = QQ[]
sage: f = 422826864750/4773824138704099*x^18 - 8134231405059/9547648277408198*x^16 + 11311262264874/4773824138704099*x^14 - 12814039341867/4773824138704099*x^12 - 8509019074752/4773824138704099*x^10 + 707815020483605/9547648277408198*x^8 - 1781974116019893/4773824138704099*x^6+ 1316925435907659/4773824138704099*x^4 - 1088322011947813/9547648277408198*x^2 - 1/2*x + 1289415905296105/4773824138704099
sage: g = -76937/62774*x^19 - 30011/62774*x^18 + 144945/31387*x^17 + 174999/62774*x^16 - 377075/31387*x^15 - 354028/31387*x^14 + 929437/62774*x^13 + 983229/62774*x^12 - 725164/31387*x^11 - 984029/31387*x^10 + 945031/62774*x^9 + 1132829/31387*x^8 + 277343/31387*x^7 - 1107925/62774*x^6 - 432756/31387*x^5 - 23909/62774*x^4 + 202423/31387*x^3 + 167709/31387*x^2 - 10729/31387*x - 47216/31387
sage: f(g)
```
I've upload a complete log of the session to [url]http://sage.pastebin.com/m7757deba[/url]. I am happy also re-implement polynomial composition using FLINT, it should be a lot faster than the generic code for this anyway. (Idea: To compose F = f/d with G = g/e, where f, g are in ZZ[] and d, e are integers, first "rescale" F by 1/e --- this method is implemented already --- and then compose the new polynomial with g. There is a FLINT function for the last part.) However, I don't know how or where the generic code is implemented in SAGE.

Sebastian

This does not crash for me on 64-bit linux with both FLINT 1.3.0 and FLINT 1.5.0. You should try the FLINT 1.5.0 spkg at http://sage.math.washington.edu/home/mhansen/flint-1.5.0.spkg.

--Mike

sage: f = 422826864750/4773824138704099*x^18 - 8134231405059/9547648277408198*x^16 + 11311262264874/4773824138704099*x^14 - 12814039341867/4773824138704099*x^12 - 8509019074752/4773824138704099*x^10 + 707815020483605/9547648277408198*x^8 - 1781974116019893/4773824138704099*x^6+ 1316925435907659/4773824138704099*x^4 - 1088322011947813/9547648277408198*x^2 - 1/2*x + 1289415905296105/4773824138704099
sage: g = -76937/62774*x^19 - 30011/62774*x^18 + 144945/31387*x^17 + 174999/62774*x^16 - 377075/31387*x^15 - 354028/31387*x^14 + 929437/62774*x^13 + 983229/62774*x^12 - 725164/31387*x^11 - 984029/31387*x^10 + 945031/62774*x^9 + 1132829/31387*x^8 + 277343/31387*x^7 - 1107925/62774*x^6 - 432756/31387*x^5 - 23909/62774*x^4 + 202423/31387*x^3 + 167709/31387*x^2 - 10729/31387*x - 47216/31387
sage: f(g)
}}}
I've upload a complete log of the session to [url]http://sage.pastebin.com/m7757deba[/url]. I am happy also re-implement polynomial composition using FLINT, it should be a lot faster than the generic code for this anyway. (Idea: To compose F = f/d with G = g/e, where f, g are in ZZ[] and d, e are integers, first "rescale" F by 1/e --- this method is implemented already --- and then compose the new polynomial with g. There is a FLINT function for the last part.) However, I don't know how or where the generic code

5689d088-9d81-41e2-b555-2341cea5bc21 commented 15 years ago

comment:31

Firstly, I am sorry for the bad patches I uploaded earlier --- I didn't realise that new files aren't included in a patch by default. I have changed this now and uploaded a new complete patch fmpq_20090925.patch.

There is one problem with the squarefree_decomposition method, which for some unknown reason was causing memory failures in the latest version. For the time being, I've just changed it back to my earlier code, which still uses bad aliasing of arguments to fmpz_ methods.

Mike: Thanks for taking a look at the composition problem, too. The failure on my setup must be rather strange, since the traceback also includes finance/fractal.so. I don't think I understand the __call__ method well enough to do much about it. In any case, I think the current code (catching polynomial composition, and otherwise passing the call on to Polynomial.__call__) should be preferable.

I am not quite sure what I should do at this point. I think it would be best to wait until the release of the next stable release of SAGE and then look at this again with the goal to have sorted out as soon as possible. What do other people think? If someone has a suggestion for what I should do at the moment, while I might not have too much time during the next few weeks, I should definitely be able to look at this every weekend.

Sebastian

malb commented 15 years ago

comment:32

Sebastian, what's the current status of this code? What needs to be done to finish it etc?

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:33

The plan is to re-base this on #383, which should take care of two segfaults that currently remain.

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:34

I've now added two patches to this. The first one trac383.patch contains all three patches from ticket #383. The second patch trac4000_rebase_431rc0_383.patch is the main patch from this patch, which is now rebased on 4.3.1.rc0 and the first patch. With this, the only remaining doctests failures are

sage -t  "devel/sage-qq/sage/combinat/species/composition_species.py"
**********************************************************************
File "/scratch/pancratz/sage-4.3.1.rc0/devel/sage-qq/sage/combinat/species/composition_species.py", line 235:
    sage: S.isotype_generating_series().coefficients(5) #indirect
Expected:
    [1, t, t^2 + t, t^3 + t^2 + t, t^4 + t^3 + 2*t^2 + t]
Got:
    [1, t, 1/2*t^2, 1/6*t^3, 1/24*t^4]
**********************************************************************
File "/scratch/pancratz/sage-4.3.1.rc0/devel/sage-qq/sage/combinat/species/composition_species.py", line 247:
    sage: Par.isotype_generating_series().coefficients(5)
Expected:
    [1, t, t^2 + t, t^3 + t^2 + t, t^4 + t^3 + 2*t^2 + t]
Got:
    [1, t, 1/2*t^2 + 1/2*t, 1/6*t^3 + 1/2*t^2 + 1/6*t, 1/24*t^4 + 1/4*t^3 + 7/24*t^2 + 1/24*t]
**********************************************************************
1 items had failures:
   2 of  15 in __main__.example_11
***Test Failed*** 2 failures.
For whitespace errors, see the file /home/pancratz/.sage//tmp/.doctest_composition_species.py
         [5.7 s]
exit code: 1024

----------------------------------------------------------------------
The following tests failed:

        sage -t  "devel/sage-qq/sage/combinat/species/composition_species.py"

I am not sure what's going on here. Could someone else please take a look at this?

Thanks!

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

Changed author from Martin Albrecht to Sebastian Pancratz, Martin Albrecht

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:35

In addition to my earlier post, I seems there is a problem with the first patch (the one collecting the three patches from #383 for convenience), which I am sorry for. Nonetheless, applying the three patches straight from that ticket and adding the second patch above yields the desired state, apart from the one remaining doctest.

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:36

The above ticket provided by Mike Hanses applies to 4.3.1.rc0 after applying the three tickets from #383. We've tested on two separate machines and it passes all doctests.

Thanks to Mike for helping to track down the (hopefully) last remaining bug before lunchtime!

I'll go over the code again in the next week or two, adding further documentation and more doctests. In the meantime, I would be very grateful if other people interested in reviewing it could start looking at the code and provide further comments.

Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:37

In the above three patches, I've now added some further documentation and made cosmetic changes to the layout in some files. I still want to add many more test cases for the polynomial arithmetic. Please let me know if there is something else that you think I ought to change.

Sebastian

aghitza commented 14 years ago

comment:38

Sebastian, I am marking this ticket as needs_work since you are saying that you want to add more tests. Just mark it needs_review when you feel it's ready.

malb commented 14 years ago

comment:39

@Sebastian, what do I have to apply in which order to test your patches? @Alex, we can probably start testing stuff except we shouldn't complain about missing doctests yet.

mwhansen commented 14 years ago

comment:40

I believe all of the above patches in the order they appear.

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:41

Martin: Yes, they should apply to 4.3.1.rc0 in the order they appear. Although, after applying the main patch, there should be some choice on the order in which you apply the patches as they mostly (apart from fmpq_poly and fmpq_poly_alias) touch distinct sets of files, I think.

Morally speaking, this should definitely be needs_review (although I understand if someone would formally want to argue that this should be needs_work), and it'd be great if you could let me know of any changes I should make, including doctests. The ones I intend to add weren't for any specific method, but rather for the arithmetic in \QQ[t], which I think would have to go to the top of the file polynomial_rational_flint since the methods themselves are in the polynomial template file.

Thanks,

Sebastian

aghitza commented 14 years ago

comment:42

Having applied all the patches, I'm getting:

sage: R.<x> = QQ[]
sage: S.<a> = R.quotient(3*x^3 + 3/2*x -1/3)
sage: 3 * a^3 + S.modulus()
Error: unable to alloc/realloc memory
/home/ghitza/sage-devel/local/bin/sage-sage: line 206: 13092 Aborted                 (core dumped) sage-ipython "$@" -i

(This is a doctest in rings/polynomial/polynomial_quotient_ring_element.py, which is how I ran into it.)

I don't know if it matters, but this is happening on a 32-bit machine.

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:43

There's clearly something dodgy going on here. On my 32-bit laptop, with a clean 4.3.1.rc0 install and only the patches from this thread,

sage: R.<x> = QQ[]
sage: S.<a> = R.quotient(3*x^3 + 3/2*x -1/3)
sage: 3 * a^3 + S.modulus()
-3/2*a + 1/3
sage: timeit('_ = 3 * a^3 + S.modulus()')
5 loops, best of 3: 14.3 s per loop

That is, it takes forever... Alex, could you perhaps elaborate on your setup?

Thanks, Sebastian

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:44

Here is a simpler instance of the problem:

sage: R.<x> = QQ[]
sage: f = 3/2*x - 1/3
sage: %time _ = f % f
CPU times: user 5.67 s, sys: 0.17 s, total: 5.84 s
Wall time: 5.86 s

I do not think that the problem is a coercion problem. After inserting "print" statements into the Cython code at various points, I instead think the code in the block from line 881 in fmpq_poly_linkage.pxi actually takes this long, although I do not understand at all why this might be the case.

The two lines at actually seem to take time (assuming that inserting "print" statements is a valid way to determine this) are

    fmpz_pow_ui(t, lead, m)
    fmpz_mul(r.den, t, a.den)

but, once again, I've got not clue why this might be the case.

Tomorrow or on Tuesday, I will try to reproduce the problem in plain C. If I manage to do this, I'll forward it to Bill Hart. If not, I wouldn't really know what else to look into.

Sebastian

aghitza commented 14 years ago

comment:45

Replying to @sagetrac-spancratz:

There's clearly something dodgy going on here. On my 32-bit laptop, with a clean 4.3.1.rc0 install and only the patches from this thread,
sage: R.<x> = QQ[]
sage: S.<a> = R.quotient(3*x^3 + 3/2*x -1/3)
sage: 3 * a^3 + S.modulus()
-3/2*a + 1/3
sage: timeit('_ = 3 * a^3 + S.modulus()')
5 loops, best of 3: 14.3 s per loop

I've tried a couple more times and I'm still getting the memory problem followed by crash and core dump. I'm running 32-bit archlinux on a Dell laptop, with version 4.4.2 of gcc. It's a clean build of sage-4.3.1 with the patches here.

I also have a macbook running 64-bit archlinux. It's busy doing other things now, but I can try to test this on it later.

aghitza commented 14 years ago

comment:46

Replying to @sagetrac-spancratz:

Here is a simpler instance of the problem:

sage: R.<x> = QQ[]
sage: f = 3/2*x - 1/3
sage: %time _ = f % f
CPU times: user 5.67 s, sys: 0.17 s, total: 5.84 s
Wall time: 5.86 s

On my laptop, this gives me

----------------------------------------------------------------------
| Sage Version 4.3.1, Release Date: 2010-01-20                       |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
sage: sage: R.<x> = QQ[]
sage: sage: f = 3/2*x - 1/3
sage: sage: %time _ = f % f
Error: unable to alloc/realloc memory
/opt/sage-4.3.1-archlinux-32bit-i686-Linux/local/bin/sage-sage: line 206: 17772 Aborted                 (core dumped) sage-ipython "$@" -i

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:47

I am sorry for only getting on with this today. Just now I tried to re-produce the problem in plain C with FLINT, but needless to say, I didn't manage. A simple-minded reproduction of the relevant code executed in no time and without any problems. Currently, I am completely out of ideas on how to fix this problem. That's why I'll raise the issue on sage-devel and conclude this message with a description of the behaviour that I experience on my machine (Lenovo T500 laptop, Intel Core2 Duo CPU, Ubuntu 9.10):

After applying all patches from this ticket to a 4.3.1.rc0 installation, modify the else block from line 882 in sage/libs/flint/fmpq_poly_linkage.pxi to the following:

            print "In case 3B"
            t = fmpz_init(limbs)
            print "den_fit_limbs"
            __fmpq_poly_den_fit_limbs(r, limbs + fmpz_size(a.den))
            print "pow_ui"
            fmpz_pow_ui(t, lead, m)
            print "mul"
            fmpz_mul(r.den, t, a.den)
            print "clear"
            fmpz_clear(t)

This only includes the print commands.

Then, upon executing the same sequence of commands that produce the crash in Alex' previous message, I receive the following output:

sage: R.<x> = QQ[]
sage: f = 3/2*x - 1/3
sage: %time _ = f % f
In case 3B
den_fit_limbs
pow_ui
mul
clear
CPU times: user 19.10 s, sys: 0.54 s, total: 19.64 s
Wall time: 19.72 s

What I find very strange (besides the fact that this takes 20s to obtain the correct result) is that there are two very noticeable delays of a couple of seconds, one after the output "pow_ui", and another after the output of "mul".

Sebastian

malb commented 14 years ago

comment:48

Did you do a sanity checks on the inputs to each function? E.g. what is limbs, What is r? What does fmpz_size(a.den) return? and so forth? Alex error message complains about some allocation not working so I'd assume some wrong size (negative or unitialised?) is passed to FLINT?

5689d088-9d81-41e2-b555-2341cea5bc21 commented 14 years ago

comment:49

Hi Martin,

Thank you for joining in! I've just done this now and posted on sage-devel. The problem doesn't seem to be in fmpz_pow_ui or fmpz_mul, but in fmpz_poly_pseudo_divrem. Well, I am not sure it's a problem in FLINT, since I couldn't reproduce it in plain C yet, but's related to that bit in the code rather than the later calls.

Sebastian

sagemath / sage

Implement QQ['x'] via Flint ZZ['x'] + denominator #4000