sagemath / sage

Main repository of SageMath
https://www.sagemath.org

Faster qqbar operations using resultants #17886

Open 8d15854a-f726-4f6b-88e7-82ec1970fbba opened 9 years ago

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago

This is a spin-off from #16964 comment:31.

Many operations on algebraic numbers can become painfully slow. Most of these operations can be expressed in terms of resultants, and surprisingly the corresponding computations are sometimes far faster than what Sage currently does. So much faster, in fact, that I'm not sure whether to consider this ticket a request for enhancement or a defect report.

Take for example the difference between two algebraic numbers r1 and r2, which are defined as follows:

sage: x = polygen(ZZ)
sage: p1 = x^5 + 6*x^4 - 42*x^3 - 142*x^2 + 467*x + 422
sage: p2 = p1((x-1)^2)
sage: r1 = QQbar.polynomial_root(p2, CIF(1, (2.1, 2.2)))
sage: r2 = QQbar.polynomial_root(p2, CIF(1, (2.8, 2.9)))

Computing their exact difference takes like forever:

sage: r4 = r1 - r2
sage: %time r4.exactify()
CPU times: user 2h 57min 1s, sys: 2.16 s, total: 2h 57min 3s
Wall time: 2h 57min 5s

On the other hand, computing a polynomial which has the difference as one root can be achieved fairly easily using resultants, and the resulting number is obtained in under one second:

sage: a,b = polygens(QQ, 'a,b')
sage: %time p3 = r1.minpoly()(a + b).resultant(r2.minpoly()(b), b)
CPU times: user 62 ms, sys: 0 ns, total: 62 ms
Wall time: 68 ms
sage: rs = [r for f in p3.factor()
....:       for r in f[0].univariate_polynomial().roots(QQbar, False)
....:       if r._value.overlaps(r1._value - r2._value)]
sage: assert len(rs) == 1
sage: r3, = rs
sage: %time r3.exactify()
CPU times: user 599 ms, sys: 0 ns, total: 599 ms
Wall time: 578 ms

At a common root of the two polynomials we have b = r2 and a + b = r1, i.e. a = r1 - r2, so eliminating b yields a (reducible, not necessarily minimal) polynomial in a which has the difference among its roots. I identify that root by checking which roots r of the factors f overlap the numeric interval of r1 - r2.
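The root-selection step described above can be sketched in plain Python. This is a minimal sketch with made-up interval endpoints standing in for the CIF/RIF enclosures; the point is only that interval overlap singles out one candidate once the enclosures are tight enough:

```python
def overlaps(i1, i2):
    """Whether two closed intervals (lo, hi) intersect."""
    return i1[0] <= i2[1] and i2[0] <= i1[1]

# Hypothetical enclosures: `target` is a numeric interval known to contain
# r1 - r2; `candidates` enclose the roots of the factors of the resultant.
target = (-0.8, -0.6)
candidates = [(-1.5, -1.3), (-0.75, -0.65), (0.4, 0.6)]

matching = [c for c in candidates if overlaps(c, target)]
assert len(matching) == 1  # otherwise, refine the enclosures and retry
```

If more than one candidate overlaps the target, the enclosures are not yet tight enough and need numeric refinement before a unique choice can be made.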

The way I understand the current code, most exact binary operations are implemented by exactifying both operands to number field elements, constructing the union of both number fields, converting both operands to that field, and performing the operation there. But there is no reason why the number field containing the result should also have to contain the operands. I guess dropping that requirement is the main reason why direct resultant computations are faster.

I propose that we try to build all binary operations on algebraic numbers on resultants instead of union fields. I furthermore propose that we try to build the equality comparison directly on resultants of two univariate polynomials, without bivariate intermediate steps.
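The univariate equality idea can be illustrated without Sage: Res(p, q) vanishes exactly when p and q share a common root. Below is a stdlib-Python sketch of the resultant as the determinant of the Sylvester matrix (the textbook construction, not Sage's implementation), using exact rational arithmetic:

```python
from fractions import Fraction

def sylvester_matrix(p, q):
    """Sylvester matrix of p and q (coefficient lists, highest degree first)."""
    m, n = len(p) - 1, len(q) - 1          # degrees of p and q
    rows = []
    for i in range(n):                     # n shifted copies of p
        rows.append([0] * i + p + [0] * (n - 1 - i))
    for i in range(m):                     # m shifted copies of q
        rows.append([0] * i + q + [0] * (m - 1 - i))
    return rows

def det(mat):
    """Exact determinant by Gaussian elimination over the rationals."""
    mat = [[Fraction(x) for x in row] for row in mat]
    n, sign = len(mat), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if mat[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            sign = -sign
        for r in range(col + 1, n):
            factor = mat[r][col] / mat[col][col]
            for c in range(col, n):
                mat[r][c] -= factor * mat[col][c]
    result = Fraction(sign)
    for i in range(n):
        result *= mat[i][i]
    return result

def resultant(p, q):
    """Res(p, q): zero iff p and q have a common root."""
    return det(sylvester_matrix(p, q))

# x^2 - 2 and x^3 - 2x share the roots +/- sqrt(2):
assert resultant([1, 0, -2], [1, 0, -2, 0]) == 0
# x^2 - 2 and x^2 - 3 share no root:
assert resultant([1, 0, -2], [1, 0, -3]) == 1
```

A vanishing resultant only says that *some* root is shared; as in the ticket's worked example, numeric enclosures would still be needed to decide whether the two particular roots in question coincide.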

I can think of two possible problems. One is that we might be dealing with a special case in the example above, and that perhaps number field unions are in general cheaper than resultants. Another possible problem I can imagine is that the resultant could factor into several distinct polynomials, some of which might share a root. If that were the case, numeric refinement wouldn't be able to help choosing the right factor. Should we perhaps not factor the resultant polynomial, but instead compute roots for the fully expanded form?

I'll try to come up with a branch which implements this approach.

CC: @mezzarobba @orlitzky

Component: number fields

Keywords: qqbar resultant exactify minpoly

Author: Martin von Gagern

Branch/Commit: u/gagern/ticket/17886 @ 12a1053

Issue created by migration from https://trac.sagemath.org/ticket/17886

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago

Branch: u/gagern/ticket/17886

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago

Commit: db11616

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago

Author: Martin von Gagern

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago
comment:2

OK, here is some work in progress, so you can see what I have in mind. This doesn't pass tests yet, so it will definitely require some more work.

The division r1/r2 for the example from the ticket description still takes extremely long. Surprisingly, the step which takes so long is not the computation of the resultant, its factors or its roots. No, it's the candidate.exactify() step which turns an ANRoot element into an ANExtensionElement. The do_polred in there is taking ages. Any suggestions for how that might be avoided? It's “only” a degree-80 polynomial we're dealing with.


New commits:

db11616  trac #17886: Compute binary operations using resultants.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from db11616 to 234b2c4

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

8d4b91b  Return a descriptor, not an algebraic number.
234b2c4  Choose roots field based on approximation field.
8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago
comment:4

OK, the doctests look a lot better now. The remaining differences are mostly arbitrary choices made differently, like sign changes or a different root being used as the reference generator. In several cases I obtain simpler results, i.e. polynomials of lower degree and the like.

One thing that has me worried is cyclotomics. If both arguments are from cyclotomic fields, then we should do the union (which is fast in that case) instead of the minpoly and resultant. I haven't figured out how best to check for that case, though.

Another test I fail is the one taken from the ARPREC paper. That example is really fast in the current implementation, precisely because it operates in a single number field, so it doesn't really require any unions at all. Should we try to detect this, i.e. see whether both arguments are either rational or elements of the same number field? My failure only occurs later on, where the original code somehow magically knows that the difference of two equal numbers is zero. I guess that if we did introduce a special case for equal number fields, we might get that for free, even though I don't know exactly how it works.
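The same-field special case could be a dispatch on the operands' parents. Here is a hypothetical pure-Python sketch (binary_op and generic are made-up names, not Sage API, and Fraction stands in for "element of a common exact field"); in particular, subtracting an element from itself yields an exact zero with no resultant computation at all:

```python
from fractions import Fraction

def binary_op(x, y, op, generic):
    """Hypothetical dispatch: if both operands already live in the same
    exact field (modelled here by both being Fractions), perform the
    operation directly; otherwise fall back to the generic
    (resultant-based) machinery."""
    if isinstance(x, Fraction) and isinstance(y, Fraction):
        return op(x, y)
    return generic(x, y, op)

# Same field: direct arithmetic, and x - x == 0 comes out for free.
x = Fraction(3, 7)
assert binary_op(x, x, lambda a, b: a - b, None) == 0
```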

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago
comment:5

Including total CPU time for r4.exactify() using existing implementation.

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago

Description changed:

---
+++
@@ -18,7 +18,8 @@

 sage: r4 = r1 - r2
 sage: %time r4.exactify()
-(still running, after more than half an hour)
+CPU times: user 2h 57min 1s, sys: 2.16 s, total: 2h 57min 3s
+Wall time: 2h 57min 5s

 On the other hand, computing a polynomial which has the difference as one root can be achieved fairly easily using resultants, and the resulting number is obtained in under one second:
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

12a1053  Name myself in list of authors
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 234b2c4 to 12a1053

videlec commented 9 years ago
comment:7

Replying to @gagern:

One thing that has me worried is cyclotomics. If both arguments are from cyclotomic fields, then we should do the union (which is fast in that case) instead of the minpoly and resultant. I haven't figured out how best to check for that case, though.

For cyclotomics, I really think that we should use the universal cyclotomic field (and enhance it):

sage: UCF = UniversalCyclotomicField()
sage: zeta3 = UCF.gen(3)
sage: zeta5 = UCF.gen(5)
sage: a = zeta3 + 2
sage: b = zeta5 + 1
sage: timeit("a*b")
625 loops, best of 3: 50.2 µs per loop
sage: a_QQbar = QQbar(a)
sage: b_QQbar = QQbar(b)
sage: timeit("a_QQbar*b_QQbar")
625 loops, best of 3: 13.2 µs per loop
sage: timeit("c = a_QQbar*b_QQbar; c.exactify()")
625 loops, best of 3: 774 µs per loop

Another test I fail is the one taken from the ARPREC paper. That example is really fast in the current implementation, precisely because it operates in a single number field, so it doesn't really require any unions at all. Should we try to detect this, i.e. see whether both arguments are either rational or elements of the same number field? My failure only occurs later on, where the original code somehow magically knows that the difference of two equal numbers is zero. I guess that if we did introduce a special case for equal number fields, we might get that for free, even though I don't know exactly how it works.

Definitely. If the two elements have the same parent (i.e. QQ, a number field or UCF) we should perform the operation directly. Moreover, I hope to have something fast for comparisons in a given number field #17830 that would also speed up some comparisons in that case.

Vincent

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago
comment:8

Replying to @gagern:

The division r1/r2 for the example from the ticket description still takes extremely long. Surprisingly, the step which takes so long is not the computation of the resultant, its factors or its roots. No, it's the candidate.exactify() step which turns an ANRoot element into an ANExtensionElement. The do_polred in there is taking ages. Any suggestions for how that might be avoided? It's “only” a degree-80 polynomial we're dealing with.

I compared this with Mathematica. Apart from the fact that I prefer Sage's way of presenting these numbers, I at first couldn't get Mathematica to do elementary arithmetic on these two numbers at all, but with a different approach I managed to. In Mathematica the division takes about twice as long as the other three operations, which makes sense considering that the resulting minimal polynomial has twice the degree. But that is only the difference between 0.1 and 0.2 seconds, so there should be no mathematically unavoidable reason for Sage to take as long as it does.

Should we try to avoid polred completely? I have the impression that this ticket here moves the focus away from “an algebraic number field extension and some polynomial in its generator” towards “a defining polynomial and an isolating interval”. The change is gradual, and we definitely want to keep the former aspect available if we want to simplify cases where all operations take place in the same number field. But since we no longer use unions for all operations, the nice and small description of the field generator appears to be less important. And the way I understand it, do_polred is responsible for finding such a nice and small generator. Should I try omitting that?
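The shift in representation being discussed could be pictured as two alternative descriptors. This is a toy sketch with made-up class names (qqbar.py's actual ANRoot/ANExtensionElement descriptors are far more elaborate), only to make the trade-off concrete:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class RootDescriptor:
    """'A defining polynomial and an isolating interval': cheap to build,
    since no polred / nice-generator search is needed."""
    coeffs: List[int]               # defining polynomial, highest degree first
    interval: Tuple[float, float]   # isolating interval for the chosen root

@dataclass
class ExtensionDescriptor:
    """'A number field and a polynomial in its generator': compact for
    repeated arithmetic within one field, but polred makes it expensive
    to construct in the first place."""
    generator_minpoly: List[int]    # minimal polynomial of the generator
    element_coeffs: List[int]       # the element, as a polynomial in it

# sqrt(2) as a root descriptor: x^2 - 2 plus an isolating interval.
r = RootDescriptor([1, 0, -2], (1.4, 1.5))
assert r.interval[0] < 2 ** 0.5 < r.interval[1]
```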

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 9 years ago
comment:9

Replying to @gagern:

The division r1/r2 for the example …

… can be used to exhibit several problems which are not directly related to arithmetic via resultants, so I filed separate tickets about these:

mezzarobba commented 9 years ago
comment:11

Regarding polred, see #15600.

videlec commented 8 years ago
comment:12

Note that #18356 proposed a better approach than plain resultants (via an algorithm of Bostan, Flajolet, Salvy and Schost). I guess we should make it a dependency of this ticket.

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 8 years ago
comment:13

Status update: #18356 has been merged by now (I hadn't noticed straight away), and #18242 comment:16 suggests incorporating these new functions into this ticket here. So I intend to incorporate those functions into my modifications as soon as I find the time. Also note #18333 for the big picture of things planned for QQbar.

orlitzky commented 3 years ago
comment:14

Well I guess I'm interested in this now that I'm trying to do something where addition/multiplication in AA is the bottleneck:

         27065425 function calls (27065055 primitive calls) in 53.210 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   53.216   53.216 {built-in method builtins.exec}
        1    0.000    0.000   53.215   53.215 <string>:1(<module>)
     19/1    0.000    0.000   53.215   53.215 unique_representation.py:992(__classcall__)
     19/1    0.000    0.000   53.215   53.215 {sage.misc.classcall_metaclass.typecall}
        1    0.000    0.000   53.215   53.215 eja_algebra.py:2568(__init__)
        1    0.010    0.010   52.715   52.715 eja_algebra.py:1572(__init__)
        2    0.038    0.019   52.699   26.350 eja_algebra.py:135(__init__)
  1923634    4.034    0.000   31.679    0.000 qqbar.py:5092(__init__)
  1923634    8.259    0.000   25.807    0.000 qqbar.py:3413(__init__)
   956610    3.154    0.000   23.851    0.000 qqbar.py:3626(_add_)
      240    2.260    0.009   23.787    0.099 eja_algebra.py:1779(jordan_product)
   963694    3.176    0.000   23.677    0.000 qqbar.py:3579(_mul_)
videlec commented 3 years ago
comment:15

I don't think you are. Addition and multiplication are O(1) in AA/QQbar (the elements are stored as expression trees). More efficient data structures could be used, but that is not what this ticket is about. Serious problems in AA/QQbar start when you try to compare elements; that is what this ticket is about.
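The O(1) claim can be made concrete with a toy sketch. This is not Sage's actual implementation (the descriptor classes in qqbar.py are more elaborate); it only illustrates that building a sum or product allocates a single tree node and defers all real work to the point where the tree is forced (exactify, comparison):

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class OpNode:
    """One node of the expression tree: an operator and two operands,
    which may themselves be trees.  Construction does no arithmetic."""
    op: str
    left: Any
    right: Any

def add(x, y):
    return OpNode('+', x, y)   # O(1): just allocate a node

def mul(x, y):
    return OpNode('*', x, y)   # O(1): just allocate a node

# Building (1 + 2) * 3 allocates two nodes and evaluates nothing.
t = mul(add(1, 2), 3)
assert t.op == '*' and t.left.op == '+'
```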

In your example you perform 1000000 binary operations. That is of course costly.

orlitzky commented 3 years ago
comment:16

Ok, I must have misunderstood =(

The same function over QQ completes almost instantly:

         123510 function calls (119803 primitive calls) in 0.444 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    402/1    0.001    0.000    0.451    0.451 {built-in method builtins.exec}
        1    0.000    0.000    0.451    0.451 <string>:1(<module>)
     92/1    0.000    0.000    0.451    0.451 unique_representation.py:992(__cl$
     92/1    0.000    0.000    0.451    0.451 {sage.misc.classcall_metaclass.ty$
        1    0.000    0.000    0.451    0.451 eja_algebra.py:2568(__init__)
       11    0.001    0.000    0.289    0.026 __init__.py:1(<module>)
        1    0.000    0.000    0.277    0.277 eja_algebra.py:1572(__init__)
        1    0.006    0.006    0.270    0.270 eja_algebra.py:135(__init__)
...

which is why I was optimistic that it could be improved over QQbar. (Maybe just by cythonizing it?) The underlying additions and multiplications are all from basic linear algebra. Products, inner products, coordinate computations (with respect to a basis), and so on for pairs of matrices.

videlec commented 3 years ago
comment:17

Sure, no tree is needed to represent the objects in QQ. Could you post a link to your code and the command you used for this profiling?

orlitzky commented 3 years ago
comment:18

This particular code is here:

http://gitweb.michael.orlitzky.com/?p=sage.d.git;a=tree;f=mjo/eja;hb=HEAD

With the "all" module imported, the two commands I profiled are

sage: %prun -s cumulative QuaternionHermitianEJA(3,field=AA,orthonormalize=False)
sage: %prun -s cumulative QuaternionHermitianEJA(3,field=QQ,orthonormalize=False)

with the only difference being the field parameter. Unfortunately, for user interface reasons, AA has to be the default.

mezzarobba commented 3 years ago
comment:19

@orlitzky: I didn't read the full discussion, but: if, in the application where you take elements of AA as input, you can dynamically determine a number field over which you can perform the whole computation, you should probably try doing that.

Otherwise, I suspect the best option in terms of value/effort ratio to speed up operations with algebraic numbers in Sage is now to create an alternative to the existing QQbar based on Calcium.

videlec commented 3 years ago
comment:20

Replying to @mezzarobba:

@orlitzky: I didn't read the full discussion, but: if, in the application where you take elements of AA as input, you can dynamically determine a number field over which you can perform the whole computation, you should probably try doing that.

+1. More precisely

sage: from sage.rings.qqbar import number_field_elements_from_algebraics
sage: l = [QQbar(2)**(1/2), QQbar(3)**(1/3) - QQbar(5)**(1/2)]
sage: number_field_elements_from_algebraics(l, minimal=True)
(Number Field in a with defining polynomial y^12 - 12*y^10 - 24*y^9 + 60*y^8 - 34*y^6 + 576*y^5 - 1164*y^4 - 1320*y^3 + 456*y^2 + 2448*y - 1151,
 [-16518995559/7799826643957*a^11 - 10435260375/7799826643957*a^10 + 187520914172/7799826643957*a^9 + 508862279484/7799826643957*a^8 - 1241854471899/15599653287914*a^7 - 218478193524/7799826643957*a^6 + 313750924974/7799826643957*a^5 - 9705313541079/7799826643957*a^4 + 13428594964320/7799826643957*a^3 + 28757542878738/7799826643957*a^2 + 16368959650563/15599653287914*a - 24370182357588/7799826643957,
  48080033585/15599653287914*a^11 + 92029987813/46798959863742*a^10 - 1654443694787/46798959863742*a^9 - 753537685652/7799826643957*a^8 + 3753942837309/31199306575828*a^7 + 1758351424153/23399479931871*a^6 - 496099642485/7799826643957*a^5 + 38328135347057/23399479931871*a^4 - 119433168311027/46798959863742*a^3 - 38969257820277/7799826643957*a^2 - 82694429544359/31199306575828*a + 156136708099345/23399479931871],
 Ring morphism:
   From: Number Field in a with defining polynomial y^12 - 12*y^10 - 24*y^9 + 60*y^8 - 34*y^6 + 576*y^5 - 1164*y^4 - 1320*y^3 + 456*y^2 + 2448*y - 1151
   To:   Algebraic Real Field
   Defn: a |--> 0.9193952626442228?)

Otherwise, I suspect the best option in terms of value/effort ratio to speed up operations with algebraic numbers in Sage is now to create an alternative to the existing QQbar based on Calcium.

It would be nice to see how that performs. The approach in Calcium is quite different.

orlitzky commented 3 years ago
comment:21

Thanks. I actually started out with that approach, since all I need to construct my example algebras is sqrt(2). I ran into a lot of problems converting between number fields, though, and you can't choose one a priori that will work. For example, after taking a spectral decomposition I would get back eigenvalues in a NumberField over a NumberField over a NumberField... over a NumberField, and those couldn't be multiplied by elements of the QQ(sqrt(2)) I started with. And theoretically it's annoying that eigenvalues should live in the base field, but they don't if you have to enlarge the base field to find them.

In short, I gave up on it in commit 98da0ce1d11020 and switched to AA "because everything cool requires it."

This isn't a life-threatening problem yet; it's just annoying that my test suite is so slow. I appreciate the suggestions, though.