sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 480 forks source link

use flint fmpq_mat for rational dense matrices #22970

Closed videlec closed 7 years ago

videlec commented 7 years ago

The flint library has a builtin type for rational matrices: fmpq_mat_t. This ticket proposes to replace the current array of mpq_t wrapped by Matrix_rational_dense in favor of fmpq_mat_t.

(See also the related #22924 and #22872)

CC: @ClementPernet @mmasdeu

Component: linear algebra

Keywords: thursdaysbdx

Author: Vincent Delecroix

Branch/Commit: cb0dae1

Reviewer: Marc Masdeu

Issue created by migration from https://trac.sagemath.org/ticket/22970

videlec commented 7 years ago

Branch: u/vdelecroix/22970

videlec commented 7 years ago

New commits:

f7d368f22970: more flint declarations
fce2da522970: pari / fmpq_mat_t conversions
b52e2c322970: implement libs/flint/randomize.pxd
d5ea72b22970: let rational dense matrix uses fmpq_mat_t
83e3a8722970: adapt codes using rational dense matrices
videlec commented 7 years ago

Commit: 83e3a87

videlec commented 7 years ago
comment:2

For a discussion about an annoying segfault related to this ticket, see this thread.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

3daa72222970: more flint declarations
26822a222970: nonzero versions of gmp randomize functions
769d86c22970: pari / fmpq_mat_t conversions
99cf14122970: let rational dense matrix uses fmpq_mat_t
a254a4322970: adapt codes using rational dense matrices
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 83e3a87 to a254a43

videlec commented 7 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,3 @@
-The flint library has a builtin type for rational matrices. This ticket proposes to replace the current array of `mpq_t` for the data to the `fmpq_mat_t` type of flint.
+The flint library has a builtin type for rational matrices: `fmpq_mat_t`. This ticket proposes to replace the current array of `mpq_t` wrapped by `Matrix_rational_dense` in favor of `fmpq_mat_t`.

 (See also the related #22924 and #22872)
videlec commented 7 years ago

Changed keywords from none to thursdaysbdx

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1621ac122970: use flint by default in det/inverse
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from a254a43 to 1621ac1

videlec commented 7 years ago

New commits:

3daa72222970: more flint declarations
26822a222970: nonzero versions of gmp randomize functions
769d86c22970: pari / fmpq_mat_t conversions
99cf14122970: let rational dense matrix uses fmpq_mat_t
a254a4322970: adapt codes using rational dense matrices
1621ac122970: use flint by default in det/inverse
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

867ed5629970: echelonize/echelon_form using flint
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 1621ac1 to 867ed56

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

d1c8d5229970: echelonize/echelon_form using flint
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 867ed56 to d1c8d52

mmasdeu commented 7 years ago
comment:11

Thanks for the work Vincent! I think that it's my duty to review this one :-).

While I update my develop branch and check the ticket out, could you check that that the docstring in the determinant method is correct? In algorithm, I see None missing.

Also, do you happen to have timings? that compare with the previous implementation? Otherwise I can produce some...

videlec commented 7 years ago
comment:12

Hi Marc,

Thanks for proposing the review! I need some more work on my branch as I have strange failing doctests in sage/modular that I don't understand right now.

I will provide proper timings for the following methods (tell me if you have more in mind):

videlec commented 7 years ago

Attachment: test_file.sage.gz

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

f01157c22970: more flint declarations
acce30c22970: nonzero versions of gmp randomize functions
be5ccb422970: pari / fmpq_mat_t conversions
9ba2c4d22970: let rational dense matrix uses fmpq_mat_t
150631d22970: adapt codes using rational dense matrices
a3b6b3722970: use flint by default in det/inverse
10f6d8c29970: echelonize/echelon_form using flint
aef538e22970: derandomize tests in free_module_integer.py
edfa6f422970: various cleaning in matrix_rational_dense
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from d1c8d52 to edfa6f4

videlec commented 7 years ago
comment:14

At commit edfa6f46b5 using attachment: test_file.sage I obtain (within different session because of caching)

sage: %runfile test_file.sage
sage: test(seed=0)
f = q + 1/2*a1*q^3 + (4*a1 - 26)*q^5 + O(q^6)
K = Number Field in a1 with defining polynomial x^2 + 32*x - 9984
a = 1/2*a1
True
False

and

sage: %runfile test_file.sage
sage: test(seed=1)
f = q + 1/2*a1*q^3 + (4*a1 - 26)*q^5 + O(q^6)
K = Number Field in a1 with defining polynomial x^2 + 32*x - 9984
a = 1/2*a1
False
True

And it looks like a bug to me that this result depends on the seed of the random generator...

videlec commented 7 years ago
comment:15

Replying to comment:14: Discussed at this sage-devel thread.

videlec commented 7 years ago
comment:16

Attachment: benchmark_matrices.py.gz

From attachment: benchmark_matrices.py

nrows=5  ncols=5  rank=5, density=1.0000 nbits(height)=100
flint        0.0565
multimodular 0.3284
padic        0.5626

nrows=10  ncols=10  rank=5, density=1.0000 nbits(height)=37
flint        0.1063
multimodular 2.6420
padic        7.6511

nrows=100  ncols=100  rank=20, density=0.1799 nbits(height)=12
flint        9.1834
multimodular 12.2192
padic        74.8152

nrows=10  ncols=10  rank=10, density=1.0000 nbits(height)=200
flint        10.1606
multimodular 1.5586
padic        2.1540
dim=3 density=0.7778 nbits(height)=3
flint        0.0092
integer      0.1334
pari         0.0017

dim=3 density=1.0000 nbits(height)=100
flint        0.0113
integer      0.1188
pari         0.0269

dim=30 density=1.0000 nbits(height)=5
flint        0.9006
integer      1.0754
pari         34.5452

dim=30 density=0.2544 nbits(height)=5
flint        0.4389
integer      0.9986
pari         179.9622
dim=3 density=0.7778 nbits(height)=2
flint        0.3875
linbox       1.3019
generic      0.0431

dim=4 density=0.5625 nbits(height)=2
flint        0.1776
linbox       1.2013
generic      0.0450

dim=5 density=0.6400 nbits(height)=2
flint        0.1737
linbox       1.2952
generic      0.0532

dim=6 density=0.6111 nbits(height)=2
flint        0.1850
linbox       1.2773
generic      0.0651

dim=10 density=0.4100 nbits(height)=2
flint        0.2343
linbox       1.4250
generic      0.1818

dim=10 density=1.0000 nbits(height)=2
flint        0.2521
linbox       1.4264
generic      0.4842

dim=30 density=1.0000 nbits(height)=2
flint        4.8531
linbox       3.4320
generic      50.0362
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

ba1787a22970: various cleaning in matrix_rational_dense
376146722970: fix doctests in sage/modular/
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from edfa6f4 to 3761467

videlec commented 7 years ago
comment:18

Needs review again. I tried to make independent of the random seed the annoying doctests in sage/modular...

videlec commented 7 years ago
comment:19

patchbot is green!! (with a weird ascii complaints)

mmasdeu commented 7 years ago
comment:20

On my computer, a call to

sage: CuspForms(101,12).hecke_matrix(3)

takes 765 seconds with the current develop branch, and 1157 seconds with the proposed patch. I think that most of it is explained by the fact that the method echelon_form is called 1394 times on quite large matrices (dense, over QQ). I was actually looking for speed-ups in existing high-level functionality (like this one), so this is a bit disappointing.

Maybe the default algorithm should be changed if the entries are large?

videlec commented 7 years ago
comment:21

Replying to @mmasdeu:

On my computer, a call to

sage: CuspForms(101,12).hecke_matrix(3)

takes 765 seconds with the current develop branch, and 1157 seconds with the proposed patch. I think that most of it is explained by the fact that the method echelon_form is called 1394 times on quite large matrices (dense, over QQ). I was actually looking for speed-ups in existing high-level functionality (like this one), so this is a bit disappointing.

Maybe the default algorithm should be changed if the entries are large?

I did not do proper tunings (since I will redo it after #22924 anyway). However, I can try to do something better for echelon_form/echelonize. Do you have other relevant examples in mind?

mmasdeu commented 7 years ago
comment:22

Not at the moment. I haven't done exhaustive tests either, but it seems that in many cases the performance is similar.

As for other examples, many operations with subquotients involve doing this kind of linear algebra.

videlec commented 7 years ago
comment:23

Some remarks

One way to go might be to have a global option that controls the default behavior of the echelon form. I am thinking of something that can be used like

def my_modular_form_algorithm():
    with rational_echelon_form('multimodular'):
        # some relevant code here
        ....

What do you think?

videlec commented 7 years ago
comment:24

Second option, allow the option algorithm=my_function where my_function takes a matrix as input and returns a valid algorithm for echelon form. (I do prefer this one better)

EDIT: this will not work since echelon_form is indirectly called in modular function algorithms...

videlec commented 7 years ago
comment:25

Implementing the idea from comment:23 I got

sage: %time m = CuspForms(31, 12).hecke_matrix(3)
CPU times: user 7.52 s, sys: 63.3 ms, total: 7.59 s
Wall time: 7.55 s

sage: %time c = CuspForms(51, 12).hecke_matrix(3)
CPU times: user 2min 13s, sys: 203 ms, total: 2min 13s
Wall time: 2min 13s

sage: %time c = CuspForms(101, 12).hecke_matrix(3)
CPU times: user 5min 10s, sys: 663 ms, total: 5min 10s
Wall time: 5min 10s

against the following in develop

sage: %time m = CuspForms(31, 12).hecke_matrix(3)
CPU times: user 13.5 s, sys: 43.3 ms, total: 13.6 s
Wall time: 13.5 s

sage: %time c = CuspForms(51, 12).hecke_matrix(3)
CPU times: user 3min 8s, sys: 250 ms, total: 3min 9s
Wall time: 3min 9s

%time c = CuspForms(101, 12).hecke_matrix(3)
CPU times: user 8min 7s, sys: 527 ms, total: 8min 8s
Wall time: 8min 8s

(so close to x2 improvement)

I propose to postpone this improvement to another ticket since this one is already a big change. If you agree I will open a new one and work on it as soon as this one is closed.

mmasdeu commented 7 years ago
comment:26

1) I agree that this can be dealt with in another ticket, but it is moderately urgent. ModularSymbols / ModularForms becoming twice slower than currently is not good news...

Another example of a reasonable calculation: with the Sage 7.5.1 the following calculation takes 22 seconds on my laptop. It takes 1min22s (!) with the proposed patch.

sage: %time A = ModularSymbols(11,60).hecke_matrix(3)

2) While reviewing, in sage/modular/hecke/module.py there is a doctest that changed in a suspicious way. It seems that a number field was created with a different generator and that causes the output to look different. I propose that the line that causes trouble gets changed into

sage: [o.minpoly() for o in A.system_of_eigenvalues(3)]
[x - 1, x + 1, x^2 - 2*x - 2]

The output that appears should not depend on the implementation.

3) There seems to be quite a mess on what are the acceptable values to the 'algorithm' parameter. Maybe it's a good time to change that! My suggestion would be that the default would always be None, and then the docstring would specify what is done in this case.

fredrik-johansson commented 7 years ago
comment:27

I might have an explanation for why Flint is slow for these modular symbol matrices. See:

https://github.com/wbhart/flint2/issues/335

It can be fixed in Flint (if my analysis is correct), but this requires some amount of coding.

videlec commented 7 years ago
comment:28

Replying to @fredrik-johansson:

I might have an explanation for why Flint is slow for these modular symbol matrices. See:

https://github.com/wbhart/flint2/issues/335

It can be fixed in Flint (if my analysis is correct), but this requires some amount of coding.

Would be tremendous!

videlec commented 7 years ago
comment:29

@mmasdeu: for the sake of this ticket I will set the default to multimodular that should actually speed up all modular symbol computations. Then #23026 will deal with making flint the default (that is much faster on random instances).

mmasdeu commented 7 years ago
comment:30

That sounds reasonable...at least there is no regression in speed (hopefully!).

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

948cab122970: better doctest in modular/hecke/module.py
b5f560022970: improved doc + simpler echelonize/echelon_form logic
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 3761467 to b5f5600

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

14d31c522970: better doctest in modular/hecke/module.py
0c8b2ce22970: improved doc + simpler echelonize/echelon_form logic
c17206422970: _new_matrix/from_columns/add_to_entry
5f2e1ba22970: small optimization in modsym/ambient.py
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from b5f5600 to 5f2e1ba

videlec commented 7 years ago
comment:33

at commit b5f5600 I got

sage: %time m = CuspForms(31, 12).hecke_matrix(3)
CPU times: user 11.7 s, sys: 40 ms, total: 11.7 s
Wall time: 11.7 s

sage: %time c = CuspForms(51, 12).hecke_matrix(3)
CPU times: user 2min 52s, sys: 283 ms, total: 2min 52s
Wall time: 2min 52s

sage: %time c = CuspForms(101, 12).hecke_matrix(3)
CPU times: user 7min 17s, sys: 620 ms, total: 7min 18s
Wall time: 7min 18s

and at 5f2e1ba the even better

sage: %time m = CuspForms(31, 12).hecke_matrix(3)
CPU times: user 7.5 s, sys: 36.7 ms, total: 7.53 s
Wall time: 7.51 s

sage: %time c = CuspForms(51, 12).hecke_matrix(3)
CPU times: user 2min 10s, sys: 223 ms, total: 2min 10s
Wall time: 2min 10s

sage: %time c = CuspForms(101, 12).hecke_matrix(3)
CPU times: user 5min 9s, sys: 603 ms, total: 5min 10s
Wall time: 5min 10s
videlec commented 7 years ago
comment:35

And tests also pass on beta7!

mmasdeu commented 7 years ago
comment:36

This is definitely more than satisfying. The code looks very clean, and all tests pass. Thanks for the great work!

mmasdeu commented 7 years ago

Reviewer: Marc Masdeu

vbraun commented 7 years ago
comment:37

Merge conflict

videlec commented 7 years ago
comment:38

Too bad. The code is fine on beta7. Can I try to fix it by merging another ticket?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

86a305b22970: adapt codes using rational dense matrices
b5320ef22970: use flint by default in det/inverse
fad1b0d29970: echelonize/echelon_form using flint
b68993f22970: derandomize tests in free_module_integer.py
91c583522970: various cleaning in matrix_rational_dense
9f8391222970: fix doctests in sage/modular/
bcf6b7d22970: better doctest in modular/hecke/module.py
14f3fdb22970: improved doc + simpler echelonize/echelon_form logic
fceef5022970: _new_matrix/from_columns/add_to_entry
cb0dae122970: small optimization in modsym/ambient.py
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 5f2e1ba to cb0dae1