sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.34k stars 453 forks source link

Faster echelon form code for matrix_modn_sparse #10733

Open tornaria opened 13 years ago

tornaria commented 13 years ago

In the method _echelon_in_place_classical of matrix_modn_sparse there's this comment:

TODO: Implement switching to a dense method when the matrix gets
      dense.

That's what we will do in this ticket, by switching to linbox echelon form code when rows start to get non-dense.

Depends on #10734

Component: linear algebra

Author: Gonzalo Tornaría, William Stein

Reviewer: William Stein, Gonzalo Tornaría

Issue created by migration from https://trac.sagemath.org/ticket/10733

tornaria commented 13 years ago
comment:1

NOTE: The patches in this ticket require #10734 to be applied first.

williamstein commented 13 years ago
comment:3

There are no doctests to illustrate the new functionality added by this patch.

tornaria commented 13 years ago
comment:4

The patch doesn't add any new functionality.

The function didn't have a doctest before.

I've attached a patch (to be applied in top of the other two) which adds a test for the sparse echelon form. The test takes about 20 seconds, so it's labeled as long.

What the test does is to construct a 1000x1000 sparse matrix with density=1%, and compute the echelon form using different values for "switch_density", and also using the dense matrix echelon form method. Of course everything should give the same echelon form. I'm not sure if it makes sense to test with smaller matrices.

In any case, feel free to add more tests or modify the current one in any way you see fits.

tornaria commented 13 years ago
comment:5

I replaced the patch by a new version. In the new version, I copied the test twice, so it's always run for 100x100 matrices (5% density) and optionally (long) run for 1000x1000 matrices (1% density).

The test for 100x100 matrices takes about 0.1 s, so it's ok. I think it's still a useful test since it runs the algorithm switching to dense at several different columns.

The test for 1000x1000 takes about 25 s (I changed the modulus from 97 to 997) and it's still optional (long). Is that too long for a "long" test?

tornaria commented 13 years ago
comment:6

PS: 500x500 with 1% density takes about 5s, so we may want to replace the 1000x1000 test by a 500x500 test (and still make it long time optional).

williamstein commented 13 years ago

Description changed:

--- 
+++ 
@@ -5,3 +5,5 @@
       dense.

That's what we will do in this ticket, by switching to linbox echelon form code when rows start to get non-dense. + +DEPENDS ON: #10734

williamstein commented 13 years ago
comment:8

REVIEW:

Annoyingly, there are issues when the modulus is 2, e.g.:

sage -t  sage/coding/linear_code.py
**********************************************************************
File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/devel/sage-main/sage/coding/linear_code.py", line 1523:
    sage: C.is_permutation_automorphism(g)
Exception raised:
    Traceback (most recent call last):
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_31[10]>", line 1, in <module>
        C.is_permutation_automorphism(g)###line 1523:
    sage: C.is_permutation_automorphism(g)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/lib/python/site-packages/sage/coding/linear_code.py", line 1532, in is_permutation_automorphism
        HGm = H*g.matrix()
      File "element.pyx", line 2282, in sage.structure.element.Matrix.__mul__ (sage/structure/element.c:15874)
      File "coerce.pyx", line 709, in sage.structure.coerce.CoercionModel_cache_maps.bin_op (sage/structure/coerce.c:6368)
      File "action.pyx", line 144, in sage.matrix.action.MatrixMatrixAction._call_ (sage/matrix/action.c:2818)
      File "matrix_modn_sparse.pyx", line 683, in sage.matrix.matrix_modn_sparse.Matrix_modn_sparse.dense_matrix (sage/matrix/matrix_modn_sparse.c:11767)
    TypeError: Cannot convert sage.matrix.matrix_mod2_dense.Matrix_mod2_dense to sage.matrix.matrix_modn_dense.Matrix_modn_dense
**********************************************************************

and

sage -t  sage/homology/cell_complex.py
**********************************************************************
File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/devel/sage-main/sage/homology/cell_complex.py", line 456:
    sage: P.homology(base_ring=GF(2))
Exception raised:
    Traceback (most recent call last):
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_15[4]>", line 1, in <module>
        P.homology(base_ring=GF(Integer(2)))###line 456:
    sage: P.homology(base_ring=GF(2))
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/lib/python/site-packages/sage/homology/cell_complex.py", line 552, in homology
        dimensions=dims, **kwds)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/lib/python/site-packages/sage/homology/delta_complex.py", line 662, in chain_complex
        return ChainComplex(data=differentials, degree=-1, **kwds)
      File "/tmp/wstein/sage-4.7.alpha3-sage.math.washington.edu-x86_64-Linux/local/lib/python/site-packages/sage/homology/chain_complex.py", line 462, in __init__
        raise TypeError, "The differentials d_{%s} and d_{%s} are not compatible: their product is not defined." % (n, n+degree)
    TypeError: The differentials d_{1} and d_{0} are not compatible: their product is not defined.

The result of running the whole test suite:

Here is another style comment. In Cython now doing

for c from 0 <= c < self._ncols:

is identical to doing:

for c in range(self._ncols):

Otherwise, everything looks great, and this is going to be extremeley useful for my research!!

williamstein commented 13 years ago
comment:10

Oh, here are the files with failures:


----------------------------------------------------------------------

The following tests failed:

    sage -t  devel/sage-main/sage/modular/modsym/ambient.py # 4 doctests failed
    sage -t  devel/sage-main/sage/matrix/matrix_space.py # 1 doctests failed
    sage -t  devel/sage-main/sage/coding/linear_code.py # 2 doctests failed
    sage -t  devel/sage-main/sage/rings/polynomial/pbori.pyx # 1 doctests failed
    sage -t  devel/sage-main/sage/homology/cell_complex.py # 2 doctests failed
    sage -t  devel/sage-main/sage/homology/examples.py # 2 doctests failed
    sage -t  devel/sage-main/sage/homology/delta_complex.py # 3 doctests failed
----------------------------------------------------------------------
Timings have been updated.
Total time for all tests: 328.6 seconds

It's all p=2 weirdness, I suspect.

williamstein commented 13 years ago
comment:11

Comment: I haven't run the full test suite yet -- will report back later...

williamstein commented 13 years ago
comment:12

Test suite run and it fully passes for me. Gonzalo (say) needs to review patch 4 and then this can get a positive review.

tornaria commented 13 years ago
comment:13

I reviewed the new patch and ran the test suite, which passes for me as well.

The problematic method was Matrix_modn_sparse.set_block_unsafe(..., Matrix_modn_dense m) which is meant to set a block of a sparse matrix given by a dense matrix. But for mod 2, the dense matrices are implemented by a different class (Matrix_mod2_dense) so calls to set_block_unsafe will fail. What the reviewer patch does is to factor out a common superclass for both Matrix_mod_dense which can be used in the implementation of set_block_unsafe.

I noticed that computing echelon form for a sparse mod2 matrix is much slower than converting the matrix to dense and using echelon form (which uses M4RI and is quite fast). Thus, it could make sense to reimplement sparse mod2 echelon form via conversion to dense. However, note that conversion from mod2 dense to mod2 sparse is quite slow currently.

In spite of the above comment, I'm giving positive review to patch 4 based on:

tornaria commented 13 years ago
comment:14

For the record, this is what I think could be used for mod2 sparse via dense:

   dense = self.dense_matrix()
   dense.echelonize()
   self.set_block_unsafe(0, 0, dense)

The last step does the conversion from dense to sparse, and it's way faster than calling dense.matrix_sparse(), but it's still slow.

tornaria commented 13 years ago
comment:15

I added two more patches to the sequence:

The latter basically implements what I suggested in my last comment as part of the _echelon_in_place_classical() method. As is now, the default for "switch_density" is always 0.1 so this code would not be normally used. But if one explicitly uses "switch_density=0", then the method will run faster after this patch. And for the case of mod 2, it seems that it could indeed be faster to use switch_density=0 instead of the default 0.1.

I guess these two patches need to go back to review, although trac doesn't give me the option of changing the status to needs_review...

tornaria commented 13 years ago
comment:16

Note that the "switch_density=0" case is already tested by the doctests so there's no need to add another doctest after patch 06.

williamstein commented 13 years ago
comment:17

Positive review (on the two extra little patches added, hence everything)!

jdemeyer commented 13 years ago
comment:19

Moving the milestone since this depends on a non-positively-reviewed ticket.

jdemeyer commented 13 years ago

Dependencies: #10734

jdemeyer commented 13 years ago

Description changed:

--- 
+++ 
@@ -5,5 +5,3 @@
       dense.

That's what we will do in this ticket, by switching to linbox echelon form code when rows start to get non-dense.

-DEPENDS ON: #10734

tornaria commented 13 years ago

Attachment: trac_10733.01-echelon_better_verbose.gz

Improve verbose logging of echelon_in_place_classical

tornaria commented 13 years ago

Make matrix modn sparse echelon form way faster by switching to dense linbox at a good time when the matrix gets too dense

tornaria commented 13 years ago

Attachment: trac_10733.02-echelon_make_faster.gz

Attachment: trac_10733.03-documentation.gz

Add a test for matrix_modn_sparse._echelon_in_place_classical

tornaria commented 13 years ago

Attachment: trac_10733.04-referee_refactor_to_support_p2.2.patch.gz

refactor code to support p=2 case (part 2)

tornaria commented 13 years ago

Attachment: trac_10733.05-doctest_mod2.patch.gz

Add testing for echelonize of mod2 sparse matrices

tornaria commented 13 years ago

Attachment: trac_10733.06-fastpath_for_strictly_dense_method.patch.gz

Implement a fast path for strictly using a dense method (could be useful for mod2)

tornaria commented 13 years ago
comment:22

I've reuploaded all the patches with the right format (see #10734).

Also, I moved part of the referee patch to #10734, otherwise some doctests would fail if that ticket is merged without the current one.

There's no new code at all, so I assume the positive review is still valid.


OTOH, I screwed up filenames because (a) I made a mistake uploading one patch (b) I don't have permissions to delete incorrectly uploaded patches (c) I don't have permissions to replace the referee patch.

Whoever has the rights, what needs to be done is: a. delete attachment trac_10733.2.03-documentation (uploaded by mistake) b. delete the original referee patch trac_10733.04-referee_refactor_to_support_p2.patch, and replace it by the new trac_10733.04-referee_refactor_to_support_p2.2.patch

After those two changes, the six patches are applied one after other in numerical order.

tornaria commented 13 years ago
comment:23

apply trac_10733.01-echelon_better_verbose, trac_10733.02-echelon_make_faster, trac_10733.03-documentation, trac_10733.04-referee_refactor_to_support_p2.2.patch, trac_10733.05-doctest_mod2.patch, trac_10733.06-fastpath_for_strictly_dense_method.patch

jdemeyer commented 12 years ago
comment:24

Please fill in your real name as Reviewer.

vbraun commented 10 years ago

Changed author from Gonzalo Tornaria to Gonzalo Tornaría, William Stein

vbraun commented 10 years ago

Reviewer: William Stein, Gonzalo Tornaría

vbraun commented 9 years ago
comment:26

I'm setting this to needs works since its not a git branch.