sagemath / sage

Main repository of SageMath
https://www.sagemath.org

More competitive and comprehensive finite dimensional algebras #23707

Closed simon-king-jena closed 7 years ago

simon-king-jena commented 7 years ago

Issues

Multiplication of elements is currently done (in Python) via

self.__class__(self.parent(), self._vector * other._matrix)

which for some matrix implementations is not the fastest way:

sage: v = random_vector(GF(2),2000)
sage: M = random_matrix(GF(2),2000,2000)
sage: %timeit v*M
The slowest run took 8.27 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 4.96 ms per loop
sage: %timeit M*v
1000 loops, best of 3: 172 µs per loop
sage: v = random_vector(GF(3),2000)
sage: M = random_matrix(GF(3),2000,2000)
sage: %timeit v*M
The slowest run took 37.59 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.11 ms per loop
sage: %timeit M*v
1000 loops, best of 3: 974 µs per loop
sage: v = random_vector(QQ,2000)
sage: M = random_matrix(QQ,2000,2000)
sage: %timeit v*M
1 loop, best of 3: 1.51 s per loop
sage: %timeit M*v
1 loop, best of 3: 3.32 s per loop
sage: v = random_vector(GF(9,'x'),2000)
sage: M = random_matrix(GF(9,'x'),2000,2000)
sage: %timeit v*M
1 loop, best of 3: 163 ms per loop
sage: %timeit M*v
1 loop, best of 3: 166 ms per loop

(Remark: The timings for GF(9,'x') are with MeatAxe)

sage: M = random_matrix(GF(9,'x'),2000,2000)
sage: v = random_matrix(GF(9,'x'),2000,1)
sage: %timeit M*v
10 loops, best of 3: 52.4 ms per loop
sage: v1 = v.transpose()
sage: %timeit v1*M
The slowest run took 37.36 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 14.6 µs per loop

The last example is 163 ms vs. 14.6 µs!!!

Suggestion

  1. Re-implement it in Cython
  2. Replace vector*matrix by either row*matrix or matrix*column, being flexible enough that both implementations are supported (because it depends on the field whether row*matrix or matrix*column is fastest); a sketch of this idea follows the list.
  3. Allow using a quiver with more than one vertex.
  4. Be more lazy, i.e. do costly computations such as the construction of the multiplication matrix only when needed.
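
To illustrate point 2, here is a hedged sketch (not code from the branch; the names v_row, M and w_row are made up for the example): the idea is to store an element's defining data as a 1 x n matrix, so that products stay inside the specialized dense matrix implementations.

# Hypothetical illustration of suggestion 2, in a Sage session.
K = GF(9, 'x'); n = 2000
v_row = random_matrix(K, 1, n)   # the element's defining data as a single-row matrix
M = random_matrix(K, n, n)       # e.g. the multiplication matrix of another element
# "single-row matrix" * "square matrix" stays inside the dense matrix
# implementation; the timings above show 14.6 µs versus 163 ms for the
# generic vector * matrix path in this case.
w_row = v_row * M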

CC: @tscrim @fchapoton @videlec

Component: algebra

Keywords: finite dimensional, algebra, quiver

Author: Simon King

Branch/Commit: 3cc3a27

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/23707

simon-king-jena commented 7 years ago

Description changed:

--- 
+++ 
@@ -55,8 +55,11 @@

 - In my applications, I need finite dimensional path algebra quotients. The current implementation corresponds to a finite dimensional path algebra quotient where the underlying quiver has a single vertex.

+- Creation of an element also involves construction of its multiplication matrix. From my experience, it can be faster (in some applications) to not construct the matrix of it right away, but be lazy. E.g., when one computes Gröbner bases, only multiplication by power products of the algebra generators are needed, and then it is faster to map the defining vector of the element repeatedly by multiplying with the matrices of the generators, instead of doing matrix by matrix multiplications.
+
 __Suggestion__

 1. Re-implement it in Cython
 2. Replace vector*matrix by either row*matrix or matrix*column, being flexible enough that *both* implementations are supported (because it depends on the field whether row*matrix or matrix*column is fastest).
 3. Allow using a quiver with more than one vertices.
+4. Be more lazy, i.e. do costly computations such as the construction of the multiplication matrix only when needed.
jdemeyer commented 7 years ago
comment:2

FYI: PARI/GP has recently gained functionality to deal with general finite dimensional algebras. But it seems that this ticket is mostly about basic arithmetic, so I'm not sure that it's relevant.

simon-king-jena commented 7 years ago
comment:3

Replying to @jdemeyer:

FYI: PARI/GP has recently gained functionality to deal with general finite dimensional algebras. But it seems that this ticket is mostly about basic arithmetic, so I'm not sure that it's relevant.

This ticket is not necessarily about basic arithmetic. Eventually, I finally want to implement my F5 algorithm for modules over path algebra quotients (after being distracted by other things for several years), starting with finite dimensional quotients.

So, having an implementation of finite dimensional algebras over any field (but for me most importantly over finite not necessarily prime fields!) that is really competitive would be nice.

Can you give me a pointer for the PARI/GP implementation? Would it be difficult to wrap in cython?

jdemeyer commented 7 years ago
comment:4

Replying to @simon-king-jena:

So, having an implementation of finite dimensional algebras over any field (but for me most importantly over finite not necessarily prime fields!) that is really competitive would be nice.

An algebra over Fq is just a special case of an algebra over Fp. PARI/GP supports algebras over Q and over Fp.

Can you give me a pointer for the PARI/GP implementation?

See https://pari.math.u-bordeaux.fr/dochtml/html-stable/ section "Associative and central simple algebras".

Would it be difficult to wrap in cython?

I don't think so. Depending on your needs, you could also use cypari2.

simon-king-jena commented 7 years ago
comment:5

Replying to @jdemeyer:

Replying to @simon-king-jena:

So, having an implementation of finite dimensional algebras over any field (but for me most importantly over finite not necessarily prime fields!) that is really competitive would be nice.

An algebra over Fq is just a special case of an algebra over Fp.

I don't buy that. By that argument, a matrix over Fq should be the same as a matrix over Fp, but the current default implementations in Sage differ totally in performance.

Anyway, if you wanted to say that PARI/GP is good in both cases then it's fine.

See https://pari.math.u-bordeaux.fr/dochtml/html-stable/ section "Associative and central simple algebras".

Thank you!

Would it be difficult to wrap in cython?

I don't think so. Depending on your needs, you could also use cypari2.

What's that?

jdemeyer commented 7 years ago
comment:6

cypari2 is a new package with what used to be sage.libs.pari. It is a Cython interface to PARI. It wraps PARI objects in Python objects. Using this is slower than directly calling the PARI library. On the other hand, if your computations spend most of the time in the PARI library, it wouldn't matter for performance.
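
A minimal usage sketch (assuming the cypari2 package is installed; the number field below is just an example):

import cypari2
pari = cypari2.Pari()     # a fresh PARI instance, independent of Sage's global one
nf = pari("nfinit(t)")    # GP code in a string is evaluated by PARI
print(nf.type())          # 't_VEC'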

simon-king-jena commented 7 years ago
comment:7

Replying to @jdemeyer:

cypari2 is a new package with what used to be sage.libs.pari. It is a Cython interface to PARI. It wraps PARI objects in Python objects. Using this is slower than directly calling the PARI library. On the other hand, if your computations spend most of the time in the PARI library, it wouldn't matter for performance.

Thanks. Are algebras wrapped? I found cypari2 by googling, but a quick look seems to indicate that the conversions are about numbers, not algebras.

simon-king-jena commented 7 years ago
comment:8

How can I do the following pari code in Sage?

  { m_i = [0,-1,0, 0;
           1, 0,0, 0;
           0, 0,0,-1;
           0, 0,1, 0];
    m_j = [0, 0,-1,0;
           0, 0, 0,1;
           1, 0, 0,0;
           0,-1, 0,0];
    m_k = [0, 0, 0, 0;
           0, 0,-1, 0;
           0, 1, 0, 0;
           1, 0, 0,-1];
    A = alginit(nfinit(y), [matid(4), m_i,m_j,m_k],  0); }

Defining the matrices works:

sage: Mi = pari("""
....: [0,-1,0, 0;
....:            1, 0,0, 0;
....:            0, 0,0,-1;
....:            0, 0,1, 0]""")
sage: Mj = pari("""
....: [0, 0,-1,0;
....:            0, 0, 0,1;
....:            1, 0, 0,0;
....:            0,-1, 0,0];""")
sage: Mk = pari("""
....: [0, 0, 0, 0;
....:            0, 0,-1, 0;
....:            0, 1, 0, 0;
....:            1, 0, 0,-1];""")
sage: Mj
[0, 0, -1, 0; 0, 0, 0, 1; 1, 0, 0, 0; 0, -1, 0, 0]
sage: Mi*Mk
[0, 0, 1, 0; 0, 0, 0, 0; -1, 0, 0, 1; 0, 1, 0, 0]

However, my attempts to define the algebra failed:

sage: pari("nfinit(y)").alginit([pari("matid(4)"), Mi,Mj,Mk], 0)
...
PariError: incorrect priority in nffactor: variable 0 >= y
sage: pari.alginit(pari("nfinit(y)"), [pari("matid(4)"), Mi,Mj,Mk], 0)
...
AttributeError: 'cypari2.pari_instance.Pari' object has no attribute 'alginit'
jdemeyer commented 7 years ago
comment:9

Replying to @simon-king-jena:

Are algebras wrapped?

Everything is wrapped. That's the nice thing about cypari2: it automagically wraps all GP functions (assuming that the argument types are understood; PARI functions taking functions as input are not supported for example).

It's probably not so relevant, but it's best to use #23518.

a quick look seems to indicate that the conversions are about numbers, not algebras.

Well yes, it's a Python package so it only supports a limited set of conversions for standard Python types. Sage does support more conversions though. In any case, don't be blinded by the conversions, that's really just a detail. If PARI supports your algorithms, surely the conversion between Sage and PARI should be possible to implement.

jdemeyer commented 7 years ago
comment:10

Use a different variable name :-)

sage: pari("nfinit(t)").alginit(["matid(4)", Mi, Mj, Mk], 0)

Note that you don't need pari() in the arguments of the method: the inputs are automatically converted to PARI. This means that strings are evaluated.

With #23518, you should be able to write this as

sage: pari.alginit("nfinit(t)", ["matid(4)", Mi, Mj, Mk], 0)

or

sage: pari.alginit(pari.nfinit("t"), [pari.matid(4), Mi, Mj, Mk], 0)
simon-king-jena commented 7 years ago
comment:11

Replying to @jdemeyer:

Use a different variable name :-)

sage: pari("nfinit(t)").alginit(["matid(4)", Mi, Mj, Mk], 0)

Why is that? pari("nfinit(y)") by itself does work.

simon-king-jena commented 7 years ago
comment:12

What's wrong with this?

sage: M1 = random_matrix(GF(2),1024,1024)
sage: M2 = random_matrix(GF(2),1024,1024)
sage: A = pari(GF(2)).alginit(["matid(1024)", M1, M2],0)
...
PariError: incorrect type in alginit (t_POL)

So, does the first argument of alginit really need to be a number field, not a finite field?

simon-king-jena commented 7 years ago
comment:13

Reading that over finite fields one uses algtableinit, I tried

sage: MT = pari(["matid(1024)", M1,M2])
sage: A = MT.algtableinit(2)
...
PariError: incorrect type in algtableinit (t_VEC)

which didn't work either. Why?

simon-king-jena commented 7 years ago
comment:14

The following doesn't look promising:

sage: pM1 = pari(M1)
sage: pM2 = pari(M2)
sage: %timeit M1*M2
1000 loops, best of 3: 1.08 ms per loop
sage: %timeit pM1*pM2
1 loop, best of 3: 201 ms per loop
jdemeyer commented 7 years ago
comment:15

Replying to @simon-king-jena:

Reading that over finite fields one uses algtableinit, I tried

sage: MT = pari(["matid(1024)", M1,M2])
sage: A = MT.algtableinit(2)
...
PariError: incorrect type in algtableinit (t_VEC)

which didn't work either. Why?

Don't you need a list of 1024 matrices in dimension 1024?

simon-king-jena commented 7 years ago
comment:16

Replying to @jdemeyer:

Don't you need a list of 1024 matrices in dimension 1024?

Ouch, yes! Apparently I thought of constructing the algebra that is generated by these matrices (which I guess has dimension much larger than 1024).

simon-king-jena commented 7 years ago
comment:17

Replying to @jdemeyer:

Replying to @simon-king-jena:

Reading that over finite fields one uses algtableinit, I tried

sage: MT = pari(["matid(1024)", M1,M2])
sage: A = MT.algtableinit(2)
...
PariError: incorrect type in algtableinit (t_VEC)

which didn't work either. Why?

Don't you need a list of 1024 matrices in dimension 1024?

But in any case, Pari complains about type (t_VEC), not about value. So, the number of matrices isn't the only problem.

jdemeyer commented 7 years ago
comment:18

Replying to @simon-king-jena:

But in any case, Pari complains about type (t_VEC)

Don't take the PARI error messages too literally...

simon-king-jena commented 7 years ago

Branch: u/SimonKing/more_competitive_and_comprehensive_finite_dimensional_algebras

simon-king-jena commented 7 years ago

Commit: a34ee26

simon-king-jena commented 7 years ago
comment:20

I think I shouldn't be overambitious in this ticket: The finite-dimensional path algebra quotient stuff should be for a later ticket, where I will probably sub-class the now cythonised finite dimensional algebra elements.

Changes:

Caveat: It seems to me that OVER QQ multiplying a square matrix with a single-column matrix is faster than multiplying a single-row matrix with a square matrix. But that's the only case where I found that behaviour. Since I am not interested in the rational case anyway, I don't plan to provide a special implementation over QQ where elements are encoded as single-column matrices rather than single-row matrices.

PS: For me, tests pass...


New commits:

a34ee26: Faster implementation of finite dimensional algebras
simon-king-jena commented 7 years ago
comment:21

Back to pari:

sage: A = FiniteDimensionalAlgebra(GF(3), [Matrix([[1, 0], [0, 1]]), Matrix([[0, 1], [0, 0]])])
sage: pari([x.matrix() for x in A.gens()]).algtableinit()
Traceback (most recent call last):
...
PariError: incorrect type in algtableinit (t_VEC)

What's wrong?

simon-king-jena commented 7 years ago
comment:22

Replying to @simon-king-jena:

What's wrong?

I don't know why, but apparently Pari doesn't accept some associative algebras that are perfectly fine in Sage, while it accepts others:

sage: B = FiniteDimensionalAlgebra(QQ, [Matrix([[1,0,0], [0,1,0], [0,0,1]]), Matrix([[0,0,0], [1,0,1], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [1,0,1]])])
sage: pari([x.matrix() for x in B.gens()]).algtableinit()
[0, 0, 0, 0, 0, 0, [1, 0, 0; 0, 1, 0; 0, 0, 1], [1, 0, 0; 0, 1, 0; 0, 0, 1], [[1, 0, 0; 0, 1, 0; 0, 0, 1], [0, 0, 0; 1, 0, 1; 0, 0, 0], [0, 0, 0; 0, 0, 0; 1, 0, 1]], 0, [3, 0, 1]]
sage: A = FiniteDimensionalAlgebra(QQ, [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: pari([x.matrix() for x in A.gens()]).algtableinit()
---------------------------------------------------------------------------
PariError                                 Traceback (most recent call last)
<ipython-input-31-0e66e23e7476> in <module>()
----> 1 pari([x.matrix() for x in A.gens()]).algtableinit()

cypari2/auto_gen.pxi in cypari2.gen.Gen_auto.algtableinit()

cypari2/handle_error.pyx in cypari2.handle_error._pari_err_handle()

PariError: incorrect type in algtableinit (t_VEC)

Is there a clear reason or are associative algebras in Pari just broken?
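
For what it's worth, the PARI manual asks that the first basis element of a multiplication table be the unit of the algebra (so the first matrix must be the identity), which would explain why B above is accepted while A (whose first matrix is not the identity) is rejected. A small hypothetical (untested) example via cypari2, viewing GF(4) with basis (1, x), x^2 = x + 1, as a 2-dimensional table algebra over GF(2):

sage: mt = pari([pari("matid(2)"), pari("[0,1; 1,1]")])  # mt[i] = matrix of left multiplication by the i-th basis element
sage: A4 = mt.algtableinit(2)                            # second argument: the characteristic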

simon-king-jena commented 7 years ago
comment:23

Strangely, the patchbot doesn't test it (although it needs review). I'm trying to ?kick it (I don't know if that's still needed nowadays, I learned about ?kick a couple of years ago).

fchapoton commented 7 years ago
comment:24

you need to fill the Author field, otherwise it is deemed unsafe

fchapoton commented 7 years ago

Author: Simon King

simon-king-jena commented 7 years ago
comment:25

Replying to @fchapoton:

you need to fill the Author field, otherwise it is deemed unsafe

Oops, sorry! And thank you for filling it.

simon-king-jena commented 7 years ago
comment:26

The patchbot reports a failure in sage.geometry.polyhedron.backend_normaliz.Polyhedron_normaliz.integral_hull. Is that related? In any case, it seems that the output just changes because of a different sorting of the result (and if it does involve finite dimensional algebras, the sorting of the output would change, because sorting is now defined and is compatible with sorting of matrices).

fchapoton commented 7 years ago
comment:27

#23586

simon-king-jena commented 7 years ago
comment:28

I notice a problem: Old pickles won't unpickle.

When I save a finite dimensional algebra element in the old version, and load it in the new version, I get

sage: x = load('backup.sobj')
sage: x
<repr(<sage.algebras.finite_dimensional_algebras.finite_dimensional_algebra_element.FiniteDimensionalAlgebra_with_category.element_class at 0x7f836819c0b8>) failed: AttributeError: 'NoneType' object has no attribute 'list'>
sage: x._vector is None
True
sage: x.__dict__

{'_default_filename': '/home/king/Sage/git/sage/backup.sobj',
 '_matrix': [0 1 0]
 [0 0 0]
 [0 0 0],
 '_vector': (0, 1, 0)}
sage: type(x)
<class 'sage.algebras.finite_dimensional_algebras.finite_dimensional_algebra_element.FiniteDimensionalAlgebra_with_category.element_class'>
sage: type(x) is A.element_class
False
sage: A = FiniteDimensionalAlgebra(QQ, [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x = A.1
sage: type(x)
<type 'sage.algebras.finite_dimensional_algebras.finite_dimensional_algebra_element.FiniteDimensionalAlgebraElement'>

So, the old pickle gives a Python class that is derived from the new Cython class and is not the same as the new element_class.

How can this be cured? There is unpickle_override, but I don't know if this helps in the current situation.

simon-king-jena commented 7 years ago

Work Issues: Old pickles

simon-king-jena commented 7 years ago
comment:29

Would it help to implement a __setstate__ method that handles assignment of the old data to the new cdef public slots?

tscrim commented 7 years ago
comment:30

IIRC, what I've done in the past when changing a Python class to a cdef class is to use register_unpickle_override. I think something else you can do is just create an essentially dummy Python class that does nothing other than be a subclass of the Cython class.

Also, something else to be aware of is checking you are inheriting things from the category. IIRC, when Cythonizing CombinatorialFreeModuleElement, Nicolas did some special workup to get category inheritance to work properly (my solution was the dummy Python class).
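
For reference, a generic hedged sketch of how register_unpickle_override is typically used (illustrative only, not code from this branch; in recent Sage the function lives in sage.misc.persist, while older versions exposed it via sage.structure.sage_object):

from sage.misc.persist import register_unpickle_override
from sage.algebras.finite_dimensional_algebras.finite_dimensional_algebra_element import FiniteDimensionalAlgebraElement

# Redirect pickles that reference the old (module, class name) pair to the new
# Cython class; here the pair is the same module/name, purely for illustration.
register_unpickle_override(
    'sage.algebras.finite_dimensional_algebras.finite_dimensional_algebra_element',
    'FiniteDimensionalAlgebraElement',
    FiniteDimensionalAlgebraElement)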

tscrim commented 7 years ago
comment:31

Also, I get similar behavior as mentioned in the ticket description for ZZ and QQ:

sage: v = random_vector(R, 2000)
sage: M = random_matrix(R, 2000)
sage: %timeit M * v
sage: %timeit v * M
sage: %timeit matrix(v) * M
sage: %timeit M * matrix(v).transpose()
sage: vp = matrix(v)
sage: %timeit vp * M
sage: vp = vp.transpose()
sage: %timeit M * vp

yields for R = QQ:

1 loop, best of 3: 3.09 s per loop
1 loop, best of 3: 1.49 s per loop
1 loop, best of 3: 331 ms per loop
1 loop, best of 3: 210 ms per loop
1 loop, best of 3: 330 ms per loop
1 loop, best of 3: 211 ms per loop

and R = ZZ:

1 loop, best of 3: 657 ms per loop
10 loops, best of 3: 116 ms per loop
10 loops, best of 3: 60.1 ms per loop
10 loops, best of 3: 32.8 ms per loop
10 loops, best of 3: 59.6 ms per loop
10 loops, best of 3: 32.3 ms per loop

Interestingly for R = ZZ['x'] (with averaging 10 trials of 700x700 matrices):

1 loop, best of 3: 252 ms per loop
1 loop, best of 3: 197 ms per loop
1 loop, best of 3: 257 ms per loop
1 loop, best of 3: 210 ms per loop
1 loop, best of 3: 255 ms per loop
1 loop, best of 3: 212 ms per loop

Basically the multiplication of (dense) matrix * vector (and vice versa) is using the naïve algorithm and does not take advantage of specialized implementations:

        cdef Py_ssize_t i
        return sum([self.column(i, from_list=True) * v[i]
                    for i in xrange(self._ncols)], M(0))

While _matrix_times_vector_ clearly deserves specialized implementations (same for _vector_times_matrix_), it will take some more specialized code to improve it. The amazing thing is QQ has a specialized implementation for _vector_times_matrix_, but it is not as fast as going back and forth with a matrix (flint is doing something really good)! O_o That would be a project for a separate ticket.

simon-king-jena commented 7 years ago
comment:32

__setstate__ will work.

Question: When unpickling, the dictionary contains an item _default_filename, which provides the file name of the pickle. Should I simply ignore it? I think so...

I agree that the vector vs. matrix problem should be treated separately. I think currently it is best to do "single row matrix" times "square matrix".

simon-king-jena commented 7 years ago
comment:33

PS: Apart from _default_filename, should other unknown items simply be ignored? Or better raise a TypeError when an unexpected property occurs at unpickling?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

3c39dac: Unpickling old f.d. algebra elements
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from a34ee26 to 3c39dac

simon-king-jena commented 7 years ago

Changed work issues from Old pickles to none

simon-king-jena commented 7 years ago
comment:35

The test for the new __setstate__ method is indirect, but I verified that a pickle created with the old version indeed unpickles correctly in the new version (although it results in a Python class derived from FiniteDimensionalAlgebraElement, rather than in the Cython class FiniteDimensionalAlgebraElement itself).

While I was at it, I introduced an unpickle function. Reasons: it is slightly faster than the previous way, and if the multiplication matrix has previously been computed, then that matrix is included in the pickle (thus avoiding a re-computation).
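
In case it helps, a generic sketch of that pattern (the name unpickle_fd_element and the exact constructor call are hypothetical, not the code on the branch):

# A module-level function, so that pickles can refer to it by qualified name.
def unpickle_fd_element(parent, vector, matrix=None):
    elt = parent(vector)         # rebuild the element from its defining vector
    if matrix is not None:
        elt._matrix = matrix     # reuse a previously computed multiplication matrix
    return elt

# The element's __reduce__ would then return this function together with its
# arguments, e.g. (unpickle_fd_element, (self.parent(), self._vector, self._matrix)).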

simon-king-jena commented 7 years ago
comment:36

PS: In real applications, __setstate__ should only be invoked when unpickling old pickles. In that case, the resulting element will be an instance of a Python class and thus will have a __dict__. Then the special cdef slots are filled, and all remaining attributes are put into __dict__.
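
For reference, a rough Python-level sketch of that idea (illustrative only; the real method lives on the Cython class and may differ in details):

class _OldPickleCompatible(object):   # stand-in for FiniteDimensionalAlgebraElement
    def __setstate__(self, state):
        d = dict(state)
        # Fill the slots that the new implementation uses.
        self._vector = d.pop('_vector', None)
        self._matrix = d.pop('_matrix', None)
        # save() adds this bookkeeping entry; it carries no mathematical data.
        d.pop('_default_filename', None)
        # Anything else the user attached before pickling is preserved; an old
        # pickle always yields a dynamic Python subclass, so __dict__ exists.
        if d:
            self.__dict__.update(d)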

simon-king-jena commented 7 years ago
comment:37

According to the title, this ticket is about a "more competitive" implementation. So, here are some timings. Old implementation:

sage: A = FiniteDimensionalAlgebra(QQ, [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 172.77 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 81.9 µs per loop
sage: %timeit y+z
The slowest run took 8.10 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 66.4 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 15.88 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 348 µs per loop

New implementation:

sage: A = FiniteDimensionalAlgebra(QQ, [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 380.33 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.52 µs per loop
sage: %timeit y+z
The slowest run took 17.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.64 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 30.65 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 261 µs per loop

So, I kept my promise...

simon-king-jena commented 7 years ago
comment:38

Let's consider different fields (note that I am using MeatAxe for GF(9,'x')).

Old:

sage: A = FiniteDimensionalAlgebra(GF(2), [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 95.24 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 96.3 µs per loop
sage: %timeit y+z
10000 loops, best of 3: 84.7 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 10.92 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 637 µs per loop
sage: A = FiniteDimensionalAlgebra(GF(3), [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 153.07 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 77 µs per loop
sage: %timeit y+z
The slowest run took 5.54 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 64.2 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 26.46 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 375 µs per loop
sage: A = FiniteDimensionalAlgebra(GF(9,'x'), [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 201.31 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 82.8 µs per loop
sage: %timeit y+z
The slowest run took 6.70 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 65.2 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 13.58 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 513 µs per loop

New:

sage: A = FiniteDimensionalAlgebra(GF(2), [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 162.81 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 17.9 µs per loop
sage: %timeit y+z
The slowest run took 19.10 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.13 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 21.34 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 553 µs per loop
sage: A = FiniteDimensionalAlgebra(GF(3), [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 183.38 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 13 µs per loop
sage: %timeit y+z
The slowest run took 17.76 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.99 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 7.77 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 270 µs per loop
sage: A = FiniteDimensionalAlgebra(GF(9,'x'), [Matrix([[1,0,0], [0,1,0], [0,0,0]]), Matrix([[0,1,0], [0,0,0], [0,0,0]]), Matrix([[0,0,0], [0,0,0], [0,0,1]])])
sage: x,y,z = A.gens()
sage: %timeit x*y
The slowest run took 150.55 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 19.3 µs per loop
sage: %timeit y+z
The slowest run took 16.32 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.92 µs per loop
sage: %timeit loads(dumps(x))
The slowest run took 25.80 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 434 µs per loop

So, it is consistently faster, apparently for all fields.

tscrim commented 7 years ago
comment:39

Great, thank you for the timings.

Some things that will need to be done before positive review (the individual points are quoted in the replies below):

simon-king-jena commented 7 years ago
comment:40

Replying to @tscrim:

  • As Jeroen said on sage-devel, by having a __dict__, the class is slower than without it. I suspect there are probably better ways to deal with the unpickling than to have a __dict__. I can look into this if you want.

You misunderstand what I was doing.

We are talking here about the __setstate__ method, that will only be called when someone unpickles an old pickle of finite-dimensional algebra elements. Pickling of the new implementation relies on __reduce__ and an unpickle function that has nothing to do with __setstate__.

In the old implementation, an element was an instance of an element_class derived from a Python class. In other words, it is a dynamic class whose base is the former Python class FiniteDimensionalAlgebraElement. Hence, when unpickling it, the class will still be a dynamic (Python) class whose base is the (now!) Cython class FiniteDimensionalAlgebraElement.

Therefore, when unpickling an old pickle, we can be sure that the class has a __dict__, simply because it is a Python class! Nonetheless, just to be on the safe side, I check whether it really has a __dict__, and only if it has, I use it to store additional attributes that the user might have assigned to the element before pickling it.

In other words, FiniteDimensionalAlgebraElement has no __dict__, and we are merely talking about an attempt to recover as gracefully as possible in the unlikely event that the user has an old pickle containing additional attributes. But if that situation occurs, the class will be a derived Python class that has a __dict__ -- I am not adding it.

simon-king-jena commented 7 years ago
comment:41

Replying to @tscrim:

  • You should generally avoid the matrix() constructor as it is slower than creating the corresponding MatrixSpace and directly constructing the element from that (at least last time I checked).

I can confirm, although the difference isn't huge. Creating a non-square matrix is faster with MatrixSpace:

sage: %timeit M = matrix(GF(7), 1,1000, range(1000))
The slowest run took 5.83 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 66.3 µs per loop
sage: %timeit M = MatrixSpace(GF(7), 1,1000)(range(1000))
The slowest run took 5.47 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 56.4 µs per loop

Square unit matrix (mind that we cannot take .one(), as we want a mutable copy):

sage: %timeit M = matrix(GF(7), 1000,1000, 1)
1000 loops, best of 3: 738 µs per loop
sage: %timeit M = MatrixSpace(GF(7), 1000)(1)
The slowest run took 5.07 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 712 µs per loop
tscrim commented 7 years ago
comment:42

Replying to @simon-king-jena:

I see. I am not completely convinced that you want to rebuild the old class rather than the new class. The reason you may unpickle to a different parent is because of the UniqueFactory of GF(p) having the Sage version as part of the key. If you do it over, e.g., ZZ, then it ends up being two "different" element classes of the same parent. Now as soon as you do an operation, it goes to the new class. However, I am still willing to set a positive review since this is close enough to a pathological example and everything seems to work.

I am encountering a difference because of extension classes versus regular classes:

sage: A = FiniteDimensionalAlgebra(ZZ, [Matrix([[1,0], [0,1]]), Matrix([[0,1], [0,0]])])
sage: A.an_element()[0]  # TypeError - defined in the category and not inherited

So no matter what, you currently need a Python class for the element_class.

One way you can add a test is take an old pickle string and then load that (which is how I am testing things).

tscrim commented 7 years ago

Reviewer: Travis Scrimshaw

tscrim commented 7 years ago
comment:43

Replying to @simon-king-jena:

Replying to @tscrim:

  • You should generally avoid the matrix() constructor as it is slower than creating the corresponding MatrixSpace and directly constructing the element from that (at least last time I checked).

Square unit matrix (mind that we can not take .one(), as we want a mutable copy):

I don't see where you need a mutable copy. Can you show me where that is? Also, one() does some extra checks, but the fastest way is identity_matrix() (all MS(1) does is call MS.identity_matrix().__copy__()).

simon-king-jena commented 7 years ago
comment:44

Replying to @tscrim:

I don't see where you need a mutable copy.

I thought that at some point in the code matrix entries are assigned after creation, but it seems that this is in fact not the case. In that case, one should use .zero() and .one(), as they are cached.
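
A quick check of the caching point (assuming current Sage behaviour; the session below is just an illustration):

sage: MS = MatrixSpace(GF(7), 3)
sage: MS.identity_matrix() is MS.identity_matrix()   # cached: no per-call construction
True
sage: MS.identity_matrix().is_immutable()            # hence it must not be modified in place
True
sage: MS.zero() is MS.zero()
True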