Closed rwst closed 9 years ago
Author: Ralf Stephan
Description changed:
---
+++
@@ -1,3 +1,20 @@
People have no idea what they ask for when they write `if ex != 0:` with `ex` possibly(!) a symbolic expression. E.g. this happens to trigger in `matrix/*` the full blown procedure of starting Maxima with its limited deductive armament trying to prove that `ex` is nonzero. This when all was wanted was that it not be a "numeric" zero.
-This ticket replaces the two instances in `matrix/` where such behaviour was triggered in doctests. Speedups are expected.
+This ticket replaces the two instances in `matrix/` where such behaviour was triggered in doctests.
+
+Compare on fresh Sage:
+Before:
+
+```
+sage: %timeit M = matrix(SR, 2, 2, x^2 - 2*x + 1)
+The slowest run took 468.13 times longer than the fastest. This could mean that an intermediate result is being cached
+1 loops, best of 3: 2.08 ms per loop
+```
+After:
+
+```
+sage: %timeit M = matrix(SR, 2, 2, x^2 - 2*x + 1)
+The slowest run took 815.87 times longer than the fastest. This could mean that an intermediate result is being cached
+10000 loops, best of 3: 82.8 µs per loop
+```
+
This is also a commit of #18980 because without #18979 the improved decision logic in Pynac would lead to doctest fails. So if #18980 is merged first this ticket will be duplicate.
Hi,
I really do not like your solution. You are invading a generic class with some specialized code. Moreover, the issue you found might be present in a lot of other places. You should find a better way to do this test without having to care whether the input is symbolic. If each ring needs a specialized treatment it will become a complete nightmare.
Secondly, the operations import XYZ
and isinstance(x, my_type)
are not at all negligeable. Though, in this case it is not very noticeable since the cost is elsewhere. But with your strategy, you are potentially slowing down everthing for a corner case speed up.
In this very particular case of matrices, there is a simpler way to fix the issue. You can just avoid testing anything. In case x
is zero a useless filling will be done but of no harm (it is very quick compared to the rest of the initialization). If you like better my solution, write at least a comment that we should not test wheter the input is zero.
Vincent
Replying to @videlec:
You should find a better way to do this test without having to care whether the input is symbolic. If each ring needs a specialized treatment it will become a complete nightmare.
Agreed. The easiest way would be bool(x1!=x2)
to mean not (x1-x2).is_trivial_zero()
for symbolic x
and to provide a different interface for cases requiring simplification/proof. This would have some rerpercussions in symbolic and will take long to review so...
In this very particular case of matrices, there is a simpler way to fix the issue. You can just avoid testing anything.
Will do that.
EDIT: not zero
Omitting the checks however makes hundreds of doctests fail with eg...
Failed example:
matrix(RR, 0, 2, []).columns()
Exception raised:
Traceback (most recent call last):
File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
self.compile_and_execute(example, compiler, test.globs)
File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
exec(compiled, globs)
File "<doctest sage.matrix.matrix1.Matrix.columns[2]>", line 1, in <module>
matrix(RR, Integer(0), Integer(2), []).columns()
File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/constructor.py", line 729, in _matrix_constructor
return matrix_space.MatrixSpace(ring, nrows, ncols, sparse=sparse)(entries)
File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/matrix_space.py", line 482, in __call__
return self.matrix(entries, coerce, copy)
File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/matrix_space.py", line 1334, in matrix
return self.zero_matrix().__copy__()
File "sage/misc/cachefunc.pyx", line 2215, in sage.misc.cachefunc.CachedMethodCallerNoArgs.__call__ (build/cythonized/sage/misc/cachefunc.c:13792)
self.cache = f(self._instance)
File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/matrix_space.py", line 1185, in zero_matrix
res = self.__matrix_class(self, 0, coerce=False, copy=False)
File "sage/matrix/matrix_generic_dense.pyx", line 118, in sage.matrix.matrix_generic_dense.Matrix_generic_dense.__init__ (build/cythonized/sage/matrix/matrix_generic_dense.c:3057)
raise TypeError, "nonzero scalar matrix must be square"
TypeError: nonzero scalar matrix must be square
For the symbolic solution see #19040. The question is if that makes this ticket a duplicate, or if your idea of simply removing checks can be implemented.
Replying to @rwst:
For the symbolic solution see #19040. The question is if that makes this ticket a duplicate, or if your idea of simply removing checks can be implemented.
I was wrong. Removing the check would work only for square matrices. You want to allow initialization from zero for any matrix (because the matrix space form a vector space). But for other scalars the matrices need to be square in order to form a ring.
Secondly, the operations
import XYZ
andisinstance(x, my_type)
are not at all negligeable. Though, in this case it is not very noticeable since the cost is elsewhere. But with your strategy, you are potentially slowing down everthing for a corner case speed up.
This is nonsense. Look at how many instance checks you have in Matrix_generic_dense.__init__
. Lots of special cases. I have no idea why you would try to avoid this one here.
Changed branch from u/rws/avoid_maxima_on_creation_of_symbolic_matrices to u/rws/18979
This patch has the following footprint:
SR Laurent poly
develop 2 ms 45 us
patched 78 us 52 us
I don't believe you would stall this again because of, say, ten microseconds?
New commits:
2c4fa61 | 18979: avoid Maxima on creation of symbolic matrices |
I think invading a fundamental piece of infrastructure with specific tests for a specific ring should not be taken lightly. I think you can avoid most of your trouble by delaying the is_zero test to the last possible moment:
elif self._nrows == self._ncols:
self._entries = [entries if i==j else zero for i in range(self.nrows) for j in range(self.nrows)]
elif entries == zero:
self._entries = [zero]*(self._nrows*self._ncols)
else:
raise TypeError("nonzero scalar matrix must be square")
The first list comprehension can be optimized further, if required (I suspect that testing whether "entries" is zero to more quickly create a square zero matrix is a false optimization: if you need that thing quickly often, you should reuse an immutable zero matrix. That will be much quicker (and constant in size of the matrix!), so I don't think it's necessary).
If the zero check is expensive, it will almost certainly lead to an error.
With this code there is no big need to single out SR anymore, and you fix a more fundamental issue: in order to create a matrix (even a diagonal square one) it shouldn't be necessary that your ring has a decidable (or efficient) zero test. For a non-square zero matrix, we can't avoid a zero check, but at least we can delay that until we really need to know (because otherwise we raise an error).
Changed branch from u/rws/18979 to u/nbruin/18979
Changed author from Ralf Stephan to Nils Bruin
OK, I'll put my code where my mouth is.
Previous branch u/rws/18979 should still be present if it's preferred to resolve this ticket on the solution proposed there.
New commits:
257c0df | trac 18979: avoid zero-testing in matrix initialization until we really have to. |
As I said in comment:5, the solution should be non intrusive and your solution is what I suggested at the very end of this comment.
Though, using a list extension is twice slower. With
def m1():
cdef Py_ssize_t i
cdef list l = [0] * 100
for i in range(10): l[10*i + i] = 12
return l
def m2():
cdef Py_ssize_t i,j
return [12 if i == j else 0 for i in range(10) for j in range(10)]
I got
sage: %runfile test.pyx
sage: %timeit l = m1()
1000000 loops, best of 3: 456 ns per loop
sage: %timeit l = m2()
1000000 loops, best of 3: 1.09 µs per loop
Replying to @videlec:
As I said in comment:5, the solution should be non intrusive and your solution is what I suggested at the very end of this comment.
Though, using a list extension is twice slower. With
def m1(): cdef Py_ssize_t i cdef list l = [0] * 100 for i in range(10): l[10*i + i] = 12 return l def m2(): cdef Py_ssize_t i,j return [12 if i == j else 0 for i in range(10) for j in range(10)]
I got
sage: %runfile test.pyx sage: %timeit l = m1() 1000000 loops, best of 3: 456 ns per loop sage: %timeit l = m2() 1000000 loops, best of 3: 1.09 µs per loop
OK, go ahead and replace the loop by whatever works better. I hoped cython would be able to optimize this loop, but apparently it cannot. As your example shows, we're likely better off writing the diagonal twice. If this absolutely needs to be fast, we can kick down to a PyList_New
and PyList_SET_ITEM
inside a c-style loop. Do we need that? Probably if you need a diagonal matrix fast you should some something sparse.
The only thing that testing for zero got us in the square case is that for a zero matrix, we can avoid reinitializing the diagonal. I think we can do without that optimization, so then at least the problem for SR and other rings without (easily) testable equality are mostly fine except for situations that will most likely be an error anyway.
Branch pushed to git repo; I updated commit sha1. New commits:
e42f285 | trac 18979: apparently cython's list comprehension optimization isn't so good for lists of predictable length. |
Thanks for the changes. Note that your solution does not change anything to the problem mentioned in the description of this ticket!
Vincent
Replying to @videlec:
Note that your solution does not change anything to the problem mentioned in the description of this ticket!
Well, it solves the problem for the example mentioned: for square matrices and a single ring element as initialization it doesn't compare it to 0 anymore, so for square matrices, maxima is now avoided.
I think comparison in SR is smart enough that an explicit 0 compares equal to 0 without punting to maxima. So maxima should only be hit if we have a non-square matrix and the argument is not obviously 0. In that case it probably isn't zero, and in any case the user deserves what they have coming to them then.
In general, I think indeed that the generic matrix code shouldn't assume equality- or zero-testing is particularly cheap, and hence avoid it unless it's really necessary. The change here is a step in the right direction and basically has the same effect of what rws was trying to do on his branch.
Reviewer: Vincent Delecroix
Replying to @nbruin:
Replying to @videlec:
Note that your solution does not change anything to the problem mentioned in the description of this ticket!
Well, it solves the problem for the example mentioned: for square matrices and a single ring element as initialization it doesn't compare it to 0 anymore, so for square matrices, maxima is now avoided.
Right. Sorry, I misunderstood the branching of the code!
Replying to @nbruin:
In general, I think indeed that the generic matrix code shouldn't assume equality- or zero-testing is particularly cheap, and hence avoid it unless it's really necessary. The change here is a step in the right direction and basically has the same effect of what rws was trying to do on his branch.
Right, but for sparse matrices it is explicitely assumed that only nonzero entries are stored... what should we do in that case?
Replying to @videlec:
Right, but for sparse matrices it is explicitely assumed that only nonzero entries are stored... what should we do in that case?
Different problem, different ticket. Of course, one should only avoid comparisons if one can.
Since matrices rarely "accidentally" remain sparse, you probably get decent performance by just assuming any entry that is actually given, is nonzero (so no explicit checks necessary). Of course that doesn't work when converting a dense matrix to a sparse one.
If this is an issue, perhaps we should have a special "comparison avoiding" sparse matrix class.
In any case #19040 will make this all obsolete.
Changed branch from u/nbruin/18979 to e42f285
People have no idea what they ask for when they write
if ex != 0:
withex
possibly(!) a symbolic expression. E.g. this happens to trigger inmatrix/*
the full blown procedure of starting Maxima with its limited deductive armament trying to prove thatex
is nonzero. This when all was wanted was that it not be a "numeric" zero.This ticket replaces the two instances in
matrix/
where such behaviour was triggered in doctests.Compare on fresh Sage: Before:
After:
Component: linear algebra
Author: Nils Bruin
Branch/Commit:
e42f285
Reviewer: Vincent Delecroix
Issue created by migration from https://trac.sagemath.org/ticket/18979