Open tscrim opened 7 years ago
With this branch:
sage: for _ in range(10):
....: M = matrix({(2000-i,i): 1/i for i in range(1,2000,1)}, sparse=True)
....: %time M.echelonize()
CPU times: user 112 ms, sys: 0 ns, total: 112 ms
Wall time: 81 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 81 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 81.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 79.2 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 81.1 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 82.6 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 80.6 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 81 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 80.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 80.6 ms
versus before
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78.4 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 78.3 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 78.7 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 77.9 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 79.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78.6 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 78.1 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 79.6 ms
I am confused. Admittedly my example is artificial because it was trying to demonstrate how checking if an entry was 0 twice was going to be bad. Removing 3 did not change the timings. I would have thought skipping checking if certain entries were 0 would have a speed gain at the cost of some minor code duplication. Maybe this ticket can just be closed or recycled...
New commits:
aca06e9 | Directly call the C functions and do not recheck for 0's. |
Description changed:
---
+++
@@ -1,5 +1,5 @@
We can get faster echelon forms by
-- Directly calling the foo_c implementations instead of the Python foo methods.
-- Not retesting if certain entires are 0.
-
+1. Directly calling the foo_c implementations instead of the Python foo methods.
+2. Not retesting if certain entires are 0.
+3. Only calling the `get_unsafe` once per loop and storing as a variable.
We can get faster echelon forms by
get_unsafe
once per loop and storing as a variable.Component: performance
Author: Travis Scrimshaw
Branch/Commit: public/linear_algebra/some_speed_improvements-24123 @
aca06e9
Issue created by migration from https://trac.sagemath.org/ticket/24123