sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.42k stars 477 forks source link

Some changes to _echelon_in_place_classical to improve the speed #24123

Open tscrim opened 7 years ago

tscrim commented 7 years ago

We can get faster echelon forms by

  1. Directly calling the foo_c implementations instead of the Python foo methods.
  2. Not retesting if certain entires are 0.
  3. Only calling the get_unsafe once per loop and storing as a variable.

Component: performance

Author: Travis Scrimshaw

Branch/Commit: public/linear_algebra/some_speed_improvements-24123 @ aca06e9

Issue created by migration from https://trac.sagemath.org/ticket/24123

tscrim commented 7 years ago

Commit: aca06e9

tscrim commented 7 years ago
comment:1

With this branch:

sage: for _ in range(10):
....:     M = matrix({(2000-i,i): 1/i for i in range(1,2000,1)}, sparse=True)
....:     %time M.echelonize()
CPU times: user 112 ms, sys: 0 ns, total: 112 ms
Wall time: 81 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 81 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 81.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 79.2 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 81.1 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 82.6 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 80.6 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 81 ms
CPU times: user 84 ms, sys: 0 ns, total: 84 ms
Wall time: 80.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 80.6 ms

versus before

CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78.4 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 78.3 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 78.7 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 77.9 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 79.5 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 78.6 ms
CPU times: user 76 ms, sys: 0 ns, total: 76 ms
Wall time: 78.1 ms
CPU times: user 80 ms, sys: 0 ns, total: 80 ms
Wall time: 79.6 ms

I am confused. Admittedly my example is artificial because it was trying to demonstrate how checking if an entry was 0 twice was going to be bad. Removing 3 did not change the timings. I would have thought skipping checking if certain entries were 0 would have a speed gain at the cost of some minor code duplication. Maybe this ticket can just be closed or recycled...


New commits:

aca06e9Directly call the C functions and do not recheck for 0's.
tscrim commented 7 years ago

Branch: public/linear_algebra/some_speed_improvements-24123

tscrim commented 7 years ago

Description changed:

--- 
+++ 
@@ -1,5 +1,5 @@
 We can get faster echelon forms by

-- Directly calling the foo_c implementations instead of the Python foo methods.
-- Not retesting if certain entires are 0.
-
+1. Directly calling the foo_c implementations instead of the Python foo methods.
+2. Not retesting if certain entires are 0.
+3. Only calling the `get_unsafe` once per loop and storing as a variable.