sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.45k stars 482 forks source link

Speed up constructing high-dimensional Euclidean spaces #30061

Closed mkoeppe closed 4 years ago

mkoeppe commented 4 years ago

The n-dimensional Euclidean space is available in Sage in many variants.

This ticket brings the speed of the most basic operation (constructing the space) of EuclideanSpace (from sage-manifolds) closer to that of the other variants.

Spaces without scalar product:

sage: VectorSpace(QQ, 5).category()
Category of finite dimensional vector spaces with basis over (number fields and quotient fields and metric spaces)
sage: %time VectorSpace(QQ, 5)
CPU times: user 28 µs, sys: 9 µs, total: 37 µs
Wall time: 40.1 µs
Vector space of dimension 5 over Rational Field
sage: %time VectorSpace(QQ, 80)
CPU times: user 218 µs, sys: 1 µs, total: 219 µs
Wall time: 223 µs
Vector space of dimension 80 over Rational Field
sage: %time VectorSpace(QQ, 1000)
CPU times: user 208 µs, sys: 1 µs, total: 209 µs
Wall time: 213 µs
Vector space of dimension 1000 over Rational Field

sage: for n in 5, 80, 1000, 4000, 10000, 100000: print("n = {}".format(n)); print("Construction: ", timeit("V = VectorSpace(QQ, {})".format(n),number=1,repeat=1)); u = [ x for x i
....: n range(n) ]; v = [ x + 1 for x in range(n) ]; V = VectorSpace(QQ, n); print("Distance:     ", timeit("norm(V(u) - V(v))"))
n = 5
Construction:  1 loop, best of 1: 56.1 ms per loop
Distance:      625 loops, best of 3: 81.7 μs per loop
n = 80
Construction:  1 loop, best of 1: 159 μs per loop
Distance:      625 loops, best of 3: 299 μs per loop
n = 1000
Construction:  1 loop, best of 1: 236 μs per loop
Distance:      125 loops, best of 3: 3.18 ms per loop
n = 4000
Construction:  1 loop, best of 1: 147 μs per loop
Distance:      25 loops, best of 3: 11.3 ms per loop
n = 10000
Construction:  1 loop, best of 1: 149 μs per loop
Distance:      25 loops, best of 3: 28.7 ms per loop
n = 100000
Construction:  1 loop, best of 1: 162 μs per loop
Distance:      5 loops, best of 3: 296 ms per loop
sage: CombinatorialFreeModule(QQ, range(5)).category()
Category of finite dimensional vector spaces with basis over Rational Field
sage: %time CombinatorialFreeModule(QQ, range(5))
CPU times: user 80 µs, sys: 0 ns, total: 80 µs
Wall time: 86.1 µs
Free module generated by {0, 1, 2, 3, 4} over Rational Field
sage: %time CombinatorialFreeModule(QQ, range(80))
CPU times: user 244 µs, sys: 10 µs, total: 254 µs
Wall time: 259 µs
Free module generated by {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79} over Rational Field
sage: %time CombinatorialFreeModule(QQ, range(1000))
CPU times: user 322 µs, sys: 26 µs, total: 348 µs
Wall time: 354 µs
sage: AffineSpace(RR, 5).category()
Category of schemes over Real Field with 53 bits of precision

sage: %time AffineSpace(RR, 5)
CPU times: user 47 µs, sys: 1 µs, total: 48 µs
Wall time: 51 µs
Affine Space of dimension 5 over Real Field with 53 bits of precision
sage: %time AffineSpace(RR, 80)
CPU times: user 2 ms, sys: 39 µs, total: 2.04 ms
Wall time: 2.04 ms
Affine Space of dimension 80 over Real Field with 53 bits of precision
sage: %time AffineSpace(RR, 1000)
CPU times: user 62.8 ms, sys: 1.45 ms, total: 64.3 ms
Wall time: 63.5 ms
Affine Space of dimension 1000 over Real Field with 53 bits of precision

Spaces with scalar product:

sage: VectorSpace(QQ, 5, inner_product_matrix=matrix.identity(5)).category()
Category of finite dimensional vector spaces with basis over (number fields and quotient fields and metric spaces)
sage: %time VectorSpace(QQ, 5, inner_product_matrix=matrix.identity(5))
CPU times: user 6.55 ms, sys: 666 µs, total: 7.22 ms
Wall time: 6.88 ms
Ambient quadratic space of dimension 5 over Rational Field
Inner product matrix:
[1 0 0 0 0]
[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]
sage: %time VectorSpace(QQ, 80, inner_product_matrix=matrix.identity(80))
CPU times: user 14.1 ms, sys: 439 µs, total: 14.6 ms
Wall time: 14.4 ms
Ambient quadratic space of dimension 80 over Rational Field
Inner product matrix:
sage: %time VectorSpace(QQ, 1000, inner_product_matrix=matrix.identity(1000))
CPU times: user 1.44 s, sys: 34.4 ms, total: 1.47 s
Wall time: 1.48 s
Ambient quadratic space of dimension 1000 over Rational Field
Inner product matrix: ...

sage: for n in 5, 80, 1000, 4000: print("n = {}".format(n)); print("Construction: ", timeit("V = VectorSpace(QQ, {}, inner_product_matrix=matrix.identity({}))".format(n, n),number=1,repeat=1)); u = [ x for x in range(n) ]; v = [ x + 1 for x in range(n) ]; V = VectorSpace(QQ, n, inner_product_matrix=matrix.identity(n)); print("Distance:     ", timeit("t = V(u)-V(v); sqrt(t.inner_product(t))"))
n = 5
Construction:  1 loop, best of 1: 61.9 ms per loop
Distance:      625 loops, best of 3: 49.4 μs per loop
n = 80
Construction:  1 loop, best of 1: 9.75 ms per loop
Distance:      625 loops, best of 3: 243 μs per loop
n = 1000
Construction:  1 loop, best of 1: 1.51 s per loop
Distance:      25 loops, best of 3: 14.4 ms per loop
n = 4000
Construction:  1 loop, best of 1: 23.8 s per loop
Distance:      5 loops, best of 3: 225 ms per loop

NB: It is not in the category MetricSpaces, and thus the element methods dist and abs (?!) are missing... The methods __abs__, norm, and dot_product are unrelated to the inner product matrix; only inner_product uses inner_product_matrix.

sage: EuclideanSpace(5).category()
Category of smooth manifolds over Real Field with 53 bits of precision

sage: %time EuclideanSpace(5)
CPU times: user 177 ms, sys: 10.4 ms, total: 187 ms
Wall time: 196 ms
5-dimensional Euclidean space E^5
sage: %time EuclideanSpace(80)
CPU times: user 32.8 s, sys: 758 ms, total: 33.6 s
Wall time: 31.5 s
80-dimensional Euclidean space E^80
sage: %time EuclideanSpace(1000)
(timeout)

With this ticket (and its dependencies #30065, #30074):

sage: %time EuclideanSpace(5)
CPU times: user 4.87 ms, sys: 372 µs, total: 5.24 ms
Wall time: 4.93 ms
5-dimensional Euclidean space E^5
sage: %time EuclideanSpace(80)
CPU times: user 106 ms, sys: 4.52 ms, total: 110 ms
Wall time: 92 ms
80-dimensional Euclidean space E^80
sage: %time EuclideanSpace(1000)
CPU times: user 1.04 s, sys: 33.5 ms, total: 1.07 s
Wall time: 986 ms

Some caching is happening too. The second time:

sage: %time EuclideanSpace(1000)
CPU times: user 207 ms, sys: 5.63 ms, total: 213 ms
Wall time: 212 ms
1000-dimensional Euclidean space E^1000

Scalar product without a space:

sage: %time DiagonalQuadraticForm(QQ, [1]*5)
CPU times: user 60 µs, sys: 1e+03 ns, total: 61 µs
Wall time: 62.9 µs
Quadratic form in 5 variables over Rational Field with coefficients: 
[ 1 0 0 0 0 ]
[ * 1 0 0 0 ]
[ * * 1 0 0 ]
[ * * * 1 0 ]
[ * * * * 1 ]
sage: %time DiagonalQuadraticForm(QQ, [1]*80)
CPU times: user 1.35 ms, sys: 31 µs, total: 1.39 ms
Wall time: 1.44 ms
Quadratic form in 80 variables over Rational Field with coefficients: 
sage: %time DiagonalQuadraticForm(QQ, [1]*1000)
CPU times: user 144 ms, sys: 2.85 ms, total: 147 ms
Wall time: 147 ms
Quadratic form in 1000 variables over Rational Field with coefficients: 

Depends on #30065 Depends on #30074

CC: @egourgoulhon @tscrim @nbruin @mwageringel @mjungmath

Component: geometry

Author: Eric Gourgoulhon

Branch/Commit: 3ce0c15

Reviewer: Matthias Koeppe

Issue created by migration from https://trac.sagemath.org/ticket/30061

mjungmath commented 4 years ago
comment:50

Can you please briefly summarize in the description what has actually been done and what the new benefits are? It is very difficult for me to piece that all together just from the comments and cross references to other tickets. Thanks! :)

In general, I would wait until the dependent tickets are merged. However, I am a bad role model due to my impatience.

egourgoulhon commented 4 years ago
comment:52

Replying to @mkoeppe:

Should we set this ticket to "needs_review"?

Yes, indeed.

egourgoulhon commented 4 years ago
comment:53

Replying to @mjungmath:

Can you please briefly summarize in the description what has actually been done and what the new benefits are? It is very difficult for me to piece that all together just from the comments and cross references to other tickets. Thanks! :)

The code added in this ticket is described in comment:22 and comment:23. The other code in the ticket branch arises from the dependencies tickets #30065 and #30074.

mjungmath commented 4 years ago
comment:54

Replying to @egourgoulhon:

The code added in this ticket is described in comment:22 and comment:23. The other code in the ticket branch arises from the dependencies tickets #30065 and #30074.

Thanks!

mkoeppe commented 4 years ago

Description changed:

--- 
+++ 
@@ -134,6 +134,22 @@
 (timeout)

+With this ticket (and its dependencies): + + +sage: %time EuclideanSpace(5) +CPU times: user 4.87 ms, sys: 372 µs, total: 5.24 ms +Wall time: 4.93 ms +5-dimensional Euclidean space E^5 +sage: %time EuclideanSpace(80) +CPU times: user 106 ms, sys: 4.52 ms, total: 110 ms +Wall time: 92 ms +80-dimensional Euclidean space E^80 +sage: %time EuclideanSpace(1000) +CPU times: user 1.04 s, sys: 33.5 ms, total: 1.07 s +Wall time: 986 ms + +

Scalar product without a space:

mkoeppe commented 4 years ago

Description changed:

--- 
+++ 
@@ -1,4 +1,6 @@
-The n-dimensional Euclidean space is available in Sage in many variants.  We investigate the speed of the most basic operation: Constructing the space.
+The n-dimensional Euclidean space is available in Sage in many variants.  
+
+This ticket brings the speed of the most basic operation (constructing the space) of `EuclideanSpace` (from sage-manifolds) closer to that of the other variants.

 **Spaces without scalar product:**
mkoeppe commented 4 years ago

Description changed:

--- 
+++ 
@@ -151,7 +151,14 @@
 CPU times: user 1.04 s, sys: 33.5 ms, total: 1.07 s
 Wall time: 986 ms

+Some caching is happening too. The second time:

+ +sage: %time EuclideanSpace(1000) +CPU times: user 207 ms, sys: 5.63 ms, total: 213 ms +Wall time: 212 ms +1000-dimensional Euclidean space E^1000 +

Scalar product without a space:

mkoeppe commented 4 years ago

Description changed:

--- 
+++ 
@@ -136,7 +136,7 @@
 (timeout)

-With this ticket (and its dependencies): +With this ticket (and its dependencies #30065, #30074):

 sage: %time EuclideanSpace(5)
mkoeppe commented 4 years ago

Reviewer: Matthias Koeppe

mjungmath commented 4 years ago
comment:61

Patchbot not green. The latest branch of #30074 has not been merged yet.

tscrim commented 4 years ago
comment:62

That is not a real error (the other patchbot also came back green). You don't need to have all of the other branches merged in.

mjungmath commented 4 years ago
comment:63

Replying to @tscrim:

That is not a real error (the other patchbot also came back green).

It is a real error. Pyflakes is complaining about an unused import.

You don't need to have all of the other branches merged in.

Why is that? Wouldn't that cause a merge conflict later on?

tscrim commented 4 years ago
comment:64

Replying to @mjungmath:

Replying to @tscrim:

That is not a real error (the other patchbot also came back green).

It is a real error. Pyflakes is complaining about an unused import.

No, it is not. Pyflakes are not real errors (but they are good to clear).

You don't need to have all of the other branches merged in.

Why is that? Wouldn't that cause in a merge conflict later on?

There is no need to add unnecessary extra merge commits when there is no conflict or a need to have the latest branch. If later changes on the first branch are done to unrelated parts to the changes on this branch, there is no conflict.

Addendum - The pyflakes is handled on the later commits from #30074. So changing that here would cause a merge conflict and resolve a problem already fixed.

mjungmath commented 4 years ago
comment:65

Replying to @tscrim:

Replying to @mjungmath:

Replying to @tscrim:

That is not a real error (the other patchbot also came back green).

It is a real error. Pyflakes is complaining about an unused import.

No, it is not. Pyflakes are not real errors (but they are good to clear).

Okay, I see what you mean.

Replying to @tscrim:

Addendum - The pyflakes is handled on the later commits from #30074. So changing that here would cause a merge conflict and resolve a problem already fixed.

Mh. Sorry, I don't get it. The #30074 will be merged into develop first because of the dependence, right? Afterwards, this ticket will be merged. If this ticket consists of an older version than #30074, this would cause a merge conflict, wouldn't it? Or, at least, the pyflakes error would be added again.

tscrim commented 4 years ago
comment:66

Replying to @mjungmath:

Replying to @tscrim:

Addendum - The pyflakes is handled on the later commits from #30074. So changing that here would cause a merge conflict and resolve a problem already fixed.

Mh. Sorry, I don't get it. The #30074 will be merged into develop first because of the dependence, right? Afterwards, this ticket will be merged. If this ticket consists of an older version than #30074, this would cause a merge conflict, wouldn't it? Or, at least, the pyflakes error would be added again.

No. The result will look like this:

B--0--X--T----M
    \        /
     0--0--H

where T is this branch and H is the head of #30074. So the #30074 branch will be merged first (all commits marked 0 starting from the latest beta B and H), then the commits from this branch (labelled X and T), which will result in the merge commit M. This would be exactly the same as if you merged in the latest commits from #30074 because that would be doing the same thing. The only difference is you have an extra merge commit in where this branch is pointing to.

mjungmath commented 4 years ago
comment:67

Replying to @tscrim:

No. The result will look like this:

B--0--X--T----M
    \        /
     0--0--H

where T is this branch and H is the head of #30074. So the #30074 branch will be merged first (all commits marked 0 starting from the latest beta B and H), then the commits from this branch (labelled X and T), which will result in the merge commit M. This would be exactly the same as if you merged in the latest commits from #30074 because that would be doing the same thing. The only difference is you have an extra merge commit in where this branch is pointing to.

Ah, I see. This is because the commit in #30074 is more recent, right?

tscrim commented 4 years ago
comment:68

Replying to @mjungmath:

Ah, I see. This is because the commit in #30074 is more recent, right?

Yes, but it is still based off a common base commit.

mjungmath commented 4 years ago
comment:69

Replying to @tscrim:

Replying to @mjungmath:

Ah, I see. This is because the commit in #30074 is more recent, right?

Yes, but it is still based off a common base commit.

Ah well. Thank you for explaining this, and sorry for delaying the process. This is good to know for future commits of mine.

vbraun commented 4 years ago

Changed branch from public/manifolds/init_module_basis-30061 to 3ce0c15