andyvand / gmpy

Automatically exported from code.google.com/p/gmpy
GNU Lesser General Public License v3.0

complex numbers are a bit slow #63

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm not sure this is the right place to post this question, but I don't know of 
any other.
I'm a bit concerned about the performance of complex numbers in gmpy2. I ran 
the following test:

import gmpy2 as gm
import numpy as np

a=np.random.rand(1000,1000)
A2=a*gm.mpfr('2')
Y2=a*gm.mpfr('5')
C2 = A2 + Y2 * gm.mpc('0+1j')

%timeit C2*C2
1 loops, best of 3: 1.97 s per loop

%timeit A2*A2-Y2*Y2; A2*Y2+Y2*A2
1 loops, best of 3: 1.84 s per loop

Why are complex numbers slower than the equivalent computation with real (mpfr) 
numbers? I would expect it to be the other way round. Is it possible to 
optimise gmpy2.mpc?

Original issue reported on code.google.com by D.Vutshi on 6 Dec 2012 at 4:47

GoogleCodeExporter commented 9 years ago
Hi, I ran a few tests and verified that gmpy2 isn't adding any significant 
overhead. The delay appears to be inherent in the MPC library. I'll write a C 
program to validate my test but it won't be for several days.

Original comment by casevh on 7 Dec 2012 at 5:33

GoogleCodeExporter commented 9 years ago
Can you test the latest source repository (r740)? I just committed some 
changes that should make the basic operations a little faster.

Original comment by casevh on 7 Dec 2012 at 6:26

GoogleCodeExporter commented 9 years ago
In my tests r740 appeared to be faster than 2.0.0b2, which is very good! :) 
However, mpc is still slightly slower than the equivalent mpfr computation:

%timeit C2*C2 
1 loops, best of 3: 1.82 s per loop

%timeit A2*A2-Y2*Y2; A2*Y2+Y2*A2
1 loops, best of 3: 1.78 s per loop

It could just be that the MPC library compiled on my system is slower than MPFR 
(I'm on Mac OS and had some problems compiling the libraries).

Side question: do you think the performance of gmpy2 with numpy arrays can be 
improved further, or do I need to use mpfr and mpc directly from cython to get 
a significant speed improvement?

Original comment by D.Vutshi on 7 Dec 2012 at 8:10

GoogleCodeExporter commented 9 years ago
r740 is indeed much faster than 2.0.0b2. My program, which does a lot of * and + 
on numpy arrays of mpc numbers, now runs 1.5 times faster!

Original comment by D.Vutshi on 7 Dec 2012 at 9:25

GoogleCodeExporter commented 9 years ago
At the MPC/MPFR level, a complex multiply will always be about 4x slower than a 
real multiply. All I can influence is the amount of overhead gmpy2 adds. I'm 
pleasantly surprised by the 1.5x improvement. I don't think I can reduce the 
overhead of gmpy2 any further without an unreasonable explosion in code size.
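
One way to see why the two timings in this thread are so close: the schoolbook 
complex product (a + bi)(c + di) = (ac - bd) + (ad + bc)i costs four real 
multiplications plus two additions/subtractions, which is the same work as the 
hand-written A2*A2-Y2*Y2; A2*Y2+Y2*A2 benchmark. The sketch below only counts 
operations. CountingReal and complex_mul are hypothetical names made up for 
illustration, and nothing here is meant to describe how MPC implements its 
multiply internally.

```python
from fractions import Fraction


class CountingReal:
    """Exact real number that tallies multiplications and additions.

    Hypothetical helper for illustration only; not part of gmpy2 or MPC.
    """
    muls = 0
    adds = 0

    def __init__(self, value):
        self.value = Fraction(value)

    def __mul__(self, other):
        CountingReal.muls += 1
        return CountingReal(self.value * other.value)

    def __add__(self, other):
        CountingReal.adds += 1
        return CountingReal(self.value + other.value)

    def __sub__(self, other):
        # A subtraction is counted as an addition: same cost class.
        CountingReal.adds += 1
        return CountingReal(self.value - other.value)


def complex_mul(a, b, c, d):
    """Schoolbook product (a + bi)(c + di), returned as (real, imag)."""
    return a * c - b * d, a * d + b * c


# Square 2 + 5i, mirroring C2*C2 for a single element.
real, imag = complex_mul(*(CountingReal(x) for x in (2, 5, 2, 5)))
print(real.value, imag.value, CountingReal.muls, CountingReal.adds)
```

With this counting, one complex multiply and the four-multiply real emulation 
do the same number of real multiplications, so similar wall-clock times are 
what one would expect once the per-call overhead is small.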

You can eliminate most of the overhead by accessing the MPC/MPFR libraries 
directly from cython. I don't know how much improvement using cython with 
gmpy2 would bring.

What precision are you using? As the precision increases, the percentage 
overhead that Python and gmpy2 add will decrease.
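
The point about precision can be explored with a rough timing sketch. It uses 
the standard-library decimal module purely as a stand-in for an 
arbitrary-precision type (an assumption made so the example runs without 
gmpy2; decimal precision is in digits, and 30 digits is roughly 100 bits). 
time_multiply is a hypothetical helper name. The idea is that the cost of the 
multiplication itself grows with precision while the fixed per-call 
interpreter overhead does not, so the overhead becomes a smaller share.

```python
import timeit
from decimal import Decimal, localcontext


def time_multiply(digits, repeats=20000):
    """Return the average seconds per multiply at `digits` of precision."""
    with localcontext() as ctx:
        ctx.prec = digits
        # Non-terminating quotients, so both operands use the full precision.
        x = Decimal(1) / Decimal(3)
        y = Decimal(2) / Decimal(7)
        total = timeit.timeit(lambda: x * y, number=repeats)
    return total / repeats


for digits in (30, 300, 3000):
    print(digits, time_multiply(digits))
```

The absolute numbers depend on the machine and will not match gmpy2; only the 
trend across precisions is of interest.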

I'll let you know when I've had a chance to clean up the code from r740. Part 
of it was a little hacky. ;-)

Original comment by casevh on 7 Dec 2012 at 1:42

GoogleCodeExporter commented 9 years ago
Thank you for the explanations. Now I'm using precision from 100 to 200 bits.

Original comment by D.Vutshi on 7 Dec 2012 at 2:14

GoogleCodeExporter commented 9 years ago
Can you test r741? The code is a little cleaner and it may be slightly faster 
for basic operations (+ - * /) when both operands are either mpfr or mpc.

Original comment by casevh on 10 Dec 2012 at 5:44

GoogleCodeExporter commented 9 years ago
My tests show no change in the multiplication of complex arrays:

%timeit C2*C2 #gmpy2 r741
1 loops, best of 3: 1.82 s per loop

%timeit C2*C2 #gmpy2 r740
1 loops, best of 3: 1.82 s per loop

and a small speed decrease when emulating the complex product with mpfr arrays:

%timeit A2*A2-Y2*Y2; A2*Y2+Y2*A2 #gmpy2 r741
1 loops, best of 3: 1.85 s per loop

%timeit A2*A2-Y2*Y2; A2*Y2+Y2*A2 #gmpy2 r740
1 loops, best of 3: 1.78 s per loop

The net effect on my program, which mainly works with complex numbers, is a 
speed improvement of about 0-1%.

Original comment by D.Vutshi on 10 Dec 2012 at 9:10

GoogleCodeExporter commented 9 years ago
Thanks for testing. I didn't expect the difference to be very large but the 
code is cleaner. I think this is the best I can do.

Original comment by casevh on 10 Dec 2012 at 7:13

GoogleCodeExporter commented 9 years ago
Thanks a lot for the improvements. Apparently, the slowdown for float numbers 
was a fluctuation; in later tests it shows 1.79 s per loop. The program is 
indeed a tiny bit faster now.

Original comment by D.Vutshi on 10 Dec 2012 at 9:01

GoogleCodeExporter commented 9 years ago
The performance enhancement and another important fix are included in beta3.

Original comment by casevh on 14 Dec 2012 at 7:18