wangtongada / gmpy

Automatically exported from code.google.com/p/gmpy
GNU Lesser General Public License v3.0

Avoid unnecessary coercion to mpz #25

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
My understanding of the code is that gmpy always coerces a Python int to a
gmpy.mpz instance in a binary operation with gmpy.mpz. I believe avoiding
this (i.e. using custom coercion code and reading the value of a Python int
directly) would lead to significant performance improvements e.g. for some
loops in mpmath where one operand is gmpy.mpz and the other is int. One can
then also avoid constructing a temporary mpz_t by replacing the mpz_op call
with mpz_op_si or mpz_op_ui.

Example operations:

x += 1
x >>= 1
x *= 2
x //= 2

Executing the above sequence with x = mpz(3<<53), if the right-hand sides
are replaced by precomputed mpz instances (bypassing coercion), I get a
1.3x speedup, or a 1.9x speedup with psyco.
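
A minimal sketch of how this comparison could be reproduced. The names `sequence` and `bench` are made up for illustration, and the import assumes gmpy2 is installed (the thread itself used gmpy 1.x). If it is not installed, the sketch falls back to plain `int` so that it still runs, but the timings then say nothing about gmpy.

```python
import timeit

try:
    from gmpy2 import mpz  # assumption: gmpy2 is available
except ImportError:
    mpz = int  # stand-in only; timings are then not meaningful for gmpy


def sequence(x, one, two):
    """Run the four operations from the report with the given right-hand sides."""
    x += one
    x >>= one
    x *= two
    x //= two
    return x


def bench(number=100000):
    """Time the sequence with int operands and with precomputed mpz operands."""
    x = mpz(3 << 53)
    m1, m2 = mpz(1), mpz(2)
    t_int = timeit.timeit(lambda: sequence(x, 1, 2), number=number)
    t_mpz = timeit.timeit(lambda: sequence(x, m1, m2), number=number)
    return t_int, t_mpz


if __name__ == "__main__":
    t_int, t_mpz = bench()
    print("int operands: %.3f s, mpz operands: %.3f s" % (t_int, t_mpz))
```

The lambda call overhead is included in both timings, so the ratio will understate the difference between the two cases.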

Original issue reported on code.google.com by fredrik....@gmail.com on 29 Apr 2009 at 8:12

GoogleCodeExporter commented 8 years ago
Am I correct that this would involve moving from "old style" to "new style"
types?

Original comment by fredrik....@gmail.com on 3 May 2009 at 10:02

GoogleCodeExporter commented 8 years ago
I will look at streamlining the coercion code. Since Python 3.x only supports
the long type, the benefits may be limited to Python 2.x.

Original comment by casevh on 4 May 2009 at 5:47

GoogleCodeExporter commented 8 years ago
True, though there might still be a small speedup using PyLong_AsLong since
small arguments are more likely. Especially for <<, >> and **, since a
non-machine-precision value is likely an error.

Unfortunately PyLong_AsLong wastes time on things like an extra NULL check
and PyLong_Check, so it probably pays off to write a custom PyLong -> long
function (since gmpy already has custom conversion code in long2mpz, this
shouldn't make things much more complex). It could be made really simple by
handling only the cases where v->ob_size is -1, 0 or 1.
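
As an illustration of which values such a fast path would cover, here is a pure-Python model of the idea. It is not gmpy code: `digits_and_size` and `small_long` are hypothetical names, and the model assumes the sign-magnitude digit layout that CPython used at the time, where the sign of `ob_size` carries the sign of the number and its absolute value is the digit count.

```python
import sys

# Bits per internal digit on this build (15 or 30 on CPython).
BITS = sys.int_info.bits_per_digit


def digits_and_size(v):
    """Model the assumed layout: return (ob_size, digits), least significant first."""
    mag, digits = abs(v), []
    while mag:
        digits.append(mag & ((1 << BITS) - 1))
        mag >>= BITS
    size = len(digits) if v >= 0 else -len(digits)
    return size, digits


def small_long(v):
    """Fast path: return the value when ob_size is -1, 0 or 1, else None."""
    size, digits = digits_and_size(v)
    if size == 0:
        return 0
    if size == 1:
        return digits[0]
    if size == -1:
        return -digits[0]
    return None  # caller would fall back to the general conversion


if __name__ == "__main__":
    for value in (0, 17, -17, 1 << 64):
        print(value, small_long(value))
```

With 30-bit digits the fast path covers every value whose magnitude is below 2**30, which includes typical shift counts and small constants.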

Original comment by fredrik....@gmail.com on 4 May 2009 at 6:17

GoogleCodeExporter commented 8 years ago
The automatic coercion to mpz was an artifact of Python not supporting PEP
208. Since Python 3.x does not support the old coercion model, I've needed to
change the number conversion model. Now that I've made those changes, I can
look at optimizing these cases.

BTW, the current trunk is slightly faster for some simple operations. I did
have one version of the trunk successfully run mpmath. The basic operations
were faster but overall it was slower. I don't yet understand why.

Original comment by casevh on 8 Jun 2009 at 7:29

GoogleCodeExporter commented 8 years ago
Did you benchmark all operations? As far as the mpmath tests go, I think
mpz*int, mpz//int, mpz>>int have the largest effect on performance.

Original comment by fredrik....@gmail.com on 8 Jun 2009 at 9:44

GoogleCodeExporter commented 8 years ago
I have optimized the argument processing for left and right shift. I haven't
optimized the other functions yet. I've stayed with the C API and the same
source works with both 2.x and 3.x. I used the following timeit command:

py31 -m timeit -s "import gmpy; a=gmpy.mpz(12345678901234567890); bb=gmpy.mpz(17); b=17" "a>>b"

I tested "a>>17", "a>>b", and "a>>bb".
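
The three cases can also be timed from a single script through the `timeit` module. This is only a sketch: `time_shifts` is a made-up helper, and the setup assumes gmpy2 is installed rather than the gmpy 1.x used in the thread. If it is not installed, the setup falls back to plain `int` so the script still runs, which makes the numbers irrelevant for gmpy.

```python
import timeit

SETUP = """
try:
    from gmpy2 import mpz  # assumption: gmpy2 is available
except ImportError:
    mpz = int  # stand-in only; timings are then not meaningful for gmpy
a = mpz(12345678901234567890)
bb = mpz(17)
b = 17
"""

STATEMENTS = ("a>>17", "a>>b", "a>>bb")


def time_shifts(number=100000, repeat=3):
    """Return the best time per loop in microseconds for each statement."""
    results = {}
    for stmt in STATEMENTS:
        best = min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
        results[stmt] = best / number * 1e6
    return results


if __name__ == "__main__":
    for stmt, usec in time_shifts().items():
        print("%-6s %.3f usec" % (stmt, usec))
```

Taking the minimum over several repeats mirrors what the `timeit` command line reports as the best run.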

python 2.6, gmpy 1.04: .166 usec, .187 usec, .130 usec
python 2.6, gmpy r114: .122 usec, .132 usec, .144 usec
python 3.1, gmpy r114: .101 usec, .101 usec, .093 usec
python 2.6, gmpy4(*):  .111 usec, .121 usec, .172 usec

(*) gmpy4 is the Cython-based wrapper by Mario.

Original comment by casevh on 9 Jun 2009 at 5:21

GoogleCodeExporter commented 8 years ago
The current gmpy trunk (r122) runs the mpmath 0.11 runtests about 2.5% faster
than gmpy 1.04. I found the memory leak that was causing the slowdown
earlier. I did need to patch mpmath in a couple of places to deal with the
change in number coercion. I think that will go away if I make 'mpz' +
'float' return an 'mpf'.

I still have lots of optimizations to do so I think I can get more performance
improvements.

And r122 runs fine on Python 3.x.

Original comment by casevh on 15 Jun 2009 at 7:10

GoogleCodeExporter commented 8 years ago
As an experiment, I optimized addition. mpz + 1 is ~40% faster (.15 usec vs.
.09 usec). This looks promising. :)

Original comment by casevh on 16 Jun 2009 at 3:28

GoogleCodeExporter commented 8 years ago
Small numbers are now automatically recognized for +, -, /, //, *, <<, and >>.
There should be an alpha release of gmpy 1.10 soon.

Original comment by casevh on 5 Jul 2009 at 5:19