aleaxit / gmpy

General Multi-Precision arithmetic for Python 2.6+/3+ (GMP, MPIR, MPFR, MPC)
https://gmpy2.readthedocs.io/en/latest/
GNU Lesser General Public License v3.0
518 stars 86 forks source link

Implement mpz.as_integer_ratio/to_bytes/from_bytes #366

Closed skirpichev closed 1 year ago

skirpichev commented 1 year ago

Should fix #357. Early draft version, please don't merge unless absolutely sure.

Some benchmarks are here. But note that for this size integers - arguments parsing (PyArg_ParseTupleAndKeywords branch) take ~1/3 of time. Without this branch:

$ python -m timeit -s 'from gmpy2 import fac' -s 'a = fac(57)' 'a.to_bytes(32)'
500000 loops, best of 5: 548 nsec per loop
casevh commented 1 year ago

I rarely use PyArg_ParseTupleAndKeywords due to the overhead. I'll try writing some custom argument parsing code. Even if it is only beneficial when keywords are not use, that should help the mode common use cases.

skirpichev commented 1 year ago

I rarely use PyArg_ParseTupleAndKeywords due to the overhead.

I will try to adapt PR, but keep in mind that the interface must be compatible with int's from/to_bytes. It's possible to do without (see Objects/clinic/longobject.c.h and Objects/longobject.c of the CPython sources tree) with much more complex code.

casevh commented 1 year ago

I did a quick test and it is not PyArg_ParseTupleAndKeywords. The PR code runs the benchmark in ~230ns. Hard-coding argument values to eliminate any parsing overhead reduces the time to ~180ns. But int.to_bytes only takes ~75ns.

int.to_bytes uses METH_FASTCALL. I wonder if that's why it so fast. I haven't experimented with METH_FASTCALL yet. My plan was to convert the _mpmath* functions first.

skirpichev commented 1 year ago

To use METH_FASTCALL we should drop support for older CPython versions (< 3.7). Yep, we did.

skirpichev commented 1 year ago

I've tested v2.1.2 with this patch on the CPython v3.6 (before conversion of int's methods to use METH_FASTCALL):

$ python -m timeit -s 'from gmpy2 import fac' -s 'a = fac(57)' 'a.to_bytes(32)'
1000000 loops, best of 3: 0.878 usec per loop
$ python -m timeit -s 'from math import factorial' -s 'a = factorial(57)' 'a.to_bytes(32, "big")'
1000000 loops, best of 3: 0.798 usec per loop