aleaxit / gmpy

General Multi-Precision arithmetic for Python 2.6+/3+ (GMP, MPIR, MPFR, MPC)
https://gmpy2.readthedocs.io/en/latest/
GNU Lesser General Public License v3.0
521 stars 86 forks source link

Support PyPy? #523

Open skirpichev opened 1 month ago

skirpichev commented 1 month ago

Performance of the CPyExt going better. Maybe we should add support for PyPy and run basic tests on CI to be sure it's working?

Here is a quick benchmark for multiplication (python -m pyperf timeit -s 'from gmpy2 import fac,mpz;a=fac(mpz(100))' 'a*a'): Benchmark ref ctypes cffi gmpy
timeit 781 ns 212 us: 271.14x slower 3.08 us: 3.95x slower 3.81 us: 4.88x slower

On CPython - gmpy2 looks 3.8x faster. It seems using C-API is still slower than CFFI, but not too much.

I think we can do this without too much special cases (see a draft patch below):

  1. some defines are missing (not sure why they turned off for PyPy in pythoncapi_compat.h);
  2. we should special case mpz <-> int conversions;
  3. some C-API functions are missing (PyContextVar_Reset so far: https://github.com/pypy/pypy/issues/5081).

On a pros, PyPy people are working hard to make it's support for C-API "correct first". Maybe it helps us catch some bugs. E.g. now without patching the gmpy2_cache.c I got for above benchmark:

RPython traceback:
  File "pypy_module_pypyjit.c", line 945, in portal_11
  File "pypy_interpreter_2.c", line 52468, in handle_bytecode__AccessDirect_None
  File "pypy_interpreter_3.c", line 33415, in dispatch_bytecode__AccessDirect_None
*** Invalid usage of a dying CPython object ***

cpyext, the emulation layer, detected that while it is calling
an object's tp_dealloc, the C code calls back a function that
tries to recreate the PyPy version of the object.  Usually it
means that tp_dealloc calls some general PyXxx() API.  It is
a dangerous and potentially buggy thing to do: even in CPython
the PyXxx() function could, in theory, cause a reference to the
object to be taken and stored somewhere, for an amount of time
exceeding tp_dealloc itself.  Afterwards, the object will be
freed, making that reference point to garbage.
>>> PyPy could contain some workaround to still work if
you are lucky, but it is not done so far; better fix the bug in
the CPython extension.
>>> This object is of type 'gmpy2.mpz'
Error when running timeit benchmark:

Statement:
'a*a'

Is this ok to manage cache in the tp_dealloc as we do?

simple diff to compile and get some things working: ```diff diff --git a/src/gmpy2.h b/src/gmpy2.h index 9195eeb5..4f862e20 100644 --- a/src/gmpy2.h +++ b/src/gmpy2.h @@ -71,6 +71,18 @@ extern "C" { # error "GMPY2 requires Python 3.8 or later." #endif +#ifdef PYPY_VERSION +# if SIZEOF_VOID_P >= 8 +# define PyHASH_BITS 61 +# else +# define PyHASH_BITS 31 +# endif +# define PyHASH_MULTIPLIER 1000003UL /* 0xf4243 */ +# define PyHASH_MODULUS (((size_t)1 << _PyHASH_BITS) - 1) +# define PyHASH_INF 314159 +# define PyHASH_IMAG PyHASH_MULTIPLIER +#endif + /* Include headers for GMP, MPFR, and MPC. */ #include diff --git a/src/gmpy2_cache.c b/src/gmpy2_cache.c index 93f1f459..3aabb338 100644 --- a/src/gmpy2_cache.c +++ b/src/gmpy2_cache.c @@ -179,12 +179,12 @@ GMPy_MPZ_NewInit(PyTypeObject *type, PyObject *args, PyObject *keywds) static void GMPy_MPZ_Dealloc(MPZ_Object *self) { - if (global.in_gmpympzcache < CACHE_SIZE && +/* if (global.in_gmpympzcache < CACHE_SIZE && self->z->_mp_alloc <= MAX_CACHE_MPZ_LIMBS) { global.gmpympzcache[(global.in_gmpympzcache)++] = self; } - else { + else */{ mpz_clear(self->z); PyObject_Free(self); } diff --git a/src/gmpy2_context.c b/src/gmpy2_context.c index fecd79af..e2912e2c 100644 --- a/src/gmpy2_context.c +++ b/src/gmpy2_context.c @@ -184,6 +184,7 @@ GMPy_CTXT_Enter(PyObject *self, PyObject *args) static PyObject * GMPy_CTXT_Exit(PyObject *self, PyObject *args) { +#ifndef PYPY_VERSION CTXT_Object *ctx = (CTXT_Object*)self; int res = PyContextVar_Reset(current_context_var, ctx->token); Py_DECREF(ctx->token); @@ -191,6 +192,7 @@ GMPy_CTXT_Exit(PyObject *self, PyObject *args) SYSTEM_ERROR("Unexpected failure in restoring context."); return NULL; } +#endif Py_RETURN_NONE; } #endif diff --git a/src/gmpy2_convert_gmp.c b/src/gmpy2_convert_gmp.c index 63a18e08..4fd4705f 100644 --- a/src/gmpy2_convert_gmp.c +++ b/src/gmpy2_convert_gmp.c @@ -42,6 +42,7 @@ static void mpz_set_PyLong(mpz_t z, PyObject *obj) { +#ifndef PYPY_VERSION int negative; Py_ssize_t len; PyLongObject *templong = (PyLongObject*)obj; @@ -66,6 +67,7 @@ mpz_set_PyLong(mpz_t z, PyObject *obj) mpz_neg(z, z); } return; +#endif } static MPZ_Object * @@ -131,7 +133,7 @@ GMPy_PyLong_From_MPZ(MPZ_Object *obj, CTXT_Object *context) if (mpz_fits_slong_p(obj->z)) { return PyLong_FromLong(mpz_get_si(obj->z)); } - +#ifndef PYPY_VERSION /* Assume gmp uses limbs as least as large as the builtin longs do */ size_t count, size = (mpz_sizeinbase(obj->z, 2) + @@ -151,8 +153,10 @@ GMPy_PyLong_From_MPZ(MPZ_Object *obj, CTXT_Object *context) GET_OB_DIGIT(result)[i] = 0; } _PyLong_SetSignAndDigitCount(result, mpz_sgn(obj->z) < 0, count); - return (PyObject*)result; +#else + Py_RETURN_NONE; +#endif } static PyObject * ```

PS: ctypes/cffi toy implementations

casevh commented 1 month ago

I agree with supporting PyPy. I also agree with their analysis of gmpy2_cache.c. The fix looks fine.

skirpichev commented 1 month ago

On Tue, Oct 15, 2024 at 11:07:49PM -0700, casevh wrote:

The fix looks fine.

Cerntaily no. I think this kill mpz caching.

casevh commented 1 month ago

Maybe I shouldn't review code when I'm exhausted. 😩

On Tue, Oct 15, 2024 at 11:19 PM Sergey B Kirpichev < @.***> wrote:

On Tue, Oct 15, 2024 at 11:07:49PM -0700, casevh wrote:

The fix looks fine.

Cerntaily no. I think this kill mpz caching.

— Reply to this email directly, view it on GitHub https://github.com/aleaxit/gmpy/issues/523#issuecomment-2415827653, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMR23ZDRRMMZZ4E4IUGMODZ3YANJAVCNFSM6AAAAABQASNIROVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMJVHAZDONRVGM . You are receiving this because you commented.Message ID: @.***>