lemire / fastrand

Fast random number generation in an interval in Python: Up to 10x faster than random.randint.
Apache License 2.0
88 stars 12 forks source link

pcg32bounded argument check #4

Closed achan001 closed 6 years ago

achan001 commented 6 years ago

To match pcg32bounded() doc-string (below), I propose to patch argument check. Sorry, this should be a pull-request, but I don't know how to use it

generate random integer in the interval [0,range) using PCG.

Since pcg32bounded only allowed 1 argument, the code can be optimized without overhead of parsing tuple. Just go straight for the number.

This is the full patch:

56,58c56,58
<      int n;
<     if (!PyArg_ParseTuple(args, "i", &n))
<         return NULL;
---
>     int n = PyLong_AsLong(args);   // 1 argument, ok for Python 2 or 3
>     if (n <= 0) return PyErr_Occurred() ? NULL : (Py_INCREF(Py_None), Py_None);
>
98c98
<      {"pcg32bounded", pcg32bounded, METH_VARARGS, "generate random integer in
the interval [0,range) using PCG."},
---
>      {"pcg32bounded", pcg32bounded, METH_O, "generate random integer in the in
terval [0,range) using PCG."},

With the patch, this is benchmark on my laptop:

> python -m timeit -s "import fastrand" "fastrand.pcg32bounded(1001)"
1000000 loops, best of 3: 0.208 usec per loop   -- original
10000000 loops, best of 3: 0.144 usec per loop  -- patched

Error messages now as expected:

>>> from fastrand import *
>>> print pcg32bounded(1001)
82
>>> print pcg32bounded(0)       
None                          # should not return anything
>>> print pcg32bounded(2**31-1), pcg32bounded(-2**31)
20852737 None                 # max input limits
>>> print pcg32bounded(2**31) # overflow 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int
>>> print pcg32bounded('Hi')  # bad integer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required
>>>
lemire commented 6 years ago

I implemented the change, and it is faster...

Before...

python -m timeit -s 'import fastrand' 'fastrand.pcg32bounded(1001)'
10000000 loops, best of 3: 0.0924 usec per loop

After...

python -m timeit -s 'import fastrand' 'fastrand.pcg32bounded(1001)'
10000000 loops, best of 3: 0.0602 usec per loop

My code differs slightly from yours, but should be equivalent.

Thanks.

achan001 commented 6 years ago

You welcome.

Some bad output and error messages with your patch.

Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> from fastrand import *        # PyLong_AsUnsignedLong PATCH
>>> print pcg32bounded(0)         # should not return anything
0
>>> pcg32bounded(-1)              # error message mysteriously delayed
41705474
>>>
OverflowError: can't convert negative value to unsigned long
>>>
>>> pcg32bounded('hi')            # Python 2 SystemError bug ?
-888685582
>>>
SystemError: ..\Objects\longobject.c:336: bad argument to internal function
>>>

My PyLong_AsLong patch may be better. It handled all the error message as expected. (and it match the description of pcg32bounded doc-string)

lemire commented 6 years ago

@achan001

Can you verify your exact patch on modern Pythons? Python 2.7 and 3.6?

I do not get at all what you are getting with your patch. And I don't think it makes sense to install an old Python.

achan001 commented 6 years ago

My patch verified OK with Python latest 3.6.5 (Windows x86 executive installer)

Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 16:07:46) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from fastrand import *
>>> print(pcg32bounded(1001))
82
>>> print(pcg32bounded(0))
None
>>> print(pcg32bounded(2**31-1), pcg32bounded(-2**31))
20852737 None
>>> print(pcg32bounded(2**31))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long
>>>
>>> print(pcg32bounded('Hi'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required (got type str)

Perhaps you are using 64-bits system ? if (n <= 0) test may need some adjustment ...

What unexpected output are you getting ?

lemire commented 6 years ago

I merged your code (see GitHub). I used to get an error when inputting a negative value, now I just get nothing. It does not sound like progress to me.

See:

$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
Type "help", "copyright", "credits" or "license" for more information.
>>> import fastrand
>>> fastrand.pcg32bounded(-1)
>>> fastrand.pcg32bounded(0)
lemire commented 6 years ago

I would much rather get a SystemError than nothing when a negative value is used.

achan001 commented 6 years ago

You mean like this ? (I picked ValueError, for SystemError, use PyExc_SystemError)

>>> import fastrand
>>> fastrand.pcg32bounded(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such random number exist
>>>
>>> fastrand.pcg32bounded(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such random number exist
>>>

This is the patch:

#if PY_MAJOR_VERSION >= 3
#define PyInt_AsLong(x)   PyLong_AsLong(x)
#endif

static PyObject*
pcg32bounded(PyObject* self, PyObject* args)
{
    int n = PyInt_AsLong(args);
    if (n > 0)
      return Py_BuildValue("i", pcg32_random_bounded_divisionless(n));
    if (!PyErr_Occurred())
      PyErr_SetString(PyExc_ValueError, "no such random number exist");
    return NULL;
}
achan001 commented 6 years ago

You are correct. Raising an exception is better than return None, warning problem immediately.

This is python gmpy2 behavior for non-exist values.

>>> gmpy2.invert(101, 12345)
mpz(9656)
>>>
>>> gmpy2.invert(105, 12345)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: invert() no inverse exists