cython / cython

The most widely used Python to C compiler
https://cython.org
Apache License 2.0
9.47k stars 1.49k forks source link

[BUG] new warning `__sync_fetch_and_sub_4` overflows destination #5937

Open rgommers opened 9 months ago

rgommers commented 9 months ago

Describe the bug

This warning is new in Cython 3 I believe:

[213/220] Compiling C object scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/meson-generated__ppoly.c.o
In function '__Pyx_XCLEAR_MEMVIEW',
    inlined from '__pyx_pf_5scipy_11interpolate_6_ppoly_20evaluate_nd' at scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:22469:5,
    inlined from '__pyx_fuse_0__pyx_pw_5scipy_11interpolate_6_ppoly_21evaluate_nd' at scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:22135:13:
scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:1577:46: warning: '__sync_fetch_and_sub_4' writing 4 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
 1577 |     #define __pyx_atomic_decr_aligned(value) __sync_fetch_and_sub(value, 1)
      |                                              ^~~~~~~~~~~~~~~~~~~~
scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:1604:13: note: in expansion of macro '__pyx_atomic_decr_aligned'
 1604 |             __pyx_atomic_decr_aligned(__pyx_get_slice_count_pointer(memview))
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~
scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:46077:29: note: in expansion of macro '__pyx_sub_acquisition_count'
46077 |     old_acquisition_count = __pyx_sub_acquisition_count(memview);
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/rgommers/mambaforge/envs/scipy-dev/include/python3.10/Python.h:74,
                 from scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:16:
/home/rgommers/mambaforge/envs/scipy-dev/include/python3.10/object.h: In function '__pyx_fuse_0__pyx_pw_5scipy_11interpolate_6_ppoly_21evaluate_nd':
/home/rgommers/mambaforge/envs/scipy-dev/include/python3.10/object.h:605:22: note: at offset 56 into destination object '_Py_NoneStruct' of size 16
  605 | PyAPI_DATA(PyObject) _Py_NoneStruct; /* Don't use this directly */
      |                      ^~~~~~~~~~~~~~
In function '__Pyx_XCLEAR_MEMVIEW',
    inlined from '__pyx_pf_5scipy_11interpolate_6_ppoly_22evaluate_nd' at scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:23624:5,
    inlined from '__pyx_fuse_1__pyx_pw_5scipy_11interpolate_6_ppoly_23evaluate_nd' at scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:23290:13:
scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:1577:46: warning: '__sync_fetch_and_sub_4' writing 4 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
 1577 |     #define __pyx_atomic_decr_aligned(value) __sync_fetch_and_sub(value, 1)
      |                                              ^~~~~~~~~~~~~~~~~~~~
scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:1604:13: note: in expansion of macro '__pyx_atomic_decr_aligned'
 1604 |             __pyx_atomic_decr_aligned(__pyx_get_slice_count_pointer(memview))
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~
scipy/interpolate/_ppoly.cpython-310-x86_64-linux-gnu.so.p/_ppoly.c:46077:29: note: in expansion of macro '__pyx_sub_acquisition_count'
46077 |     old_acquisition_count = __pyx_sub_acquisition_count(memview);
      |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/rgommers/mambaforge/envs/scipy-dev/include/python3.10/object.h: In function '__pyx_fuse_1__pyx_pw_5scipy_11interpolate_6_ppoly_23evaluate_nd':
/home/rgommers/mambaforge/envs/scipy-dev/include/python3.10/object.h:605:22: note: at offset 56 into destination object '_Py_NoneStruct' of size 16
  605 | PyAPI_DATA(PyObject) _Py_NoneStruct; /* Don't use this directly */

Code to reproduce the behaviour:

I'm seeing this when building SciPy with Python 3.10 and GCC 12.2.0. The warning is triggered here: https://github.com/scipy/scipy/blob/2d6be1c71322c33a7151f1620d633711715b6a7b/scipy/interpolate/_ppoly.pyx#L174

That code seems to come down to:

def evaluate_nd(..., tuple xs, ...):
    cdef double[::1] y
    cdef int ip, ndim

    ndim = len(xs)
    for ip in range(ndim-1, -1, -1):
        y = xs[ip]

And then calling that with an xs input which is a tuple of ndarrays.

Perhaps it's clear to someone what is happening here without having to create a full standalone reproducer.

Expected behaviour

No build warning.

OS

Linux

Python version

3.10.10

Cython version

3.0.7 (also present in 3.0.4)

Additional context

No response

da-woods commented 9 months ago

I can reproduce that warning (although testing it on scipy's code is the first time I've seen it).

I think it's a false positive. The internal code in question:

static CYTHON_INLINE void __Pyx_XCLEAR_MEMVIEW(__Pyx_memviewslice *memslice,
                                             int have_gil, int lineno) {
    __pyx_nonatomic_int_type old_acquisition_count;
    struct __pyx_memoryview_obj *memview = memslice->memview;
    if (unlikely(!memview || (PyObject *) memview == Py_None)) {
        memslice->memview = NULL;
        return;
    }
    old_acquisition_count = __pyx_sub_acquisition_count(memview);

So I think it's detected that memslice->memview points to a _Py_NoneStruct object, and hasn't realized that None is a singleton so the memview == Py_None prevents __pyx_sub_acquisition_count from ever being run in that case.

In Cython <3 the reference counting was slightly more indirect the compiler probably couldn't work as much out.

If I comment out y = None at the end of the loop then the warning goes away.

It'd definitely be good if Cython could avoid generating this warning. We could possibly do so just by not inlining __Pyx_XCLEAR_MEMVIEW but that might have its own performance costs (or gains...). I can't think of too many other things we could do.

If we accept my belief that it's a false positive, then I think Scipy can safely ignore it for now. I just haven't quite convinced myself I'm completely sure I'm right yet.

da-woods commented 9 months ago

We could possibly do so just by not inlining __Pyx_XCLEAR_MEMVIEW but that might have its own performance costs (or gains...). I can't think of too many other things we could do.

Confirmed that this does make the warning go away. Still not sure if we want to do it though.

rgommers commented 9 months ago

Thanks for having a look @da-woods.

If I comment out y = None at the end of the loop then the warning goes away.

That y = None line doesn't seem to do much, it was probably just added as a safety measure. But looks like it can be deleted. So no warnings from Cython is always good, but the decision here shouldn't matter to SciPy.

rgommers commented 9 months ago

Hmm, not 100% of that "can be deleted". The code generated for it is:

    __pyx_t_8 = __Pyx_PyObject_to_MemoryviewSlice_dc_double(Py_None, PyBUF_WRITABLE); if (unlikely(!__pyx_t_8.memview)) __PYX_ERR(0, 184, __pyx_L1_error)
    __PYX_XCLEAR_MEMVIEW(&__pyx_v_y, 1);
    __pyx_v_y = __pyx_t_8;
    __pyx_t_8.memview = NULL;
    __pyx_t_8.data = NULL;
  }

Which may be needed for bookkeeping. However, for the y = xs[ip] line higher up in the for-loop we see the same pattern:

 *         y = xs[ip]             # <<<<<<<<<<<<<<
    if (unlikely(__pyx_v_xs == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      __PYX_ERR(0, 174, __pyx_L1_error)
    }
    __pyx_t_8 = __Pyx_PyObject_to_MemoryviewSlice_dc_double(PyTuple_GET_ITEM(__pyx_v_xs, __pyx_v_ip), PyBUF_WRITABLE); if (unlikely(!__pyx_t_8.memview)) __PYX_ERR(0, 174, __pyx_L1_error)
    __PYX_XCLEAR_MEMVIEW(&__pyx_v_y, 1);
    __pyx_v_y = __pyx_t_8;
    __pyx_t_8.memview = NULL;
    __pyx_t_8.data = NULL;

It's not clear to me if this line makes any difference or Cython gives any kind of guarantee for this code:

        # grab array pointers
        nxx[ip] = y.shape[0]
        xx[ip] = <double*>&y[0]
        y = None
da-woods commented 9 months ago

I don't think it makes any difference - the assignment further up the loop takes care of the reference counting either way. Unless you're doing if y is None or possibly if y then I think it should all be largely the same.

I don't see why it should make any difference to the "grab array pointers" code either.