python / cpython

The Python programming language
https://www.python.org
Other
62.88k stars 30.12k forks source link

C API: Cython 3.0 uses private functions removed in Python 3.13 (numpy 1.25.1 fails to build) #107076

Closed vstinner closed 9 months ago

vstinner commented 1 year ago

numpy 1.25.1 fails to build on Python 3.13: Cython uses private C API removed in Python 3.13. Example of build errors.

I open an issue since Python 3.13 does not provide obvious replacement for these removed functions.

(1) Cython uses removed _PyInterpreterState_GetConfig():

  static void __Pyx_init_assertions_enabled(void) {
    __pyx_assertions_enabled_flag = ! _PyInterpreterState_GetConfig(__Pyx_PyThreadState_Current->interp)->optimization_level;
  }

(2) Cython uses removed _PyVectorcall_Function() in __Pyx_PyObject_FastCallDict():

static CYTHON_INLINE PyObject* __Pyx_PyObject_FastCallDict(PyObject *func, PyObject **args, size_t _nargs, PyObject *kwargs) {
    ..
    #if CYTHON_VECTORCALL
    vectorcallfunc f = _PyVectorcall_Function(func);
    if (f) {
        return f(func, args, (size_t)nargs, kwargs);
    }
    ...
}

(3) Cython uses removed _PyUnicode_FastCopyCharacters() in __Pyx_PyUnicode_ConcatInPlaceImpl()

static CYTHON_INLINE PyObject *__Pyx_PyUnicode_ConcatInPlaceImpl(PyObject **p_left, PyObject *right
        #if CYTHON_REFNANNY
        , void* __pyx_refnanny
        #endif
    ) {
        ...
        // copy 'right' into the newly allocated area of 'left'
        _PyUnicode_FastCopyCharacters(*p_left, left_len, right, 0, right_len);
        return *p_left;
        ...
  }

But also in __Pyx_PyUnicode_Join():

static PyObject* __Pyx_PyUnicode_Join(PyObject* value_tuple, Py_ssize_t value_count, Py_ssize_t result_ulength,
                                      Py_UCS4 max_char) {
            ...
            #if CYTHON_COMPILING_IN_CPYTHON && PY_VERSION_HEX >= 0x030300F0 || defined(_PyUnicode_FastCopyCharacters)
            _PyUnicode_FastCopyCharacters(result_uval, char_pos, uval, 0, ulength);
            #else
            ...
}
vstinner commented 1 year ago

(1) Cython uses removed _PyInterpreterState_GetConfig()

In May 2022, I deprecated legacy global configuration variables like Py_VerboseFlag: issue #93103.

In November 2022, @scoder (one of Cython maintainers) created issue #99872 "Allow efficient read access to PyConfig options".

vstinner commented 1 year ago

(2) Cython uses removed _PyVectorcall_Function() in __Pyx_PyObject_FastCallDict()

The public PyVectorcall_Function() function is available since Python 3.8: https://docs.python.org/dev/c-api/call.html#c.PyVectorcall_Function

vstinner commented 1 year ago

(3) Cython uses removed _PyUnicode_FastCopyCharacters() in __Pyx_PyUnicode_ConcatInPlaceImpl()

The public PyUnicode_CopyCharacters() can be used instead, it's available since Python 3.3: https://docs.python.org/dev/c-api/unicode.html#c.PyUnicode_CopyCharacters

Is a fast unsafe replacement needed here? Where unsafe means: without error checking (use assertions instead).

scoder commented 1 year ago

PyUnicode_CopyCharacters Is a fast unsafe replacement needed here? Where unsafe means: without error checking (use assertions instead).

Well. It would be more fair in comparison, although that's barely a reason. For the case that characters are copied between different unicode string types a lot, especially for loads of short strings, this difference would probably be visible, i.e. Py3.13 would appear slower than Py3.12 here. But proving that would require a proper, realistic benchmark. I'll use the public function in Cython on Py3.13 for now and wait for an actual user to complain.

vstinner commented 1 year ago

It's unclear to me why Python exposes an API to modify immutable strings 😁 I'm considering adding a public API for _PyUnicodeWriter which may cover _PyUnicode_FastCopyCharacters() usecase.

I'm also open to expose _PyUnicode_FastCopyCharacters() in public with an appropriate documentation.

da-woods commented 1 year ago

(2) Cython uses removed _PyVectorcall_Function() in __Pyx_PyObject_FastCallDict()

The public PyVectorcall_Function() function is available since Python 3.8: https://docs.python.org/dev/c-api/call.html#c.PyVectorcall_Function

This isn't true - it's documented badly. https://github.com/python/cpython/pull/107140

I mention it because the misleading documentation caught us out when trying to replace it (and you out in the comment above)

da-woods commented 1 year ago

(Deleted previous comment because it wasn't quite right).

The __Pyx_PyUnicode_ConcatInPlaceImpl is used to optimize repeated addition to the same string. It's an anti-pattern that'd be better replaced with .join but it's an optimization CPython does internally too, and it does make a difference. I wouldn't worry too much about a performance degradation there.

The use case in .join is probably more significant given that's the recommended way to write string concatenations.

vstinner commented 1 year ago

I'm not asking Cython to use slower code path on purpose. I created this issue to discuss which private functions are used by Cython and see what can be done to provide better APIs for Cython.

vstinner commented 1 year ago

By the way, the main branch of Cython uses the following private _PyDict functions:

Maybe we should consider making them public, or Cython should avoid them (which can have a negative effect on performance). See PR #107145.

I kept these functions for now, to not make break Cython more :-)

scoder commented 1 year ago

@da-woods has collected a list of CPython internals that we use in https://github.com/cython/cython/issues/4635

All of these (minus bugs) are guarded by version checks or feature flags (like CYTHON_USE_PYLIST_INTERNALS).

I think the dict internals only use version checks, probably because the dict implementation tends to change every couple of releases. ~We might want to add a general feature flag for the dict internals as well, to make their access easy to disable for users.~ Looking a bit more through the implementation, we don't actually use "internals" in the sense of direct dict object struct access. We do use private C-API functions, though, in order to avoid performance disadvantages compared to CPython. For example, the _KnownHash functions are used when looking up interned strings for which the hash value has definitely been calculated at initialisation time and anything but a straight, unchecked lookup would just be useless overhead.

In any case, there is always generic fallback code that only relies on public C-API features, usually for other Python implementations than CPython, or for the Limited API compilation mode.

vstinner commented 1 year ago

We do use private C-API functions, though, in order to avoid performance disadvantages compared to CPython

I totally get it and I create this issue to discuss that: should we expose these private functions? If they help to make C extensions faster, maybe we should expose them so more C extensions can benefit from them.

serhiy-storchaka commented 1 year ago

For example, the _KnownHash functions are used when looking up interned strings for which the hash value has definitely been calculated at initialisation time and anything but a straight, unchecked lookup would just be useless overhead.

I afraid that it is a preliminary optimization, because every dict lookup function first at all checks whether the key is a string with the cached hash, and then does a straight lookup. The overhead is virtually just the comparison of two pointers.

_KnownHash functions are more useful in case of slow hashing function, for example in lru_cache() implementation.

vstinner commented 10 months ago

See also https://github.com/python/cpython/issues/89410 "[C API] Add explicit support for Cython to the C API".

vstinner commented 9 months ago

I mark this issue as duplicate of https://github.com/python/cpython/issues/89410

bje- commented 3 months ago

Similarly, nanosvg (bundled with wxPython) fails to compile with Python 3.13 because cython generates code that uses private Python API functions whose signatures have changed (namely, _PyLong_AsByteArray).

vstinner commented 3 months ago

Similarly, nanosvg (bundled with wxPython) fails to compile with Python 3.13 because cython generates code that uses private Python API functions whose signatures have changed (namely, _PyLong_AsByteArray).

I suggest to use the new APIs:

bje- commented 3 months ago

Similarly, nanosvg (bundled with wxPython) fails to compile with Python 3.13 because cython generates code that uses private Python API functions whose signatures have changed (namely, _PyLong_AsByteArray).

Newer versions of Cython fix the problem, but I agree that it would be better for Cython to not be using private API functions where possible.