python / cpython

The Python programming language
https://www.python.org
Other
62.84k stars 30.1k forks source link

_PyErr_ChainExceptions1 no longer avallable to extension modules in Python 3.13 alpha headers #116809

Open rogerbinns opened 6 months ago

rogerbinns commented 6 months ago

Bug report

Bug description:

The API _PyErr_ChainExceptions1 has been removed from public headers as part of this change in Python 3.13 alpha. It is the only C API available to chain exceptions replacing _PyErr_ChainExceptions. That change was publicly documented in the 3.12 whatsnew.

In my extemsion module I use _PyErr_ChainExceptions for exception chaining for Python <= 3.11 from PEP 490 and _PyErr_ChainExceptions1for 3.12+.

Please provide an approved way for C extensions to continue to chain exceptions. The function is still present and I can duplicate the prototype for all to continue working, but that is not a desirable solution.

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Linked PRs

sobolevn commented 6 months ago

CC @vstinner

vstinner commented 6 months ago

cc @iritkatriel

iritkatriel commented 6 months ago

I wouldn't just make this function public without thinking through its design. It's not a very common operation so we need think what its semantics and use case are.

vstinner commented 6 months ago

Maybe right now, we can just revert _PyErr_ChainExceptions1. Not sure about _PyErr_ChainExceptions.

iritkatriel commented 6 months ago

Maybe right now, we can just revert _PyErr_ChainExceptions1. Not sure about _PyErr_ChainExceptions.

I agree.

ronaldoussoren commented 6 months ago

_PyErr_ChainExceptions1 is also reproduced using public API. The actual chaining is done using PyException_SetContext, and there's also API for fetching the current exception (PyErr_GetRaisedException in 3.12 and PyErr_Fetch before that).

vstinner commented 6 months ago

First PR: issue gh-116900 restores the private function _PyErr_ChainExceptions1().

IMO it's on time to add a public function _PyErr_ChainExceptions().

In Python, there are around 33 calls to this function.

In PyPI top 8,000 projects (at 2023-11-15), I found 2 projects using _PyErr_ChainExceptions1(). And well, the reporter of this issue also uses it :-)

But if I look for _PyErr_ChainExceptions(), I found 6 projects using it in PyPI top 8,000:

Then if I look for PyException_SetContext(), the number of projects goes up to 19 projects using it in PyPI top 8,000:

@ronaldoussoren:

_PyErr_ChainExceptions1 is also reproduced using public API. The actual chaining is done using PyException_SetContext, and there's also API for fetching the current exception (PyErr_GetRaisedException in 3.12 and PyErr_Fetch before that).

In that case, I can easily emulate this function in the pythoncapi-compat project to provide a new public function to old Python versions!

iritkatriel commented 6 months ago

IMO it's on time to add a public function _PyErr_ChainExceptions().

I'm not sure.

I think @ronaldoussoren 's point was that it's not fundamental (you can implement it with the current API).

My view is that it's not necessarily a common enough operation that we need to provide in the API.

rogerbinns commented 6 months ago

As long as the documentation says how to chain exceptions I'd be happy. Currently it doesn't tell you. A general web search only turns up PEP490 mentioning _PyErr_ChainExceptions. Using Google's Gemini says to call both PyException_SetContext and PyException_SetCause. Chatgpt gives nonsense.

The nice thing about an API with ChainExceptions in the name is that it is very clear at a semantic level what the developer is trying to achieve.

vstinner commented 6 months ago

I think @ronaldoussoren 's point was that it's not fundamental (you can implement it with the current API).

It's about having a convenient API for this operation:

/* Like PyErr_SetRaisedException(), but if an exception is already set,
   set the context associated with it.

   The caller is responsible for ensuring that this call won't create
   any cycles in the exception context chain. */
void
_PyErr_ChainExceptions1(PyObject *exc)
{
    if (exc == NULL) {
        return;
    }
    PyThreadState *tstate = _PyThreadState_GET();
    if (_PyErr_Occurred(tstate)) {
        PyObject *exc2 = _PyErr_GetRaisedException(tstate);
        PyException_SetContext(exc2, exc);
        _PyErr_SetRaisedException(tstate, exc2);
    }
    else {
        _PyErr_SetRaisedException(tstate, exc);
    }
}

Having to copy/paste this function in each project doesn't sound great. And I would prefer that users don't consume private APIs.

iritkatriel commented 6 months ago

It's about having a convenient API for this operation

Why should there be a convenient API for this operation? What is this operation? What is the use case for it? How common is it? How would you document it as a public API?

The name of this function is not right - I could expect ChainExceptions to take two exceptions and chain them. So how would you name this operation?

I would prefer that users don't consume private APIs.

There are public versions of the private functions you call in this implementation.

rogerbinns commented 6 months ago

I could expect ChainExceptions to take two exceptions and chain them

It does in the sense that one is the supplied parameter, and the other is the already existing raised exception.

The context in which I need to chain exceptions is having a C level callback from a third party library and on entry there is already a raised exception. I stash that one away (PyErr_Fetch), do the callback work including calling back into CPython, and on completion that too has raised an exception. So now there are two exceptions and the need to chain them.

ronaldoussoren commented 6 months ago

IMO it's on time to add a public function _PyErr_ChainExceptions().

I'm not sure.

I think @ronaldoussoren 's point was that it's not fundamental (you can implement it with the current API).

Indeed. The function is easily implemented using public API if you need it.

My view is that it's not necessarily a common enough operation that we need to provide in the API.

In the few cases where I use chaining in a C extension I'd prefer a variant of PyErr_Format and/or PyErr_FromString with a flag that chains the new exception to the currently active one.

iritkatriel commented 6 months ago

In the few cases where I use chaining in a C extension I'd prefer a variant of PyErr_Format and/or PyErr_FromString with a flag that chains the new exception to the currently active one.

I do wonder why they don't always do that.

But in the use case described by @rogerbinns you would need to call the new versions of these function in the called code, rather than in the calling code where the chaining occurs (and you already have the exception object you want to chain).

vstinner commented 6 months ago

I do wonder why they don't always do that.

Well, my PEP 490 "Chain exceptions at C level" got rejected (in 2015) :-)

ronaldoussoren commented 6 months ago

I think @ronaldoussoren 's point was that it's not fundamental (you can implement it with the current API).

It's about having a convenient API for this operation:

/* Like PyErr_SetRaisedException(), but if an exception is already set,
   set the context associated with it.

   The caller is responsible for ensuring that this call won't create
   any cycles in the exception context chain. */
void
_PyErr_ChainExceptions1(PyObject *exc)
{
    if (exc == NULL) {
        return;
    }
    PyThreadState *tstate = _PyThreadState_GET();
    if (_PyErr_Occurred(tstate)) {
        PyObject *exc2 = _PyErr_GetRaisedException(tstate);
        PyException_SetContext(exc2, exc);
        _PyErr_SetRaisedException(tstate, exc2);
    }
    else {
        _PyErr_SetRaisedException(tstate, exc);
    }
}

Having to copy/paste this function in each project doesn't sound great. And I would prefer that users don't consume private APIs.

About the last bit: The implementation in CPython uses private APIs, but you can implement this without using private APIs (basically all of the consumed private are public APIs in 3.12 when you drop the leading underscore and the state argument).

As mentioned in my previous message the shape of this API doesn't cleanly match how I currently set the context or cause in my own code. I always create a new exception that I'd like to chain to the currently active exception (if any).

That is (ignoring reference count updates for clarity):

    PyObject* current_exception = PyErr_GetRaisedException();

    PyErr_Format(...);
    PyObject* new_exception = PyErr_GetRaisedException();

    PyException_SetContext(new_exception, current_exception);
    PyErr_SetRaisedException(new_exception);

This code isn't simplified by using _PyErr_ChainExceptions1. New versions of PyErr_Format and PyErr_FromString that can set the cause or context from the currently active exception would be much more convenient (and I'd use chaining more than I do now).

@rogerbinns, what's your use case for chaining? Do you want to chain to a pre-existing exception, or does your use case match mine?

vstinner commented 6 months ago

In 2021, @pablogsal asked to expose _PyErr_ChainExceptions() in the stable ABI (emphasis is mine) in gh-89101:

Currently, exception chaining in the C-API must be done by hand. This is a painstaking process that involves several steps and is very easy to do incorrectly.

We currently have a private function: _PyErr_ChainExceptions that does this job. Given that exception chaining is a very fundamental operation, this functionality must be exposed in the stable C-API

rogerbinns commented 6 months ago

@ronaldoussoren my use case is dealing with a sequence of callbacks from a C library that I then pass on back to Python code. I don't actually create any of the exceptions.

In pseudo-code:

void callback(param1, param2, param3)
{
  PyObject *pending_exception = NULL;
  PyGILState_STATE gilstate = PyGILState_Ensure();

  if(PyErr_Occurred())
    pending_exception = PyErr_GetRaisedException();

  .. do lots of work calling PyObject_GetAttr, making tuples, ints and
    str, eventually PyObject_Call.  On error goto finally;

finally:
  if(PyErr_Occurred() && pending_exception)
       _PyErr_ChainExceptions1(pending_exception);

  PyGILState_Release(gilstate);
}
rogerbinns commented 6 months ago

New versions of PyErr_Format and PyErr_FromString that can set the cause or context from the currently active exception would be much more convenient

It is important to note that will not work in the general case. The fundamental problem is that many of the CPython APIs behave differently if there is already a pending exception when they are called, and in some cases clear the pending exception. Usually the problem is their implementation can't tell the difference between an exception being present because they caused it as a failure case, or if it was already there.

For manual or automatic chaining to work, every CPython API needs to be audited and updated to ensure that it behaves as expected if there is a pending exception on entry. I don't envy trying to update _Py_CheckFunctionResult!

rogerbinns commented 6 months ago

@ronaldoussoren I think it may also be the case that you don't really want to chain an exception, but rather want to add a note to the existing exception PEP678 style. There isn't a CPython API for that either. I have my own implementation which is reproduced in edited form below. In order to add the note, you have to create the string, and call the add_note method on the exception object. That could of course fail, so this function also needs to be able to chain exceptions!

static void PyErr_AddExceptionNote(const char *format, ...)
{
  va_list fmt_args;
  va_start(fmt_args, format);

  PyObject *pending_exception = PyErr_GetRaisedException();
  if(!pending_exception)
  {
    PyErr_BadInternalCall();
    goto finally;
  }

  PyObject *message = PyUnicode_FromFormatV(format, fmt_args);
  if(!message)
  {
    _PyErr_ChainExceptions1(pending_exception);
    goto finally;
  }

  PyObject *vargs[] = {NULL, pending_exception, message};
  /* it is assumed that ADD_NOTE is an already created PyUnicode of "add_note" */
  PyObject *call_res = PyObject_VectorcallMethod(ADD_NOTE, vargs + 1, 2 | PY_VECTORCALL_ARGUMENTS_OFFSET, NULL);
  if(!call_res)
    _PyErr_ChainExceptions1(pending_exception);
  Py_XDECREF(call_res);
  Py_DECREF(message);

finally:
  PyErr_SetRaisedException(pending_exception);
  va_end(fmt_args);
}
iritkatriel commented 6 months ago

I think if the current exception already has a context, then _PyErr_ChainExceptions1() overwrites it. We need to think this through too.