I've been trying to add support for Python 3.11 to multicorn2, and struggling with unexpected crashes. I believe I've identified the core of the problem.
When the Python function log_to_postgres is used with the level ERROR, as it is wisely used in some FDWs to report extended information about a problem... for example: https://github.com/pgsql-io/multicorn2/blob/b0a274c3aeec8341f93ec3f8328cad9f105ae4ee/python/multicorn/fsfdw/__init__.py#L286-L291

This executes errstart/errfinish with a level of ERROR, of course... https://github.com/pgsql-io/multicorn2/blob/b0a274c3aeec8341f93ec3f8328cad9f105ae4ee/src/utils.c#L84-L88

As a result, this begins invoking PostgreSQL's C-based exception handling code, which uses setjmp/longjmp to move execution to the current error handler block (PG_TRY()...PG_CATCH()...PG_END_TRY()). However, the Python interpreter is in the middle of executing a C function call, and its state is left corrupted because control never returns to it.
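For concreteness, here's a minimal sketch of that control flow (generic PostgreSQL error-handling code, not multicorn's actual source). ereport(ERROR, ...) expands to errstart()/errfinish(), and errfinish() at that level never returns; it longjmps to the sigsetjmp buffer installed by the nearest PG_TRY():

```c
#include "postgres.h"   /* ereport, PG_TRY/PG_CATCH/PG_END_TRY */

PG_TRY();
{
    /* If anything below this point reaches ereport(ERROR), control
     * jumps straight to PG_CATCH().  Only PostgreSQL's own state is
     * restored by the longjmp -- nothing tells the Python interpreter
     * that a C call it was executing never returned. */
    ereport(ERROR,
            (errcode(ERRCODE_INTERNAL_ERROR),
             errmsg("reported from inside a C call")));
    /* never reached */
}
PG_CATCH();
{
    /* Control lands here via longjmp.  Any interpreter frames between
     * PG_TRY() and the ereport were skipped, not unwound. */
    PG_RE_THROW();
}
PG_END_TRY();
```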
The specific interaction that is breaking right now is:

- calling insert() on an FDW inspects some data and decides to do a log_to_postgres with an error
- that causes the transaction to be aborted, causing multicorn_xact_callback to be invoked, which then attempts to call rollback on the FDW (sketched after this list)
- when running under valgrind, memory access errors occur at this point
- finally a segfault occurs during the errorCheck() that follows the rollback call, when it attempts to import the traceback module -- strongly indicating to me that the interpreter is just fully unusable
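To make that path concrete, here's a rough sketch of the shape of the abort path. The function and method names come from the description above; the body is my assumption of the shape, not multicorn's actual source:

```c
#include "postgres.h"
#include "access/xact.h"    /* XactEvent, XACT_EVENT_ABORT */
#include <Python.h>

extern PyObject *fdw_instance;   /* hypothetical: the cached FDW object */
extern void errorCheck(void);

static void
multicorn_xact_callback(XactEvent event, void *arg)
{
    if (event == XACT_EVENT_ABORT)
    {
        /* This re-enters the Python interpreter even though the earlier
         * longjmp abandoned it in the middle of a C call. */
        PyObject *ret = PyObject_CallMethod(fdw_instance, "rollback", NULL);

        errorCheck();       /* under 3.11, segfaults importing traceback */
        Py_XDECREF(ret);
    }
}
```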
Apparently in Python 3.10 and earlier the interpreter is able to cope with this, which is something of a minor miracle to me. But in 3.11, by the time we re-enter the interpreter after this abrupt exit, it doesn't know how to cope and it begins to access invalid memory, crashing shortly afterwards.
I think the right fix for this would be to:
- Change log_to_postgres so that ERROR and FATAL log levels are translated into Python exceptions, which are thrown back to the Python interpreter (e.g. PostgresLogException)
- In errorCheck(), check if the exception is a PostgresLogException, unwrap the details and log it, triggering Postgres' C exception handling only once we're no longer in the middle of a Python invocation (see the sketch after this list)
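A sketch of what I have in mind, in CPython C API terms. PostgresLogException, the entry-point signature, and the helper names are hypothetical; this is the shape of the idea, not a finished patch:

```c
#include "postgres.h"
#include <Python.h>

/* Hypothetical module-level exception object for the proposal above. */
static PyObject *PostgresLogException;

/* Sketch of the logging entry point (hypothetical signature).  For
 * ERROR and above we raise a Python exception instead of longjmp'ing
 * out from under the interpreter; lower levels can ereport()
 * immediately, since those calls return normally. */
static PyObject *
log_to_postgres_impl(PyObject *self, PyObject *args)
{
    int level;
    const char *message;

    if (!PyArg_ParseTuple(args, "is", &level, &message))
        return NULL;

    if (level >= ERROR)
    {
        PyErr_SetString(PostgresLogException, message);
        return NULL;    /* interpreter unwinds finally blocks, etc. */
    }
    ereport(level, (errmsg("%s", message)));
    Py_RETURN_NONE;
}

/* ...and then in errorCheck(), once control is back in C: */
static void
errorCheck_sketch(void)
{
    if (PyErr_Occurred() && PyErr_ExceptionMatches(PostgresLogException))
    {
        PyObject *type, *value, *tb;

        PyErr_Fetch(&type, &value, &tb);   /* take ownership, clear error */
        /* ... unwrap message/detail/hint from `value` here ... */
        Py_XDECREF(type); Py_XDECREF(value); Py_XDECREF(tb);

        /* No Python invocation is on the stack anymore, so it's now
         * safe to let PostgreSQL's longjmp-based handling take over. */
        ereport(ERROR, (errmsg("error reported from Python FDW")));
    }
}
```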
This approach would have the benefit of unwinding the Python stack -- for example, any finally blocks and context managers that are in use in Python would be exited cleanly. And I think it would leave the interpreter in a good, clean state for re-entry later.
I think that this would still leave a residual problem which is likely "OK"-ish: any other place that could trigger PostgreSQL exceptions is going to fail to perform Python reference counting correctly. For example, any invocation of errorCheck() in the code base today could trigger Postgres's exception handling, which will prevent the following Py_DECREF from being invoked. That could lead to memory leaks in long-running backends using multicorn, but I think it's not of critical importance.
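A minimal illustration of that leak pattern (the method name is arbitrary, not a specific multicorn call site):

```c
/* If errorCheck() raises a PostgreSQL error, it longjmps to the active
 * PG_CATCH() and exits this function entirely, so the Py_DECREF below
 * is simply never executed. */
PyObject *result = PyObject_CallMethod(fdw_instance, "commit", NULL);

errorCheck();        /* may longjmp out of this function */
Py_DECREF(result);   /* skipped on the error path: reference leak */
```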