Closed mgorny closed 10 months ago
Cc: @kumaraditya303
Reduced example, just in case:
import _asyncio
import _elementtree

async def _end_stream_wait():
    await "foo"

parser = _elementtree.XMLParser()
_asyncio.Task(_end_stream_wait())
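Since the reproducer is expected to crash the interpreter on affected builds, it can be convenient to run it in a child process and look at the exit status. This is a minimal sketch, not part of the original report: the helper name `run_reproducer` is hypothetical, and the interpretation of negative return codes assumes a POSIX system.

```python
import subprocess
import sys
import textwrap

# The reduced example from above, run in a separate interpreter so that a
# crash does not take down the calling process.
REPRODUCER = textwrap.dedent('''
    import _asyncio
    import _elementtree

    async def _end_stream_wait():
        await "foo"

    parser = _elementtree.XMLParser()
    _asyncio.Task(_end_stream_wait())
''')


def run_reproducer(python=sys.executable):
    """Run the reproducer in a child interpreter and return its exit status.

    `run_reproducer` is a hypothetical helper name used for illustration.
    """
    proc = subprocess.run(
        [python, "-c", REPRODUCER],
        capture_output=True,
        text=True,
    )
    return proc.returncode


if __name__ == "__main__":
    rc = run_reproducer()
    if rc < 0:
        # On POSIX, a negative value is the number of the terminating signal.
        print(f"child killed by signal {-rc}")
    else:
        print(f"child exited with status {rc}")
```

Passing a different interpreter path as `python` lets the same script be pointed at several builds, for example while bisecting.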
The problem is more complicated than it may seem at first glance.
First of all, the problem is unrelated to the isolation of _elementtree. In fact, here's the first bad commit: b652d40f1c.
What's actually happening, and why does this code segfault?
1. XMLParser is trying to deallocate itself: https://github.com/python/cpython/blob/931f4438c92ec0eb2aa894092a91749f1d5bd216/Modules/_elementtree.c#L3777-L3785
   Line 3784 (which actually means (st->expat_capi->ParserFree)(parser)) leads to a segfault if the second point (described below) has already executed.
2. pyexpat_capsule is trying to deallocate itself (st->expat_capi of _elementtree points at this data!).
Where's the problem? The order of execution of the two points described above is not strict. The second point can be executed before the first point and vice versa (the latter order is fine for us). When pyexpat_capsule is deallocated before XMLParser, the result is a segmentation fault.
cc @vstinner
The _elementtree extension should make sure that the pyexpat extension remains usable until it is done using it.
struct PyExpat_CAPI *expat_capi; of elementtreestate is just a pointer; it doesn't impact the lifetime of the capsule memory or the pyexpat extension.
Yes, that is what we should do. But I don't understand how we can make sure that the pyexpat extension remains usable until _elementtree is done using it.
Can we keep a strong reference to the pyexpat module? Is it enough?
There's no such API, but yes, it would be enough. An imported capsule doesn't store any information about the module it is bound to.
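The effect of keeping a strong reference can be illustrated with a small Python sketch. The names `Provider` and `ConsumerState` are hypothetical; they model the pyexpat module and the _elementtree state, not the real C structures, and the sketch only shows the reference-counting idea.

```python
import gc
import weakref


class Provider:
    """Hypothetical model of the module that owns the capsule data."""


class ConsumerState:
    """Hypothetical model of the consumer's module state."""

    def __init__(self, provider, keep_alive):
        # A weak reference plays the role of the raw expat_capi pointer:
        # it does not affect the provider's lifetime.
        self.capi = weakref.ref(provider)
        # Optionally hold a strong reference to the provider as well.
        self.owner = provider if keep_alive else None


def provider_survives(keep_alive):
    """Report whether the provider outlives its last outside reference."""
    provider = Provider()
    state = ConsumerState(provider, keep_alive)
    del provider   # the last outside reference goes away
    gc.collect()   # make the result independent of refcounting details
    return state.capi() is not None
```

With `keep_alive=False` the provider disappears while the consumer still holds its "pointer"; with `keep_alive=True` the provider stays alive for as long as the consumer state does.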
To be honest, I don't see any solution that doesn't involve changing some of the capsule implementation.
That's acceptable and has been done recently: see commit 513c89d0122ff6245cfefbb49ef32bde8a2336f5.
I've written a draft PR; please take a look at #112053.
That's a kinda hard issue :smile:
First of all, there are two segfaults. The first is described above (https://github.com/python/cpython/issues/111784#issuecomment-1798311825).
Now, let's talk about the second segfault. It's funny, but the code segfaults at the same place! I mean EXPAT(st, ParserFree)(parser).
Why does it happen?
1. The _elementtree module is being deallocated, so module_dealloc in moduleobject.c is called.
   There's one interesting thing:
       if (m->md_state != NULL)
           PyMem_Free(m->md_state);
   It "clears" the module state if it hasn't been freed previously. That's incorrect, because there is no guarantee that the module doesn't use its state to do some cleanup in its deallocators. OK, the state is cleared; let's move on.
2. elementtree_clear is called (or elementtree_free; it probably doesn't matter, because elementtree_free calls elementtree_clear).
3. Py_CLEAR(st->XMLParser_Type); leads to xmlparser_dealloc, which calls xmlparser_gc_clear.
4. xmlparser_gc_clear calls EXPAT(st, ParserFree)(parser) (reminder: this interacts with the state).
But... the state has already been freed in the first step (it now points to 0xDDDDDDDDDDDDDDDD, which means the pointer has already been freed)! So, there's a segfault.
I'll take the opportunity to add the release-blocker label here, because this can happen in real user code. Don't forget that this segfault is present in 3.12, so in my opinion we should fix it before the next 3.12 release.
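The numbered steps above can be traced with a toy Python model. All names here are hypothetical stand-ins for the C functions named in the list, and an exception stands in for the segfault; it is a sketch of the described ordering, not of the real module_dealloc.

```python
class ToyModuleState:
    """Hypothetical model of the per-module state (md_state)."""

    def __init__(self, parser_count):
        self.freed = False
        self.parser_count = parser_count

    def expat_parser_free(self):
        # Models EXPAT(st, ParserFree)(parser), which reads the state.
        if self.freed:
            raise RuntimeError("module state used after it was freed")
        return "parser freed"


def toy_module_dealloc(state, free_state_first):
    """Return a log of the steps; raise if freed state is touched."""
    log = []
    if free_state_first:
        state.freed = True  # step 1: PyMem_Free(m->md_state)
        log.append("state freed")
    # Steps 2-4: the clear hook drops the type and each parser's
    # deallocator calls back into the state.
    for _ in range(state.parser_count):
        log.append(state.expat_parser_free())
    if not free_state_first:
        state.freed = True  # state released only after its last user
        log.append("state freed")
    return log
```

Calling `toy_module_dealloc(ToyModuleState(1), free_state_first=True)` raises, matching the sequence in the list, whereas freeing the state last lets every parser finish its cleanup.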
That's really strange. IIRC, an extension module should never need to free its allocated state explicitly. That sounds like a serious bug to me; module state should be freed by the runtime. (I haven't looked at the code yet.) cc. @vstinner @encukou
I wrote a PR, please check #113405
It seems you misunderstood me. The extension module doesn't try to free its state. module_dealloc (I mean the tp_dealloc slot of the module object, located in moduleobject.c) is called and frees the state. Why is it called? It seems that there are no references to the module any longer, so the module is deallocated along with its state, which is wrong, because some instances of XMLParser still need this state.
Thanks @mgorny for the report!
Thanks a lot!
Crash report
What happened?
I've originally hit it while running slixmpp's test suite. I've been able to reduce it to the following program:
I've been able to reproduce this with 3.12.0, and the tips of 3.12 and main branches. I've been able to bisect it to the following commit:
Backtrace:
CC @kumaraditya303
CPython versions tested on:
3.12, CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.13.0a1+ (heads/main:ba8aa1fd37, Nov 6 2023, 16:26:31) [GCC 13.2.1 20231014]