python / cpython

The Python programming language
https://www.python.org
Other
63.39k stars 30.36k forks source link

Add _PyEval_SetTrace(tstate, func, arg) function #79551

Open a5ae9dfd-b19f-4ed7-b289-70236f9e865e opened 5 years ago

a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 5 years ago
BPO 35370
Nosy @vstinner, @fabioz
PRs
  • python/cpython#18975
  • python/cpython#18977
  • python/cpython#19029
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['interpreter-core', 'type-feature', '3.9'] title = 'Add _PyEval_SetTrace(tstate, func, arg) function' updated_at = user = 'https://github.com/fabioz' ``` bugs.python.org fields: ```python activity = actor = 'fabioz' assignee = 'none' closed = False closed_date = None closer = None components = ['Interpreter Core'] creation = creator = 'fabioz' dependencies = [] files = [] hgrepos = [] issue_num = 35370 keywords = ['patch'] message_count = 21.0 messages = ['330849', '364081', '364082', '364085', '364086', '364087', '364088', '364092', '364093', '364097', '364098', '364100', '364101', '364102', '364109', '364337', '364341', '364342', '364344', '364350', '364353'] nosy_count = 2.0 nosy_names = ['vstinner', 'fabioz'] pr_nums = ['18975', '18977', '19029'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue35370' versions = ['Python 3.9'] ```

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 5 years ago

    Right now it's hard for debuggers to set the tracing function to be used for running threads.

    This would be really handy for debuggers when attaching to a running program to debug all threads.

    -- Note: currently there is a way to achieve that by pausing all the threads then selectively switching to a thread to make it current and setting the tracing function using the C-API (see: https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd_attach_to_process/dll/attach.cpp#L1224), but I believe this is very hacky and not portable to other Python implementations.

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    As a note, the workaround is now in https://github.com/fabioz/PyDev.Debugger/blob/pydev_debugger_1_9_0/pydevd_attach_to_process/common/py_settrace_37.hpp#L150

    vstinner commented 4 years ago

    Can't you use PyEval_SetTrace()? https://docs.python.org/dev/c-api/init.html#c.PyEval_SetTrace

    vstinner commented 4 years ago

    Note: currently there is a way to achieve that by pausing all the threads then selectively switching to a thread to make it current and setting the tracing function using the C-API (see: https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd_attach_to_process/dll/attach.cpp#L1224), but I believe this is very hacky and not portable to other Python implementations.

    I'm not sure that I understand your need. Do you need a variant of PyEval_SetTrace() which accepts a tstate argument?

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    I'd like to, but it only sets the tracing to the currently running thread (the request is for setting the tracing for other threads).

    Just for info, the problem I'm solving is that the debugger is multi-threaded, but the user can start the code without any tracing in place, and then, when it gets to an attach to process or a programmatic attach to the debugger, the debugger needs to set the tracing to threads that are already running (and any new thread created afterward) and Python doesn't really have any structure for that in place (so, I'm using the C-API as a workaround to do what PyEval_SetTrace does but targeting any thread).

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    > Note: currently there is a way to achieve that by pausing all the threads then selectively switching to a thread to make it current and setting the tracing function using the C-API (see: https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd_attach_to_process/dll/attach.cpp#L1224), but I believe this is very hacky and not portable to other Python implementations.

    I'm not sure that I understand your need. Do you need a variant of PyEval_SetTrace() which accepts a tstate argument?

    Yes

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    Maybe better would be the thread id so that the tstate structure is not needed there.

    vstinner commented 4 years ago

    Attached PR 18975 adds _PyEval_SetTrace(tstate, func, arg) function. It requires that the caller holds the GIL. The function calls PySys_Audit() in the context of the current thread state (not in "tstate").

    vstinner commented 4 years ago

    Maybe better would be the thread id so that the tstate structure is not needed there.

    PyInterpreterState.tstate_head allows to iterate on all thread states, but I'm not aware of an API to get a PyThreadState from its identifier (tstate->id).

    --

    See also bpo-1021318: "PyThreadState_Next not thread safe".

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    I'm iterating on all threads and getting its thread id to find out the thread state (in my use case) and then doing what you just did there...

    So, while this solution does work for me, if the idea is making tstate opaque, then having (an optional) thread id in settrace (which iterates to find the proper thread if given) could solve it without having to rely on any CPython internals on my side (although it should probably return a bool to say if it did work then).

    vstinner commented 4 years ago

    New changeset 309d7cc5df4e2bf3086c49eb2b1b56b929554500 by Victor Stinner in branch 'master': bpo-35370: Add _PyEval_SetTrace() function (GH-18975) https://github.com/python/cpython/commit/309d7cc5df4e2bf3086c49eb2b1b56b929554500

    vstinner commented 4 years ago

    PyEval_SetTrace() and PyEval_SetProfile() function have no return value but can raise an exception. I'm not comfortable about these functions. Maybe one option would be to call PyErr_WriteUnraisable() on PySys_Audit() error. It might be better than "leaking" an exception which is unexpected to PyEval_SetTrace() and PyEval_SetProfile() callers.

    vstinner commented 4 years ago

    PyEval_SetTrace() and PyEval_SetProfile() function have no return value but can raise an exception. I'm not comfortable about these functions. Maybe one option would be to call PyErr_WriteUnraisable() on PySys_Audit() error. It might be better than "leaking" an exception which is unexpected to PyEval_SetTrace() and PyEval_SetProfile() callers.

    I wrote PR 18977 to implement this idea.

    vstinner commented 4 years ago

    Fabio: I added _PyEval_SetProfile() and _PyEval_SetTrace() which take a tstate parameter. These functions have a constraint: the caller must hold the GIL. Is it an acceptable constraint for you?

    That's not something new, it's already the code in Python 3.8, it's just that it wasn't documented. In Python 3.7, it was less important: Python 3.8 added a call to PySys_Audit() which can execute arbitrary Python code.

    Anyway, touching Python internals without holding the GIL is risky: see bpo-1021318 for example.

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    Holding the GIL is a reasonable constraint.

    As a note, the original request was for a Python-level tracing function (so that in the future other Python implementations also provide that function) -- does this need a PEP?

    vstinner commented 4 years ago

    New changeset f6a58507820c67e8d0fb07875cd1b1d9f5e510a8 by Victor Stinner in branch 'master': bpo-35370: PyEval_SetTrace() logs unraisable error (GH-18977) https://github.com/python/cpython/commit/f6a58507820c67e8d0fb07875cd1b1d9f5e510a8

    vstinner commented 4 years ago

    New changeset 046255c40fc0d9c5a4c528eb5955792fa08df66f by Victor Stinner in branch '3.8': bpo-35370: PyEval_SetTrace() logs unraisable error (GH-18977) (GH-19029) https://github.com/python/cpython/commit/046255c40fc0d9c5a4c528eb5955792fa08df66f

    vstinner commented 4 years ago

    As a note, the original request was for a Python-level tracing function (so that in the future other Python implementations also provide that function) -- does this need a PEP?

    What do you mean by a Python-level tracing function?

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    > As a note, the original request was for a Python-level tracing function (so that in the future other Python implementations also provide that function) -- does this need a PEP?

    What do you mean by a Python-level tracing function?

    I mean that it's a function to be called from Python (not only from C) -- which hopefully could be adopted by other Python implementations in the long run.

    I.e.: something as adding a thread_id to sys.settrace -- sys.settrace(trace_func, thread_id=None).

    vstinner commented 4 years ago

    I.e.: something as adding a thread_id to sys.settrace -- sys.settrace(trace_func, thread_id=None).

    What is the use case for this feature?

    It seems quite complicated to implement the thread_id for sys.settrace(trace_func, thread_id=None).

    Currently, there is no way for a thread to execute code directly in another thread. In asyncio, it has to through call_soon_threadsafe() which queues the function calls and the function is called "later" and the caller doesn't get the result.

    https://docs.python.org/dev/library/asyncio-eventloop.html#asyncio.loop.call_soon_threadsafe

    sys.setprofile() and sys.settrace() use the current Python thread state (tstate). The threading.enumerate() function returns threading.Thread instances, but it's not currently possible to get the Python thread state (C structure PyThreadState) from a Python threading.Thread object.

    At the C level, Python doesn't maintain a list of thread. There is only threading.enumerate() which is implemented in Python.

    PyDev.Debugger seems to use the C API. Can it continue to use the C API?

    Note: There is threading.setprofile() and threading.settrace() which set a profile/trace function when *new* threads are spawned.

    a5ae9dfd-b19f-4ed7-b289-70236f9e865e commented 4 years ago

    > I.e.: something as adding a thread_id to sys.settrace -- sys.settrace(trace_func, thread_id=None).

    What is the use case for this feature?

    The use case is having the user attach the debugger (either programmatically or by doing an attach to process) and be able to debug all threads, not just the current thread.

    It seems quite complicated to implement the thread_id for sys.settrace(trace_func, thread_id=None).

    Humm, isn't it just a matter of passing the proper tstate to _PyEval_SetTrace? It seems reasonably simple to do at C (i.e.: iterate the existing thread states to get the thread id and then pass the proper tsate to _PyEval_SetTrace -- which is roughly what is done in the debugger, although it's a bit more complicated because it supports Python 2.7 up to Python 3.8...).

    At the C level, Python doesn't maintain a list of thread. There is only threading.enumerate() which is implemented in Python.

    The tstate does contain the thread id, so, iterating the available tstates should be enough for that.

    PyDev.Debugger seems to use the C API. Can it continue to use the C API?

    It can for CPython, but it can't for other Python implementations (and ideally I'd like to rely less on the CPython C-API -- because there's too much implementation details on it, things seem to break at each new version).

    Note: There is threading.setprofile() and threading.settrace() which set a profile/trace function when *new* threads are spawned

    Yes, I know about those, but it's not enough if the user attaches the debugger after the process is already running.