Consider a "brief" native mode only showing C frames below the last Python frame

godlygeek commented 1 year ago

Is there an existing proposal for this?

[X] I have searched the existing proposals

Is your feature request related to a problem?

We default to showing pure Python stacks, both for performance and for legibility to people who aren't systems programmers, but often these aren't very useful for figuring out exactly why a program had a hang or crash, since it's not always clear what might be happening inside of the current Python call. The solution to this is running with --native or --native-all mode, but these modes are both very verbose.

Describe the solution you'd like

@alicederyn has pitched an idea for me: it might be possible to eliminate some of the noise and performance implications of --native mode while still providing enough information to puzzle out deadlocks or blocking happening in the C code called from the user's Python code if have a mode that only shows the C frames after the last Python frame. That is, instead of our pure-Python stacks:

Traceback for thread 112 [] (most recent call last):
    (Python) File "/test.py", line 17, in <module>
        first_func()
    (Python) File "/test.py", line 6, in first_func
        second_func()
    (Python) File "/test.py", line 10, in second_func
        third_func()
    (Python) File "/test.py", line 14, in third_func
        time.sleep(1000)

And instead of the hybrid stacks that we show with --native:

Traceback for thread 112 [] (most recent call last):
    (C) File "???", line 0, in _start ()
    (C) File "???", line 0, in __libc_start_main ()
    (C) File "Modules/main.c", line 743, in Py_BytesMain (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/main.c", line 689, in Py_RunMain (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/main.c", line 610, in pymain_run_python (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/main.c", line 385, in pymain_run_file (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Python/pythonrun.c", line 472, in PyRun_SimpleFileExFlags (/usr/lib/libpython3.8.so.1.0)
    (C) File "Python/pythonrun.c", line 439, in pyrun_simple_file (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Python/pythonrun.c", line 1085, in pyrun_file (/usr/lib/libpython3.8.so.1.0)
    (C) File "Python/pythonrun.c", line 1188, in run_mod (/usr/lib/libpython3.8.so.1.0)
    (C) File "Python/pythonrun.c", line 1166, in run_eval_code_obj (/usr/lib/libpython3.8.so.1.0)
    (Python) File "/test.py", line 17, in <module>
        first_func()
    (C) File "Python/ceval.c", line 4963, in call_function (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Objects/call.c", line 284, in function_code_fastcall (inlined) (/usr/lib/libpython3.8.so.1.0)
    (Python) File "/test.py", line 6, in first_func
        second_func()
    (C) File "Python/ceval.c", line 4963, in call_function (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Objects/call.c", line 284, in function_code_fastcall (inlined) (/usr/lib/libpython3.8.so.1.0)
    (Python) File "/test.py", line 10, in second_func
        third_func()
    (C) File "Python/ceval.c", line 4963, in call_function (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Objects/call.c", line 284, in function_code_fastcall (inlined) (/usr/lib/libpython3.8.so.1.0)
    (Python) File "/test.py", line 14, in third_func
        time.sleep(1000)
    (C) File "Python/ceval.c", line 4963, in call_function (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/timemodule.c", line 338, in time_sleep (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/timemodule.c", line 1866, in pysleep (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "???", line 0, in __select ()

We'd show a stack that switches to showing C frames automatically after the last Python frame:

Traceback for thread 112 [] (most recent call last):
    (Python) File "/test.py", line 17, in <module>
        first_func()
    (Python) File "/test.py", line 6, in first_func
        second_func()
    (Python) File "/test.py", line 10, in second_func
        third_func()
    (Python) File "/test.py", line 14, in third_func
        time.sleep(1000)
    (C) File "Python/ceval.c", line 4963, in call_function (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/timemodule.c", line 338, in time_sleep (/usr/lib/libpython3.8.so.1.0)
    (C) File "Modules/timemodule.c", line 1866, in pysleep (inlined) (/usr/lib/libpython3.8.so.1.0)
    (C) File "???", line 0, in __select ()

This could be faster than running with --native, as we should be able to avoid unwinding and symbolizing the entire call stack and instead stop when we reach the first _PyEval_EvalFrameDefault call (though currently we do all the unwinding before any of the symbolizing, and that would need to change if we wanted to break out of unwinding early).

@pablogsal feels this would be a confusing default, but is willing to consider it as a --native=brief or --native-all=brief flag.

Alternatives you considered

No response

alicederyn commented 1 year ago

This may also be of interest: https://www.benfrederickson.com/profiling-native-python-extensions-with-py-spy/

Apologies if you've already seen this and implemented it or decided it's not a good fit 😅

pablogsal commented 1 year ago

This may also be of interest: benfrederickson.com/profiling-native-python-extensions-with-py-spy

Apologies if you've already seen this and implemented it or decided it's not a good fit 😅

Thanks for the link! I was already aware of this, and we used to do something similar in the first versions of pystack. On the other hand this information is a bit outdated (handling the native pthread id) because I changed this in CPython so we store thepthread id in the thread object so there is no need for dealing with registers (which is also very brittle). We have also a better way for older versions that rely on glibc internals but is actually much more resilient than that approach.

Also, in any case, all of that is already in pystack, what we are discussing in this issue if I understand correctly is just modifying the reporter under some option to just show the native stack under the Python stack.

Kretzer6019 commented 4 months ago

I would like to work in this issue

bloomberg / pystack