Granulate / gprofiler

gProfiler is a system-wide profiler that combines multiple sampling profilers to produce a unified visualization of what your CPU is spending time on.
https://profiler.granulate.io
Apache License 2.0

python: ebpf: _terminate() blocking #744

Closed by Jongy 1 month ago

Jongy commented 1 year ago

I got this dump from a test deployment of gProfiler 1.19.0 that became stuck.

Thread 16815 (idle): "ThreadPoolExecutor-0_3"
    _try_wait (subprocess.py:1901)
        Arguments:
            self: <Popen at 0x7f92f05ccd60>
            wait_flags: 0
    _wait (subprocess.py:1943)
        Arguments:
            self: <Popen at 0x7f92f05ccd60>
            timeout: None
    wait (subprocess.py:1209)
        Arguments:
            self: <Popen at 0x7f92f05ccd60>
            timeout: None
    _terminate (python_ebpf.py:254)
        Arguments:
            self: <PythonEbpfProfiler at 0x7f92f8b0b010>
        Locals:
            code: None
    _dump (python_ebpf.py:221)
        Arguments:
            self: <PythonEbpfProfiler at 0x7f92f8b0b010>
        Locals:
            process: <Popen at 0x7f92f05ccd60>
    snapshot (python_ebpf.py:227)
        Arguments:
            self: <PythonEbpfProfiler at 0x7f92f8b0b010>
    snapshot (python.py:403)
        Arguments:
            self: <PythonProfiler at 0x7f92f8b0add0>
    run (thread.py:58)
        Arguments:
            self: <_WorkItem at 0x7f92f028aa10>
    _worker (thread.py:83)
        Arguments:
            executor_reference: <weakref.ReferenceType at 0x7f92f8b97fb0>
            work_queue: <_queue.SimpleQueue at 0x7f92f8b1f830>
            initializer: None
            initargs: ()
        Locals:
            work_item: <_WorkItem at 0x7f92f028aa10>

gProfiler tried to stop PyPerf: it sent SIGTERM, and PyPerf did exit with SIGTERM. However, gProfiler remains blocked in wait(), which per the dump above was called with timeout: None. Not sure if it's a bug in the Python stdlib, gProfiler, or the kernel lol. Why would wait() block if PyPerf truly exited?
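Regardless of the root cause, the dump shows the wait being unbounded (timeout: None), so a stuck wait stalls the whole snapshot thread. A minimal sketch of a bounded alternative follows, assuming POSIX signals. The helper name `terminate_with_timeout` and the sleeping child are hypothetical and only for illustration; this is not gProfiler's actual `_terminate()` code.

```python
import signal
import subprocess
import sys


def terminate_with_timeout(proc, timeout=5.0):
    """Hypothetical helper: send SIGTERM, wait a bounded time, then escalate."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        # A bounded wait raises TimeoutExpired instead of blocking forever.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored by the child
        return proc.wait(timeout=timeout)


# Stand-in child process (an assumption; the real child here would be PyPerf).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = terminate_with_timeout(child)
```

On POSIX, `Popen.returncode` is the negated signal number when the child was killed by a signal, so a child that died from SIGTERM reports `-signal.SIGTERM`. The second `wait()` is also bounded, so in the worst case the caller gets a `TimeoutExpired` exception it can log rather than a silent hang.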

Jongy commented 1 year ago

This is probably it: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.wait
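For context, the note in the linked docs says `Popen.wait()` can deadlock when the child was started with `stdout=PIPE` or `stderr=PIPE` and fills the OS pipe buffer, and it recommends `Popen.communicate()` instead. Whether that is what happened in this case is not confirmed in this thread. A minimal sketch of the documented pattern, with a hypothetical helper `run_and_drain` and a stand-in child that writes more than a typical pipe buffer holds:

```python
import subprocess
import sys


def run_and_drain(cmd, timeout=10.0):
    """Hypothetical helper: run cmd with piped output and drain it while waiting."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        # communicate() reads both pipes until EOF, so the child never
        # blocks on a full pipe buffer while we wait for it to exit.
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        # After killing the child, collect whatever output remains.
        out, err = proc.communicate()
    return proc.returncode, out, err


# Stand-in child (an assumption): writes 200000 bytes to stdout, then exits.
child_cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 200000)"]
rc, out, err = run_and_drain(child_cmd)
```

With a plain `proc.wait()` and nobody reading the pipe, this child could block in its write and never exit; `communicate()` avoids that by draining stdout and stderr concurrently.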

Jongy commented 1 year ago

Reopening because I'm not sure it's fixed.