benfred / py-spy

Sampling profiler for Python programs
MIT License

py-spy thread blocks when locking the python thread #565

Open 1C4nfaN opened 1 year ago

1C4nfaN commented 1 year ago

When trying to sample the Python process with the following command: py-spy record --pid {pid} --native -r 97 -d 10 -f raw -o xxx.txt

Sometimes the py-spy process blocks for a long time with no results. I tried to analyze it and found that the thread was blocked in waitpid() while locking the Python thread:

Thread 2 (LWP 3638147):
#0  sccp () at ../src_musl/src/thread/__syscall_cp.c:11
#1  0x00007f39c9af010a in waitpid () at ../src_musl/src/process/waitpid.c:6
#2  0x00007f39c9945781 in remoteprocess::linux::Thread::lock::h744566c9e0fc13a0 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#3  0x00007f39c9944450 in remoteprocess::linux::Process::lock::h2fa7fa8cf9cf2ef8 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#4  0x00007f39c9830e71 in py_spy::python_spy::PythonSpy::_get_stack_traces::h6a885e6fcc0586bc () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#5  0x00007f39c9828ece in py_spy::python_spy::PythonSpy::get_stack_traces::h70c11f2d38dbd8c4 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#6  0x00007f39c98ce0d6 in std::sys_common::backtrace::__rust_begin_short_backtrace::h415d421592b1dec3 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#7  0x00007f39c98c0007 in core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h1748460e0434bbc0 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#8  0x00007f39c9aca533 in _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h0032718f1ee1e442 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#9  _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h5bc95b757ddfa29b () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#10 std::sys::unix::thread::Thread::new::thread_start::h069db4c5e748eedf () at library/std/src/sys/unix/thread.rs:108
#11 0x00007f39c9af31a3 in start () at ../src_musl/src/thread/pthread_create.c:192
#12 0x00007f39c9af4214 in __clone () at ../src_musl/src/thread/x86_64/clone.s:22

When running with the environment variable RUST_LOG=debug, I got:

[2023-04-11T06:57:29.556278705Z DEBUG remoteprocess::linux] attached to thread 432
[2023-04-11T06:57:29.556310223Z DEBUG remoteprocess::linux] attached to thread 460
[2023-04-11T06:57:29.556362082Z DEBUG remoteprocess::linux] attached to thread 479
[2023-04-11T06:57:29.556413127Z DEBUG remoteprocess::linux] attached to thread 481
[2023-04-11T06:57:29.556438563Z DEBUG remoteprocess::linux] attached to thread 509
[2023-04-11T06:57:29.556989723Z DEBUG remoteprocess::linux] detached from thread 432
[2023-04-11T06:57:29.557012099Z DEBUG remoteprocess::linux] detached from thread 460
[2023-04-11T06:57:29.557035824Z DEBUG remoteprocess::linux] detached from thread 479
[2023-04-11T06:57:29.557054071Z DEBUG remoteprocess::linux] detached from thread 481
[2023-04-11T06:57:29.557073439Z DEBUG remoteprocess::linux] detached from thread 509
[2023-04-11T06:57:29.562972706Z DEBUG remoteprocess::linux] attached to thread 432
[2023-04-11T06:57:29.563043296Z DEBUG remoteprocess::linux] attached to thread 460
[2023-04-11T06:57:29.563105838Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 479
[2023-04-11T06:57:58.277320101Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCHLD to 479
[2023-04-11T07:01:29.290933962Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCHLD to 479
[2023-04-11T07:03:28.843141736Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCHLD to 479
[2023-04-11T07:03:58.990260667Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCHLD to 479
[2023-04-11T07:06:59.876781243Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCHLD to 479
[2023-04-11T07:15:00.296871553Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCHLD to 479

It looks like py-spy is blocking while locking the Python thread, but this doesn't always happen; sometimes I can successfully get results.

I would like to know if there is a known reason for this, or if you have any suggestions. Thank you very much!

Jongy commented 1 year ago

The timestamps in your log show that py-spy was attached to 479 for more than 15 minutes, which is way too long. This is probably a bug in py-spy.

As an immediate workaround, I suggest running py-spy with the --nonblocking flag; it might make the results a bit less accurate, but it will prevent any side effects on the profiled application.

1C4nfaN commented 1 year ago

Hi @Jongy, thanks for your suggestion, but using the --nonblocking flag is not enough for what I want to get. If this is a bug, do you have any suggestions for how we could try to fix it in this case?

Jongy commented 1 year ago

If this is a bug, do you have any suggestions for how we could try to fix it in this case?

I spent a few minutes looking at the relevant code.

impl ThreadLock {
    fn new(tid: nix::unistd::Pid) -> Result<ThreadLock, nix::Error> {
        ptrace::attach(tid)?;
        while let wait::WaitStatus::Stopped(_, sig) = wait::waitpid(tid, Some(wait::WaitPidFlag::WSTOPPED | wait::WaitPidFlag::__WALL))? {
            if sig == nix::sys::signal::Signal::SIGSTOP {
                break;
            }

            debug!("reinjecting non-SIGSTOP signal {} to {}", sig, tid);
            ptrace::cont(tid, sig)?;
        }

        debug!("attached to thread {}", tid);
        Ok(ThreadLock{tid})
    }
}

py-spy, per your stack trace, is blocked in this waitpid() call. I'm acquainted with this code because I actually wrote it 2 years back :sweat_smile: see here.

We are trying (in remoteprocess and py-spy) to PTRACE_ATTACH the process and then reinject non-SIGSTOP signals. As I understand the semantics of PTRACE_ATTACH, a SIGSTOP must follow it. However, what we're seeing here is that thread 479 got numerous non-SIGSTOP signals over 17 minutes, but no SIGSTOP! I don't know how that can happen; that's the bug. Meanwhile, some of the threads are already PTRACE_ATTACHed (in this case, 432 and 460), so the process is in a bad state: some of its threads are blocked unexpectedly.

I don't have a concrete idea of how to solve it; I'll keep thinking and post here if I find anything. A workaround I can suggest is to add a basic timeout to the logic: if we fail to lock all threads within, say, 500us * Nthreads, py-spy could bail, detach all threads it has already attached, and consider this sample an error (a minimal sketch of the idea follows below). I don't suppose this problem reproduces consistently (as you said, @1C4nfaN, it happens "sometimes", and this seems like a tough race condition), so perhaps this would be enough.
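
Roughly, such a timeout could look something like this sketch (not actual remoteprocess code, just an illustration; it assumes a recent nix crate where nix::Error is an alias for Errno and ptrace::detach takes an optional signal):

use std::time::{Duration, Instant};
use nix::sys::{ptrace, signal::Signal, wait};
use nix::unistd::Pid;

// Hypothetical variant of the lock loop with a deadline: if the ptrace-induced
// SIGSTOP never shows up, detach and report an error instead of blocking forever.
fn lock_with_timeout(tid: Pid, timeout: Duration) -> Result<(), nix::Error> {
    ptrace::attach(tid)?;
    let deadline = Instant::now() + timeout;
    loop {
        // Poll with WNOHANG instead of blocking, so the deadline can be honoured.
        let flags = wait::WaitPidFlag::WNOHANG | wait::WaitPidFlag::WSTOPPED | wait::WaitPidFlag::__WALL;
        match wait::waitpid(tid, Some(flags))? {
            wait::WaitStatus::Stopped(_, Signal::SIGSTOP) => return Ok(()),
            // Same reinjection as the existing code for any other stop signal.
            wait::WaitStatus::Stopped(_, sig) => ptrace::cont(tid, sig)?,
            _ => {}
        }
        if Instant::now() >= deadline {
            // Bail out: undo the attach so the target thread keeps running,
            // and let the caller treat this sample as an error.
            let _ = ptrace::detach(tid, None);
            return Err(nix::Error::ETIMEDOUT);
        }
        std::thread::sleep(Duration::from_micros(50));
    }
}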

To debug this further, I'd run perf record -e 'signal:*' alongside and look at the interaction: is a SIGSTOP generated? What else happens around that time?

@1C4nfaN if you manage to reproduce it consistently, please post here instructions to do so, it will greatly help me (or anyone else debugging) since this is very esoteric.

1C4nfaN commented 1 year ago

A workaround I can suggest is to add a basic timeout to the logic: if we fail to lock all threads within, say, 500us * Nthreads, py-spy could bail, detach all threads it has already attached, and consider this sample an error.

Hi @Jongy, thanks for your help! I think it's good advice to consider timeouts in this case, but in another test I see the following:

[2023-04-11T04:02:05.106454671Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:05.267191159Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:05.427691376Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:05.588338448Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:05.748893669Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:05.909496956Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:06.070129688Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:06.230674664Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:06.391133380Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:06.551662220Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:06.712171494Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:06.872638318Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.033214994Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.193885198Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.354498283Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.515260753Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.675824288Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.836527054Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:07.997126241Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:08.157658180Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:08.318285370Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:08.478942352Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:08.639499228Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509
[2023-04-11T04:02:08.800293246Z DEBUG remoteprocess::linux] reinjecting non-SIGSTOP signal SIGCONT to 509

It's not blocking, but it keeps triggering waitpid and ptrace::cont in the while loop, which really confuses me...

I checked the Python process sampled by py-spy: the pid is 432, and 460/479/481/509 are its threads. What's interesting is that there is a process I hadn't paid attention to before that monitors 432's CPU usage and may periodically send signal.kill(432, SIGSTOP)/signal.kill(432, SIGCONT) to 432. I think there might be a race condition here (although I really haven't figured out why).

I would like to ask your opinion on using PTRACE_SEIZE instead of PTRACE_ATTACH. I saw some blog posts describing this usage, but I'm not really sure whether it's valid...

If there are new discoveries, we can discuss them here. Thanks again ❤️

1C4nfaN commented 1 year ago

By the way, I have another simple error scenario here.

Below are the contents of /proc/{pid}/stat for two processes.

1981866 (python3) S 1981743 1981866 1981743 0 -1 1077936384 34709484 16140 105 0 72770 37613 11 9 20 0 17 0 372814758 4442140672 55046 18446744073709551615 94888328609792 94888328612516 140731140400576 0 0 0 0 16781312 17642 0 0 0 17 79 0 0 16 0 0 94888330710392 94888330711088 94888354955264 140731140409715 140731140409755 140731140409755 140731140415457 0

1984043 (probe(1981866)) S 1981866 1981866 1981743 0 -1 4194368 1324 0 0 0 169 99 0 0 20 0 3 0 372915682 4424044544 33799 18446744073709551615 94888328609792 94888328612516 140731140400576 0 0 0 0 16781312 17642 0 0 0 17 63 0 0 0 0 0 94888330710392 94888330711088 94888354955264 140731140409715 140731140409755 140731140409755 140731140415457 0

When we run py-spy dump --pid 1984043, we get the following error: Error: Failed to parse /proc/1984043/stat.

I think the problem lies in how this code (in remoteprocess) handles the '))' case, i.e. a comm that itself contains ')':

fn get_active_status(stat: &[u8]) -> Option<u8> {
    // find the first ')' character, and return the active status
    // field which comes after it
    let mut iter = stat.iter().skip_while(|x| **x != b')');
    match (iter.next(), iter.next(), iter.next()) {
        (Some(b')'), Some(b' '), ret) => ret.map(|x| *x),
        _ => None
    }
}

I thought I'd make a note of the issue here. I'm not very familiar with the code that actually executes in process 1984043, but I will try to check this, and if necessary I'd be willing to fix it if I have time.

Jongy commented 1 year ago

It's not blocking, but it keeps triggering waitpid and ptrace::cont in the while loop, which really confuses me...

That's the same scenario. What you posted here is just another manifestation of the bug.

I'll elaborate on what I wrote in my previous comment: py-spy attempts to lock all (e.g. 6) threads of your Python process. It successfully locks 3, and on the 4th, due to a bug, it remains forever in the "reinjecting non-SIGSTOP" loop for thread 4. Meanwhile, threads 5 and 6 are not locked yet, so they are running "fine", but threads 1, 2 and 3 are already locked, i.e. PTRACE_ATTACHed.

What's interesting is that there is a process I hadn't paid attention to before that monitors 432's CPU usage and may periodically send signal.kill(432, SIGSTOP)/signal.kill(432, SIGCONT) to 432. I think there might be a race condition here (although I really haven't figured out why).

This definitely might be related. I have an idea in mind that I'll need to try out: perhaps, if the process got a SIGSTOP (when a process gets a signal, any thread in it might receive it, but SIGSTOP behaves differently because it's handled in the kernel) and py-spy then tries to PTRACE_ATTACH while the thread is already in the stopped state, the ptrace-induced SIGSTOP is never generated and py-spy waits in vain. A rough way to check for that situation is sketched below.
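
If that theory is right, a purely illustrative way (not remoteprocess code) to detect the situation would be to read the thread's state from /proc/<tid>/stat before attaching; 'T' means the thread is already stopped, and the attach path could then handle it specially instead of waiting for a fresh SIGSTOP:

use std::fs;

// Hypothetical pre-check: is this thread already in the stopped ('T') state?
fn thread_already_stopped(tid: i32) -> std::io::Result<bool> {
    // /proc/<tid>/stat looks like "<pid> (<comm>) <state> ..."; the comm may
    // itself contain ')', so find the last ')' and read the state byte that
    // follows it after a single space.
    let stat = fs::read(format!("/proc/{}/stat", tid))?;
    Ok(match stat.iter().rposition(|b| *b == b')') {
        Some(i) => stat.get(i + 2) == Some(&b'T'),
        None => false,
    })
}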

I might be able to try it out later today, will write here if so.

I would like to ask your opinion on using PTRACE_SEIZE instead of PTRACE_ATTACH. I saw some blog posts describing this usage, but I'm not really sure whether it's valid...

PTRACE_SEIZE does not stop the process - this is the equivalent of passing --nonblocking.

When we run py-spy dump --pid 1984043, we get the following error: Error: Failed to parse /proc/1984043/stat.

This is indeed a separate problem: it seems the function doesn't cope with process comms that have ')' in them. Not a very common case for Python processes, but I suppose there's no reason not to support it properly. The logic needs to be "parse from the first '(' until the matching ')'", or maybe just "parse from the last ')'" instead of "the first ')'" (a sketch of the latter follows below). Can you create a separate issue for that?
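
For the "last ')'" variant, the fix could be something along these lines (just a sketch, not the actual remoteprocess code):

// Hypothetical fix: skip to the *last* ')' in /proc/<pid>/stat, since only the
// comm field can contain ')'; the active status byte follows it after a space.
fn get_active_status(stat: &[u8]) -> Option<u8> {
    let close = stat.iter().rposition(|x| *x == b')')?;
    match (stat.get(close + 1).copied(), stat.get(close + 2).copied()) {
        (Some(b' '), Some(status)) => Some(status),
        _ => None,
    }
}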

1C4nfaN commented 1 year ago

I might be able to try it out later today, will write here if so.

Thank you so much for your time~

PTRACE_SEIZE does not stop the process - this is the equivalent of passing --nonblocking.

That's right. I forgot to mention earlier that I was referencing the following statement from the ptrace man page:

"Since Linux 3.4, PTRACE_SEIZE can be used instead of PTRACE_ATTACH. PTRACE_SEIZE does not stop the attached process. If you need to stop it after attach (or at any other time) without sending it any signals, use PTRACE_INTERRUPT command."

So maybe PTRACE_SEIZE + PTRACE_INTERRUPT could be used instead of PTRACE_ATTACH? Something like the sketch below is what I have in mind.
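
Just to illustrate the idea (a sketch only, assuming a nix version that exposes ptrace::seize and ptrace::interrupt and where nix::Error is Errno; I haven't tried this in remoteprocess):

use nix::sys::{ptrace, wait};
use nix::unistd::Pid;

// Attach with PTRACE_SEIZE and stop the thread with PTRACE_INTERRUPT. The stop
// is requested explicitly by the tracer, so it does not depend on a SIGSTOP
// being delivered, which is the part that seems to race here.
fn seize_and_stop(tid: Pid) -> Result<(), nix::Error> {
    // Attach without stopping the thread.
    ptrace::seize(tid, ptrace::Options::empty())?;
    // Ask the kernel to stop the tracee without injecting a signal.
    ptrace::interrupt(tid)?;
    // With SEIZE, the stop is reported as a ptrace event-stop rather than a
    // plain signal-delivery stop; accept either here for simplicity.
    match wait::waitpid(tid, Some(wait::WaitPidFlag::WSTOPPED | wait::WaitPidFlag::__WALL))? {
        wait::WaitStatus::PtraceEvent(..) | wait::WaitStatus::Stopped(..) => Ok(()),
        _ => Err(nix::Error::ESRCH), // thread exited before it could be stopped
    }
}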

This is indeed a separate problem: it seems the function doesn't cope with process comms that have ')' in them. Not a very common case for Python processes, but I suppose there's no reason not to support it properly. The logic needs to be "parse from the first '(' until the matching ')'", or maybe just "parse from the last ')'" instead of "the first ')'". Can you create a separate issue for that?

Sure, I'll create a separate issue.

1C4nfaN commented 1 year ago

When py-spy runs with py-spy record --pid {pid} --native -r 97 -d 10 -f raw -o xxx.txt, I tried printing its stacks, as below:

Thread 3 (LWP 805052):
#0  sccp () at ../src_musl/src/thread/__syscall_cp.c:11
#1  0x00007f72dcd2359f in read () at ../src_musl/src/unistd/read.c:6
#2  0x00007f72dcc23552 in nix::unistd::read::h0aa062cf7d2053b6 ()
#3  0x00007f72dcafe652 in std::sys_common::backtrace::__rust_begin_short_backtrace::hcc9258a46d3c1d8f ()
#4  0x00007f72dcaef386 in core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h991851875476ab0b ()
#5  0x00007f72dccf9533 in _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h0032718f1ee1e442 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#6  _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h5bc95b757ddfa29b () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#7  std::sys::unix::thread::Thread::new::thread_start::h069db4c5e748eedf () at library/std/src/sys/unix/thread.rs:108
#8  0x00007f72dcd221a3 in start () at ../src_musl/src/thread/pthread_create.c:192
#9  0x00007f72dcd23214 in __clone () at ../src_musl/src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC

Thread 2 (LWP 805051):
#0  sccp () at ../src_musl/src/thread/__syscall_cp.c:11
#1  0x00007f72dcd2359f in read () at ../src_musl/src/unistd/read.c:6
#2  0x00007f72dcb7b4b7 in maps_next () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#3  0x00007f72dcb7b81b in _Ux86_64_get_elf_image () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#4  0x00007f72dcb7a9a8 in get_unwind_info () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#5  0x00007f72dcb7ab00 in _UPT_find_proc_info () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#6  0x00007f72dcb80359 in fetch_proc_info () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#7  0x00007f72dcb81740 in find_reg_state () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#8  0x00007f72dcb819bc in _Ux86_64_dwarf_step () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#9  0x00007f72dcb7cf5f in _Ux86_64_step () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#10 0x00007f72dcb627ed in _$LT$remoteprocess..linux..libunwind..Cursor$u20$as$u20$core..iter..traits..iterator..Iterator$GT$::next::h8d6ba6e958cd1c39 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#11 0x00007f72dca69779 in py_spy::python_spy::PythonSpy::_get_pthread_id::h58fdef68ab857262 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#12 0x00007f72dca6648b in py_spy::python_spy::PythonSpy::_get_os_thread_id::h807756e474dc43d7 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#13 0x00007f72dca603b4 in py_spy::python_spy::PythonSpy::_get_stack_traces::h6a885e6fcc0586bc () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#14 0x00007f72dca57ece in py_spy::python_spy::PythonSpy::get_stack_traces::h70c11f2d38dbd8c4 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#15 0x00007f72dcafd0d6 in std::sys_common::backtrace::__rust_begin_short_backtrace::h415d421592b1dec3 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#16 0x00007f72dcaef007 in core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h1748460e0434bbc0 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#17 0x00007f72dccf9533 in _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h0032718f1ee1e442 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#18 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h5bc95b757ddfa29b () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#19 std::sys::unix::thread::Thread::new::thread_start::h069db4c5e748eedf () at library/std/src/sys/unix/thread.rs:108
#20 0x00007f72dcd221a3 in start () at ../src_musl/src/thread/pthread_create.c:192
#21 0x00007f72dcd23214 in __clone () at ../src_musl/src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC

Thread 1 (LWP 805050):
#0  0x00007f72dcd1eb7e in __syscall6 () at ../src_musl/arch/x86_64/syscall_arch.h:59
#1  syscall () at ../src_musl/src/misc/syscall.c:20
#2  0x00007f72dccf6775 in std::sys::unix::futex::futex_wait::h65564c44ef71c5aa () at library/std/src/sys/unix/futex.rs:61
#3  0x00007f72dcce98b9 in std::sys_common::thread_parker::futex::Parker::park::h1857a2d51b91109b () at library/std/src/sys_common/thread_parker/futex.rs:52
#4  std::thread::park::he5cdd1e0067c0814 () at library/std/src/thread/mod.rs:929
#5  0x00007f72dccf22a2 in std::sync::mpsc::blocking::WaitToken::wait::h7da894c0826b7e66 () at library/std/src/sync/mpsc/blocking.rs:67
#6  0x00007f72dca6df78 in std::sync::mpsc::stream::Packet$LT$T$GT$::recv::h6d1e110c42068000 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#7  0x00007f72dcb014af in std::sync::mpsc::Receiver$LT$T$GT$::recv::hb203c62da94a10a9 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#8  0x00007f72dcb0d05f in py_spy::run_spy_command::h059726825bd45700 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#9  0x00007f72dcb0ed14 in py_spy::main::h523879457f122e9a () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#10 0x00007f72dcafe5d3 in std::sys_common::backtrace::__rust_begin_short_backtrace::h961c1afc5ca03793 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#11 0x00007f72dca7e209 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hd9822804426d1ce6 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181
#12 0x00007f72dcce8fa1 in core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h18fd15a330c116b9 () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/ops/function.rs:280
#13 std::panicking::try::do_call::he918a328b783a420 () at library/std/src/panicking.rs:492
#14 std::panicking::try::h88ecc043ee71c652 () at library/std/src/panicking.rs:456
#15 std::panic::catch_unwind::h074c8fe23976f009 () at library/std/src/panic.rs:137
#16 std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::he26a080dffc4764a () at library/std/src/rt.rs:128
#17 std::panicking::try::do_call::hf9be9f327f382512 () at library/std/src/panicking.rs:492
#18 std::panicking::try::h163c4b2413cf9165 () at library/std/src/panicking.rs:456
#19 std::panic::catch_unwind::h58a40b305e945d4e () at library/std/src/panic.rs:137
#20 std::rt::lang_start_internal::h183e60933887f95a () at library/std/src/rt.rs:128
#21 0x00007f72dcb0fb02 in main () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:181

In my understanding, Thread 1 waits on recv to get stacks from Thread 2, and Thread 2 is in maps_next getting the native stack, but what I don't quite understand is where Thread 3 is spawned and what it is doing. Could you give me some details? And if the delay of record samples is pretty serious, might that come from Thread 2? Thanks~