Pure question: about py-spy sampling method

I am developing a CPU profiler that supports multiple languages. For Python I plan to use py-spy, but while integrating it I found that its profile results depend strongly on the sampling frequency. For example, with the frequency set to 100, the final call stack count is close to 1000 over a 10-second sampling period. Calling the underlying Rust interface directly (pyspy_snapshot) gives almost the same result.

My understanding is that py-spy works by reading the Python process's memory to obtain the stack. This differs from other samplers that are built on the cpu_clock perf_event underneath (bpf: bcc-profiler, java: async-profiler). So when a container runs multiple processes written in different languages, and I run these samplers at the same time and merge their results into one flame graph, the py-spy results do not look accurate. With the other samplers, a given call stack may appear at most 10 times in 10 seconds, while with py-spy at a high frequency it appears 1000 times, which makes it unusually wide in the graph. I believe this is because py-spy sampling is based on reading memory rather than on perf_event, so it records a sample regardless of whether the Python process is actually using the CPU.

The above is only my personal understanding. Is it correct, and how should I deal with this problem?
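To make the mismatch concrete, here is a minimal sketch of the mental model described above. It is not py-spy's or perf's actual implementation. The function names and the 1% CPU usage figure are assumptions chosen only for illustration: one sampler fires on every wall-clock tick, the other only on ticks where the process is on the CPU.

```python
# Toy model of the two sampling styles described in this question.
# All names and the cpu_fraction value are hypothetical, for illustration only.

def wall_clock_samples(rate_hz: int, duration_s: int) -> int:
    """Sampler that records a stack on every tick, busy or idle."""
    return rate_hz * duration_s


def cpu_clock_samples(rate_hz: int, duration_s: int, cpu_fraction: float) -> int:
    """Sampler that records a stack only while the process is on the CPU."""
    return round(rate_hz * duration_s * cpu_fraction)


rate_hz = 100        # sampling frequency used in the question
duration_s = 10      # sampling period used in the question
cpu_fraction = 0.01  # assumed: the process is on the CPU 1% of the time

wall = wall_clock_samples(rate_hz, duration_s)
cpu = cpu_clock_samples(rate_hz, duration_s, cpu_fraction)

print(f"wall-clock style sampler: {wall} samples")
print(f"cpu-clock style sampler:  {cpu} samples")
```

Under these assumptions the same mostly idle process contributes 1000 stacks from one sampler and 10 from the other, which matches the width difference I see in the merged flame graph.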

benfred / py-spy, issue #506