benfred / py-spy

Sampling profiler for Python programs
MIT License

Speed up --gil by skipping work for non-GIL threads #630

Closed wmanley closed 7 months ago

wmanley commented 7 months ago

When the user has specified --gil, we needn't collect stack traces for threads that don't hold the GIL. Skipping them is a significant speedup on heavily threaded Python code.
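As a rough illustration of the idea (not the actual py-spy code), the sketch below filters out non-GIL threads before the expensive stack read. `ThreadInfo`, `collect_stack`, and `sample` are hypothetical names invented for this example, and the `unwinds` counter exists only to make the saved work visible.

```rust
/// Hypothetical, simplified stand-in for a thread in the profiled process
/// (not py-spy's real type).
#[derive(Debug, Clone, PartialEq)]
struct ThreadInfo {
    id: u64,
    holds_gil: bool,
}

/// Hypothetical stand-in for the expensive step: reading one thread's stack
/// from the target process. `unwinds` counts how often it runs.
fn collect_stack(thread: &ThreadInfo, unwinds: &mut usize) -> Vec<String> {
    *unwinds += 1;
    vec![format!("frame of thread {}", thread.id)]
}

/// Returns (thread id, stack) pairs for one sample. When `gil_only` is set,
/// threads that don't hold the GIL are skipped before any stack is read.
fn sample(
    threads: &[ThreadInfo],
    gil_only: bool,
    unwinds: &mut usize,
) -> Vec<(u64, Vec<String>)> {
    let mut traces = Vec::new();
    for thread in threads {
        if gil_only && !thread.holds_gil {
            continue; // no stack read for this thread at all
        }
        traces.push((thread.id, collect_stack(thread, unwinds)));
    }
    traces
}
```

In this sketch, sampling 271 threads with `gil_only` set performs at most one stack read per sample instead of 271, since at most one thread can hold the GIL at a time.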

My server has 271 threads (most of which are idle).

With the previous code (sampling at 10Hz) I get:

$ ./py-spy record -o /var/opt/stbt/profile.svg --pid $(pgrep -f http_service.service) -r 10 -d 10 --gil
py-spy> Sampling process 10 times a second for 10 seconds. Press Control-C to exit.

py-spy> 1.29s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.36s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.32s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.22s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> Wrote flamegraph data to '/var/opt/stbt/profile.svg'. Samples: 3 Errors: 51

With this commit I can sample at 80Hz, and the error rate is much lower:

$ ./py-spy2 record -o /var/opt/stbt/profile.svg --pid $(pgrep -f http_service.service) -r 80 -d 10 --gil
py-spy> Sampling process 80 times a second for 10 seconds. Press Control-C to exit.

py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> Wrote flamegraph data to '/var/opt/stbt/profile.svg'. Samples: 43 Errors: 0

I suspect the difference in error rate is due to the error handling in _get_stack_traces: an error getting information about any one thread means that all the traces for that sample are thrown away.
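The suspected all-or-nothing behaviour can be sketched as follows. This is only an illustration of the pattern described above, not the real `_get_stack_traces`; `read_thread`, `sample_all`, and the `failing` list are hypothetical and exist purely to simulate a thread whose state cannot be read.

```rust
/// Hypothetical per-thread read that can fail, e.g. because the thread
/// changed state while it was being inspected.
fn read_thread(id: u64, failing: &[u64]) -> Result<String, String> {
    if failing.contains(&id) {
        Err(format!("failed to read thread {}", id))
    } else {
        Ok(format!("stack of thread {}", id))
    }
}

/// All-or-nothing sampling: collecting into `Result<Vec<_>, _>` stops at the
/// first error and discards every trace gathered so far for this sample.
fn sample_all(ids: &[u64], failing: &[u64]) -> Result<Vec<String>, String> {
    ids.iter().map(|&id| read_thread(id, failing)).collect()
}
```

Under this pattern, the more threads each sample has to read, the more likely it is that at least one read fails and the whole sample is lost. Reading only the GIL-holding thread leaves a single read that can fail, which would be consistent with the drop in errors reported above.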