I was surprised to not see https://github.com/vmprof/vmprof-python listed.
pyflame is also deprecated and archived, which is an important consideration.
Thanks @gpshead for the pointer!
https://github.com/vmprof/vmprof-python FWIW doesn't build on (my) Mac OS X (but who knows, could be a config error on my end; that said, it fails even inside a clean virtualenv environment):
duplicate symbol '__PyThreadState_Current' in:
    build/temp.macosx-10.15-x86_64-3.7/src/_vmprof.o
    build/temp.macosx-10.15-x86_64-3.7/src/vmprof_common.o
duplicate symbol '__PyThreadState_Current' in:
    build/temp.macosx-10.15-x86_64-3.7/src/_vmprof.o
    build/temp.macosx-10.15-x86_64-3.7/src/vmprof_unix.o
ld: 2 duplicate symbols for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
- slower on Linux than scalene (~40% slower; NB this is running in Parallels)
- no memory profiling
Comparison with https://github.com/joerick/pyinstrument (CPU only) would also be great. Perhaps Scalene could also add a context manager to wrap some Python code and run it inline in a large codebase or in Jupyter notebooks.
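For illustration, here is a minimal sketch of the kind of inline wrapping being suggested, built on the standard library's cProfile rather than on Scalene itself (the `profile_block` helper is hypothetical, not an existing Scalene API):

```python
import cProfile
import pstats
from contextlib import contextmanager

@contextmanager
def profile_block(sort_key="cumulative", limit=10):
    """Hypothetical helper: profile only the code inside the `with` block."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield profiler
    finally:
        profiler.disable()
        pstats.Stats(profiler).sort_stats(sort_key).print_stats(limit)

# Usage inside a larger codebase or a notebook cell:
with profile_block():
    total = sum(i * i for i in range(1_000_000))
```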
And all of the above do not distinguish between time spent in Python vs. time spent in C (which Scalene now does).
Hi, maybe this is not the place for this discussion, but I have an outstanding question about how Scalene does its profiling and how this technique compares to other profilers.
Reading the code, it seems that Scalene attributes a sample [1] to the specific line of code that was interrupted, where an interval can implicitly be seen as a slice of 100ms of CPU. How fair is this assumption, considering that other lines of code could be executed within that slice?
Most likely I'm missing something, but the way Scalene does its profiling seems to be more about using statistics than about instrumenting the code. So, instead of instrumenting everything and collecting the elapsed times of all functions, Scalene extrapolates CPU usage from the number of times each line of code has been interrupted. Am I wrong?
If this is true, I'm wondering how accurate this profiling is compared to traditional tools like profile [2]. What would be, in your opinion, the main differences?
On the other hand, the code claims [3] that, due to the internals of CPython, a signal cannot be delivered until the code path returns to the bytecode interpreter, and that from this delay the time a sample spent in a C extension can be inferred. How does this work when signals are triggered during bytecode execution that has calls to C functions in between?
[1] https://github.com/emeryberger/scalene/blob/master/scalene/scalene.py#L134
[2] https://docs.python.org/3/library/profile.html#module-profile
[3] https://github.com/emeryberger/scalene/blob/master/scalene/scalene.py#L133
@pfreixes: Scalene is indeed a statistical profiler (https://en.wikipedia.org/wiki/Profiling_(computer_programming)#Statistical_profilers) and does not instrument code. This is mostly an advantage. Sampling can be both more accurate and faster than instrumentation.
Statistical profilers can be almost arbitrarily accurate, given enough samples (appropriately distributed and at a high enough frequency - for some mathematical background, see https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem and https://en.wikipedia.org/wiki/Margin_of_error#Calculations_assuming_random_sampling).
To track CPU usage, Scalene uses random sampling at a rate (currently) of one sample every hundredth of a second (100 Hz); its accuracy (like all sampling) increases with the square root of the number of samples taken. The longer your program runs, the more accurate Scalene gets.
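As a rough back-of-the-envelope illustration of that square-root relationship (the numbers below are my own, not from Scalene's documentation): for a line that truly accounts for 10% of execution time, the standard error of the sampled estimate shrinks as 1/sqrt(n):

```python
import math

def stderr_of_share(p, n):
    """Standard error of an estimated time share p from n samples
    (binomial approximation for a sampling profiler)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.10  # a line that truly consumes 10% of CPU time
for seconds in (1, 10, 100):
    n = 100 * seconds  # at 100 samples per second
    print(f"{seconds:>4}s ({n} samples): 10% ± {100 * stderr_of_share(p, n):.2f}%")
```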
Sampling is not only faster than instrumentation (as done by traditional profilers), which can slow down code considerably; it also avoids the "probe effect", where the instrumentation introduces a form of bias that skews the results (so that the profiling results may not actually hold for the original program). By contrast, sampling always measures the original program.
To answer your second question, the code now contains a detailed explanation of how Scalene attributes time to code (briefly, delays in the delivery of signals can only arise due to execution of C code outside the interpreter). See https://github.com/emeryberger/scalene/blob/master/scalene/scalene.py#L138.
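For intuition only, here is a stripped-down sketch of that signal-based idea (this is not Scalene's actual code, and the bookkeeping is simplified): a timer fires every 0.01 s, the handler charges the interrupted line, and any delay beyond the expected interval is attributed to time spent outside the interpreter (e.g., in C extensions), since CPython only delivers signals between bytecodes:

```python
import signal
import time
from collections import defaultdict

INTERVAL = 0.01                     # 100 Hz sampling
python_time = defaultdict(float)    # (file, line) -> seconds attributed to Python
c_time = defaultdict(float)         # (file, line) -> seconds attributed to C / native code
last_signal = time.perf_counter()

def sample(signum, frame):
    global last_signal
    now = time.perf_counter()
    elapsed = now - last_signal
    last_signal = now
    key = (frame.f_code.co_filename, frame.f_lineno)
    # The interpreter delivers the signal only between bytecodes, so any
    # delay beyond the timer interval is time spent outside pure Python.
    python_time[key] += INTERVAL
    c_time[key] += max(elapsed - INTERVAL, 0.0)

signal.signal(signal.SIGALRM, sample)
signal.setitimer(signal.ITIMER_REAL, INTERVAL, INTERVAL)

# ... run the program under measurement here ...
```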
@chiragjn: https://github.com/emeryberger/scalene/commit/a7afa197e5e3183e2ff9ba1e8ed36cd68bb8dce1 adds pyinstrument and two variants of yappi.
Added py-spy (https://github.com/benfred/py-spy). Leaving off pyflame since it's deprecated and unsupported. Leaving off vmprof since I can't get it to run on OS X.
https://github.com/vpelletier/pprofile
https://pyflame.readthedocs.io/en/latest/installation.html