benfred / py-spy

Sampling profiler for Python programs
MIT License

How can I get raw data rather than percentage? #563

Open ottoSJTU opened 1 year ago

ottoSJTU commented 1 year ago

For example, I want to know exactly how much CPU time a function costs (like what py-spy top shows), not the ratio it takes up among all functions (like what py-spy record -f raw shows).

Jongy commented 1 year ago

py-spy is a sampling profiler, so it does not know how much "time" functions take. Even py-spy top, when displaying percentages and/or "total time", derives them only as a statistical computation ("if I sampled 100 times a second, and the function appeared in 57 of the samples, then it took 0.57 seconds"). Logically, the same computation can be done from the number of samples a function has in the flamegraph, divided by the sampling frequency you used to record.

Remember that this is still statistical. If you want a truly accurate measure of "run time", you need to trace the function and sum the run times of each execution.
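
The computation described above can be sketched in a few lines of Python. This is only an illustration: `estimated_seconds` is a hypothetical helper, not part of py-spy, and it assumes you know the sampling rate used for the recording (py-spy's `--rate` option, 100 samples per second by default).

```python
def estimated_seconds(sample_count: int, sampling_rate_hz: float) -> float:
    """Estimate the time a function was on the stack.

    sample_count: number of samples in which the function appeared.
    sampling_rate_hz: samples per second used while recording (assumed known).
    """
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    # Each sample stands for roughly 1 / sampling_rate_hz seconds.
    return sample_count / sampling_rate_hz


# The example from the comment: 57 samples at 100 samples per second.
print(estimated_seconds(57, 100))
```

The result is an estimate whose accuracy improves with more samples; it is not a measured duration.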

ottoSJTU commented 1 year ago

Thanks!

ottoSJTU commented 1 year ago

Another question: what does the number at the end of every line of the raw result file mean? Is every line of the file a sampled stack trace? Could you please answer this?

Jongy commented 1 year ago

Every line in the file is a unique sample stack. The collapsed format is very simple: it stores the raw sample stacks and applies only a simple "merging" of samples that repeat. The number at the end of every line is the number of repeats, so 1 means the stack was seen one time, 875 means it was seen 875 times, and so on.
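
As a rough sketch of how such a file could be turned into per-function numbers, the Python below sums the trailing counts per frame. The helper names and the sample lines are made up for illustration. It assumes frames are separated by `;` with the innermost frame last (as in the usual collapsed-stack layout), that the count follows the final space, and that you know the sampling rate used for the recording.

```python
from collections import Counter


def parse_collapsed(lines):
    """Yield (frames, count) for each 'frame;frame;... count' line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Frame names may contain spaces, so split on the last space only.
        stack, _, count = line.rpartition(" ")
        yield stack.split(";"), int(count)


def samples_per_frame(lines):
    """Return (own, total) sample counts per frame."""
    own, total = Counter(), Counter()
    for frames, count in parse_collapsed(lines):
        own[frames[-1]] += count       # innermost frame (assumed last)
        for frame in set(frames):      # count a recursive frame once per stack
            total[frame] += count
    return own, total


# Hypothetical sample data, not real py-spy output.
sample = [
    "main (app.py:10);load (app.py:20) 875",
    "main (app.py:10);compute (app.py:30) 124",
    "main (app.py:10) 1",
]
own, total = samples_per_frame(sample)
rate_hz = 100  # assumed sampling rate of the recording
for frame, n in total.most_common():
    print(f"{frame}: {n} samples, ~{n / rate_hz:.2f} s total")
```

Dividing by the sampling rate gives the same statistical estimate discussed earlier in the thread, not an exact CPU time.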